On Tue, 4 Nov 2003, John Baldwin wrote:

JB>
JB>On 04-Nov-2003 Harti Brandt wrote:
JB>> On Tue, 4 Nov 2003, Harti Brandt wrote:
JB>>
JB>> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
JB>> HB>
JB>> HB>JB>
JB>> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
JB>> HB>JB>>
JB>> HB>JB>> Hi,
JB>> HB>JB>>
JB>> HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=10000. This
JB>> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
JB>> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
JB>> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
JB>> HB>JB>> file has
JB>> HB>JB>>
JB>> HB>JB>> options SMP
JB>> HB>JB>> device apic
JB>> HB>JB>> options HZ=1000
JB>> HB>JB>
JB>> HB>JB>Ok, I can try to reproduce.
JB>> HB>JB>
JB>> HB>JB>> Device configuration finished.
JB>> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
JB>> HB>JB>> Timecounters cpuid = 0; apic id = 00
JB>> HB>JB>> instruction pointer = 0x8:0xc048995d
JB>> HB>JB>> stack pointer = 0x10:0xc0821bf4
JB>> HB>JB>> frame pointer cpuid = 0; apic id = 00
JB>> HB>JB>>
JB>> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
JB>> HB>JB>> cpu_critical_exit.
JB>> HB>JB>
JB>> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
JB>> HB>JB>It might be helpful to figure what type of fault you are actually getting.
JB>> HB>
JB>> HB>tf_err is 0, tf_trapno is 30 (decimal).
JB>>
JB>> More information:
JB>>
JB>> I have replaced all the reserved vectors with individual ones, that set
JB>> tf_err to the index (vector number). It appears the the vector number is
JB>> 39 decimal. What does that mean?
JB>
JB>IRQ 7.
JB>Can you post a verbose dmesg? Also, can you try both with and without
JB>ACPI?

Attached are both dmesgs.

More datapoints: I had the parallel port (irq7) and the second sio disabled in the BIOS.
After enabling both I now get a panic in lapic_handle_intr: "Couldn't get vector from ISR!"

After fetching the relevant docs from Intel I checked the registers of the APIC pointed to by lapic. The interrupt taken is Xapic_irq1. isr1 is zero, but irr1 is 0x100 (that was without ACPI). How can that happen? As I understand it, the ISR holds the interrupts that have been delivered to the CPU, so if the CPU has been interrupted a bit should be set there, correct?

I then replaced the panic with a printf() followed by a return. Now the system comes to life, but I get a couple of these warnings. When the system is idle everything seems fine, but when I start my simulation application (which normally generates between 20k and 250k interrupts/sec, depending on the MPSAFE setting of the ATM drivers) I get approximately 1-2 of these messages per second (this is with HZ=1000).

A question while reading the code: what does the global lapic variable refer to? As I understand it, every CPU has its own local APIC. Does lapic point to one of those two? To which one?

Regards,
harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
brandt_at_fokus.fraunhofer.de, harti_at_freebsd.org
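
For reference, here is a minimal sketch of how a value read from one of the ISR/IRR registers maps to a vector number. It assumes the architectural layout described in the Intel documentation: eight 32-bit registers cover the 256 vectors, and bit n of register k stands for vector 32*k + n. The function names are made up for illustration and are not taken from the kernel source.

```c
#include <stdint.h>

/*
 * Assumed layout: the local APIC exposes its 256-bit ISR and IRR as
 * eight 32-bit registers each.  Bit `bit` of register number `reg`
 * stands for interrupt vector 32 * reg + bit.
 * (Hypothetical helper, not a kernel function.)
 */
static int
apic_bit_to_vector(int reg, int bit)
{
	return (32 * reg + bit);
}

/*
 * Return the highest vector whose bit is set in one 32-bit ISR/IRR
 * register value, or -1 if no bit is set.
 * (Hypothetical helper, not a kernel function.)
 */
static int
apic_highest_vector(int reg, uint32_t value)
{
	int bit;

	for (bit = 31; bit >= 0; bit--)
		if (value & ((uint32_t)1 << bit))
			return (apic_bit_to_vector(reg, bit));
	return (-1);
}
```

Under that layout, apic_highest_vector(1, 0x100) gives 40, so irr1 = 0x100 would mean vector 40 is pending, while the vector 39 seen earlier would be bit 7 of register 1. An all-zero isr1 gives -1, which is the "no vector found" case the panic complains about.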