:Or acknowledge the interrupt in the hardware before scheduling the
:ithread via a routine provided by the driver.

There are two things that need to be acknowledged.

(1) The APIC needs to be EOI'd to clear the interrupt so the APIC can deliver the next interrupt. If you don't do this, ALL interrupt sources stop working.

(2) The actual device is asserting a level interrupt. Just EOI'ing the interrupt the APIC delivers does not turn off the interrupt. The APIC will see that the device is still asserting the interrupt and immediately generate another event.

The actual device has to deassert the interrupt, but this means that you generally have to process the events from the device to accomplish that. You often can't process these events in the hard interrupt vector handling function (otherwise they'd simply be FAST interrupts, but since they aren't they need the interrupt thread's context to be properly processed). This means that you can't deassert the interrupt from the device source at the time you get the hardware interrupt.

Since the device interrupt cannot be deasserted at the time the actual interrupt occurs the only thing you can do is mask the interrupt in the APIC so the APIC stops dispatching it. The interrupt thread is then responsible for unmasking the interrupt in the APIC after it has finished processing the events from the device(s) and presumably cleared the interrupt at its source.

Even worse, every interrupting device manages its interrupt sources differently. There is no universal, generic way to clear a device interrupt... only the actual device driver knows how to deal with it and often that means actually processing the related device events before the interrupt can be cleared.

:> *BUT* it *IS* possible that the wrong APIC vector is being masked (and
:> not because of an interrupt alias, but because the actual hard interrupt
:> is misrouted).
:
:I don't think this is the case.
:Somehow the vector would have to get
:corrupted during this function call, which is line 609 in
:src/sys/i386/i386/local_apic.c:
:
:isrc = intr_lookup_source(apic_idt_to_irq(frame.if_vec));

The vector is not being corrupted at all. Just put that out of your mind... the APIC is working just fine.

The problem is most likely that the device is asserting the interrupt on the WRONG PIN. Since the wrong IRQ is asserted, the wrong APIC vector is dispatched, the wrong interrupt handler and ithread are run, and the source from the device that actually generated the interrupt is NOT cleared (because it isn't the device that the system thinks generated the interrupt).

:I would expect much wider aliasing or stray interrupt problems if this was
:occurring.

It's usually just one or possibly two devices that are mis-configured, mainly because the BIOS confusion is typically limited to particular devices. It depends heavily on the motherboard, the BIOS, what devices are enabled in the BIOS, and what devices the BIOS itself needs in order to boot (e.g. for PXE booting, a USB keyboard, etc).

:I'm convinced these "misrouted interrupts" are sourcing from the boot
:interrupt functionality. You don't route interrupts in APIC mode; it's a
:flat space. All of the APIC entries stack together as if they were one
:gigantic IOAPIC that every PCI device's INTx lines were attached to. This
:is the System Interrupts model described in the ACPI specification.
:
:--
:Doug White | FreeBSD: The Power to Serve
:dwhite_at_gumbysoft.com | www.FreeBSD.org

You do route interrupts in APIC mode. I wish it were a flat space! It isn't. I think you are forgetting a couple of things here:

* PCI busses only have 4 interrupt lines (A, B, C, and D).
* Motherboards often have anywhere from 3 to 6 PCI or PCI-like busses, connected to the APICs via bridge chips.
* The bridge chips have a limited number of IRQ pins.
* Sometimes you have several bridges connected to another bridge before it gets to the APIC.
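To make the bridge point concrete, here is a minimal sketch in C of the conventional PCI-to-PCI bridge INTx "swizzle", where a pin asserted behind a bridge shows up on the bridge's upstream side rotated by the device's slot number. This is an illustration only and is not taken from the thread or from any FreeBSD source: the names `swizzle_pin` and `route_through_bridges` are hypothetical, and motherboard designers are free to wire on-board devices and bridges differently, which is exactly where firmware routing tables tend to go wrong.

```c
#include <assert.h>
#include <stddef.h>

/* INTx pins are numbered 0..3 here, standing for INTA#..INTD#. */
enum { PCI_NUM_INTX = 4 };

/*
 * Conventional bridge swizzle (an assumption about the wiring, not a
 * guarantee): a device in slot `slot` on a bridge's secondary bus that
 * asserts `pin` is seen on the bridge's primary side as
 * (pin + slot) mod 4.
 */
unsigned swizzle_pin(unsigned pin, unsigned slot)
{
    assert(pin < PCI_NUM_INTX);
    return (pin + slot) % PCI_NUM_INTX;
}

/*
 * Follow a pin upward through `depth` bridges.  slots[0] is the slot of
 * the device itself, slots[1] is the slot of the bridge it sits behind
 * (as seen on the next bus up), and so on.  The result is the pin that
 * finally reaches the interrupt controller's side.
 */
unsigned route_through_bridges(unsigned pin, const unsigned *slots, size_t depth)
{
    for (size_t i = 0; i < depth; i++)
        pin = swizzle_pin(pin, slots[i]);
    return pin;
}
```

Under this rule, a device in slot 2 asserting INTA#, behind a bridge that itself sits in slot 3 of the next bus up, arrives at the top as INTB#: (0 + 2) mod 4 = 2, then (2 + 3) mod 4 = 1. However many devices sit below the bridges, only four lines come out the top, so sharing is unavoidable once there are more than four interrupt sources behind them.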
So the answer is... regardless of the capabilities of the APIC(s), devices still often have limited choices that require IRQ sharing simply due to the PCI BUS and BRIDGE configuration of the motherboard.

But even more to the point, BIOSes (ACPI, etc.) often get really confused about routing IRQs through bridges. They will for example believe that two devices that share a *PHYSICAL* IRQ line through a bridge are capable of being assigned different IRQs when, in fact, they aren't. They will get confused about how some of the PCI IRQ lines are routed to the bridges (so line 'B' on PCI bus #1 might be misconfigured, for example). All sorts of bad things can happen.

The only way for an operating system to figure this stuff out on its own is to understand the umpteen different bridge chips out there, test physical interrupt sources (which is not always possible) to see how they are actually routed, and ignore the BIOS completely.

Wasn't it something like NetBSD or OpenBSD that was thinking about doing that? Not trying to figure out the routing but instead just figure out which vector was being asserted for a device? I'm beginning to think that that may be the ONLY solution.

Intel really screwed up big time. Motorola had a much, much, MUCH better mechanism where the actual devices generated the actual vector number on the interrupt bus and the only thing you might have hardwired would have been the IPL. But Intel doesn't work that way. Their stuff is just totally screwed when it comes to handling interrupts. It's completely 100% guaranteed pungent crapola to anyone who has ever built hardware with a *REAL* interrupt subsystem.

-Matt
Matthew Dillon
<dillon_at_backplane.com>

Received on Mon Apr 11 2005 - 00:31:24 UTC