Re: Potential source of interrupt aliasing

From: Scott Long <scottl_at_samsco.org>
Date: Sun, 10 Apr 2005 23:53:33 -0600

Matthew Dillon wrote:
> :Or acknowledge the interrupt in the hardware before scheduling the
> :ithread via a routine provided by the driver.
> 
>     There are two things that need to be acknowledged.
> 
>     (1) The APIC needs to be EOI'd to clear the interrupt so the APIC
> 	can deliver the next interrupt.  If you don't do this, ALL 
> 	interrupt sources stop working.
> 
>     (2) The actual device is asserting a level interrupt.  Just EOI'ing
> 	the interrupt the APIC delivers does not turn off the interrupt.
> 	The APIC will see that the device is still asserting the interrupt
> 	and immediately generate another event.
> 
> 	The actual device has to deassert the interrupt, but this means
> 	that you generally have to process the events from the device
> 	to accomplish that.  You often can't process these events in the
> 	hard interrupt vector handling function (otherwise they'd simply
> 	be FAST interrupts, but since they aren't they need the interrupt
> 	thread's context to be properly processed).

Hogwash.  There is quite a bit of hardware that will allow you to
silence the interrupt source without actually processing the interrupt
data right away.  This is a key feature of the interrupt handling API
in Mac OSX, actually.  I even have the AAC driver operate in this
fashion for FreeBSD.
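
A minimal sketch of that pattern, assuming a made-up device: the fast
handler only quiets the interrupt at the device, and the event data is
processed later in deferred context. Everything named fake_dev here,
including the register layout, is hypothetical and matches no real
hardware or the AAC driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model, invented for illustration only. */
struct fake_dev {
	uint32_t status;      /* pending event bits, set by "hardware" */
	uint32_t intr_enable; /* device-level interrupt enable mask */
	bool     task_queued; /* deferred work has been scheduled */
	unsigned handled;     /* events consumed by the deferred work */
};

/* The level-triggered line is asserted while an enabled event is pending. */
static bool
fake_dev_line_asserted(const struct fake_dev *d)
{
	return ((d->status & d->intr_enable) != 0);
}

/* Stand-in for the hardware posting new events. */
static void
fake_dev_raise(struct fake_dev *d, uint32_t bits)
{
	d->status |= bits;
}

/*
 * Fast handler: silence the source at the device and defer the real
 * work.  Nothing is masked in the APIC.  Returns false if the
 * interrupt was not ours (shared line).
 */
static bool
fake_dev_fast_intr(struct fake_dev *d)
{
	if (!fake_dev_line_asserted(d))
		return (false);
	d->intr_enable = 0;     /* device deasserts its line */
	d->task_queued = true;  /* stand-in for scheduling a thread/task */
	return (true);
}

/* Deferred work: consume the events, then re-enable device interrupts. */
static void
fake_dev_task(struct fake_dev *d)
{
	assert(d->task_queued);
	while (d->status != 0) {
		d->status &= d->status - 1; /* clear lowest pending bit */
		d->handled++;
	}
	d->task_queued = false;
	d->intr_enable = UINT32_MAX;
}
```

The point of the split is that the line is already deasserted when the
fast handler returns, so the pending events stay queued in the device
and the intpin never has to be masked while the source is asserted.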

> 
> 	This means that you can't deassert the interrupt from the device
> 	source at the time you get the hardware interrupt.
> 
>     Since the device interrupt cannot be deasserted at the time the actual
>     interrupt occurs the only thing you can do is mask the interrupt in the
>     APIC so the APIC stops dispatching it.

I'm not sure if you've actually read and comprehended Doug's mail.  The
'boot interrupt', as documented in the Intel manuals, comes from masking
the real intpin in the APIC while the source is asserted.  This whole
email thread is about figuring out a way to service interrupts without
triggering this 'reroute-on-apic-mask-event' feature of the chipset.

> 
>     The interrupt thread is then responsible for unmasking the interrupt
>     in the APIC after it has finished processing the events from the
>     device(s) and presumably cleared the interrupt at its source.

This is How Things Work already.

> 
>     Even worse, every interrupting device manages its interrupt sources
>     differently.  There is no universal, generic way to clear a device
>     interrupt... only the actual device driver knows how to deal with it
>     and often that means actually processing the related device events
>     before the interrupt can be cleared.

See above.  You're making a blanket statement that is incorrect for most
PCI hardware.

> 
> :>     *BUT* it *IS* possible that the wrong APIC vector is being masked (and
> :>     not because of an interrupt alias, but because the actual hard interrupt
> :>     is misrouted).
> :
> :I don't think this is the case. Somehow the vector would have to get
> :corrupted during this function call, which is line 609 in
> :src/sys/i386/i386/local_apic.c:
> :
> :isrc = intr_lookup_source(apic_idt_to_irq(frame.if_vec));
> 
>     The vector is not being corrupted at all.  Just put that out of your
>     mind... the APIC is working just fine.  The problem is most likely
>     that the device is asserting the interrupt on the WRONG PIN.  Since
>     the wrong IRQ is asserted, the wrong APIC vector is dispatched, the
>     wrong interrupt handler and ithread is run, and the source from the
>     device that actually generated the interrupt is NOT cleared (because
>     it isn't the device that the system thinks generated the interrupt).
> 

No, you are absolutely wrong here.  I can speak from authority because
Doug is testing on hardware that sits in my basement.  When an interrupt
fires on em0 or ahd0, it shows up on the irq for both the actual source
and for irq16, which just happens to be uhci0.  There is no misrouting
at all; it's that the interrupt is showing up in two places.  With a
little detective work, Doug discovered that it shows up on irq16 only
after the real intpin has been masked in the APIC.  Again, please
re-read and comprehend his emails.
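
The observed behaviour can be written down as a toy model. This is only
a sketch of what the thread describes, not of how the chipset
implements it: the struct, the function name, and the pin count are
assumptions, and the model reports a single IRQ where the real system
also has the original pin pending.

```c
#include <assert.h>
#include <stdbool.h>

#define NPINS    24 /* assumed size of the modelled IOAPIC */
#define BOOT_IRQ 16 /* irq16, where the alias shows up in this thread */

/* Toy IOAPIC state: one mask bit and one input level per pin. */
struct ioapic_model {
	bool masked[NPINS];
	bool asserted[NPINS];
};

/*
 * Return the IRQ on which an asserted source wired to `pin` is seen,
 * or -1 if the source is not asserting.  While the pin is unmasked the
 * interrupt arrives on its own IRQ; once the pin is masked with the
 * source still asserted, it is seen on the boot interrupt instead.
 */
static int
delivered_irq(const struct ioapic_model *io, int pin)
{
	assert(pin >= 0 && pin < NPINS);
	if (!io->asserted[pin])
		return (-1);
	if (io->masked[pin])
		return (BOOT_IRQ);
	return (pin);
}
```

In this model the alias disappears either when the source deasserts or
when the pin is never masked while asserted, which is why silencing the
device first is attractive.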

> :I would expect much wider aliasing or stray interrupt problems if this was
> :occurring.
> 
>     It's usually just one or possibly two devices that are mis-configured,
>     mainly because the BIOS confusion is typically limited to particular
>     devices.  It depends heavily on the motherboard, BIOS, what devices
>     are enabled in the BIOS, and what devices the BIOS itself needs (e.g.
>     for PXE booting, USB keyboard, booting, etc) to boot.
> 
> :I'm convinced these "misrouted interrupts" are sourcing from the boot
> :interrupt functionality.  You don't route interrupts in APIC mode; it's a
> :flat space. All of the APIC entries stack together as if they were one
> :gigantic IOAPIC that every PCI device's INTx lines were attached to. This
> :is the System Interrupts model described in the ACPI specification.
> :
> :-- 
> :Doug White                    |  FreeBSD: The Power to Serve
> :dwhite_at_gumbysoft.com          |  www.FreeBSD.org
> 
>     You do route interrupts in APIC mode.  I wish it were a flat space!  It
>     isn't.
> 
>     I think you are forgetting a couple of things here:
> 
>     * PCI busses only have 4 interrupt lines (A, B, C, and D).
> 
>     * Motherboards often have anywhere from 3 to 6 PCI or PCI-like busses,
>       connected to the APICs via bridge chips.
> 
>     * The bridge chips have a limited number of IRQ pins.
> 
>     * Sometimes you have several bridges connected to another bridge
>       before it gets to the APIC.
> 
>     So the answer is... regardless of the capabilities of the APIC(s)
>     devices still often have limited choices that require IRQ sharing
>     simply due to the PCI BUS and BRIDGE configuration of the motherboard.
> 
>     But even more to the point, BIOSes (ACPI, etc.) often get really
>     confused about routing IRQs through bridges.  They will for example
>     believe that two devices that share a *PHYSICAL* IRQ line through a
>     bridge are capable of being assigned different IRQs when, in fact,
>     they aren't.  They will get confused about how some of the PCI IRQ
>     lines are routed to the bridges (so line 'B' on PCI bus #1 might be 
>     misconfigured, for example).  All sorts of bad things can happen.
> 
>     The only way for an operating system to figure this stuff out on its
>     own is to understand the umpteen different bridge chips out there,
>     test physical interrupt sources (which is not always possible) to see
>     how they are actually routed, and ignore the BIOS completely.

One of the design goals of the APIC and PIC routing is to make it honor
the user-selected routing that comes from the BIOS.  There are a number
of BIOSes that allow the user to choose which INTx line will be routed to
a particular PCI slot or embedded device.  It would be a big POLA
problem to ignore these hints and route purely based on what we think
is right.

The problem here is that certain Intel chipsets, and maybe others, have
a special compatibility mode for routing interrupts in a way that is
easy for Option ROM and DOS drivers to deal with.  It's called the 'boot
interrupt' according to the Intel docs.  We need to fully understand how
this works so we can deal with it correctly.  That is the point of the
thread.  We already know how the MP specs and ACPI specs work with
regard to traditional interrupt routing.
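
For reference, one piece of that traditional routing is the
conventional PCI-to-PCI bridge rule: when firmware supplies no routing
entry for a device behind a bridge, its INTx pin is rotated by the
device's slot number to find the pin on the parent bus. A minimal
sketch, with the enum and function name chosen here for illustration:

```c
#include <assert.h>

/* INTx pins numbered from zero for the modular arithmetic below. */
enum intx { INTA = 0, INTB, INTC, INTD };

/*
 * Conventional bridge "swizzle": the pin seen on the parent side of a
 * PCI-to-PCI bridge for a device in `slot` using `pin` on the child bus.
 */
static enum intx
bridge_swizzle(unsigned slot, enum intx pin)
{
	assert((unsigned)pin <= (unsigned)INTD);
	return ((enum intx)((slot + (unsigned)pin) % 4));
}
```

Applied once per bridge level, this is also why devices behind bridges
commonly end up sharing one of only four lines.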

> 
>     Wasn't it something like NetBSD or OpenBSD that was thinking about
>     doing that?  Not trying to figure out the routing but instead just
>     figure out which vector was being asserted for a device?  I'm beginning
>     to think that that may be the ONLY solution.
> 
>     Intel really screwed up big time.  Motorola had a much, much, MUCH
>     better mechanism where the actual devices generated the actual vector
>     number on the interrupt bus and the only thing you might have hardwired
>     would have been the IPL.  But Intel doesn't work that way.  Their stuff
>     is just totally screwed when it comes to handling interrupts.  It's
>     completely 100% guaranteed pungent crapola to anyone who has ever
>     built hardware with a *REAL* interrupt subsystem.

See also: sbus(4), msi(4).

MSI is something that I'd like to work on, but simply haven't had the time.
It's not a panacea since it will only work for MSI-enabled PCI devices,
but many peripherals found on these Intel systems fall into that
category.

Scott
Received on Mon Apr 11 2005 - 03:56:43 UTC
