Re: Potential source of interrupt aliasing

From: Matthew Dillon <dillon_at_apollo.backplane.com> Date: Mon, 11 Apr 2005 00:31:55 -0700 (PDT)

:...
:that I mentioned precisely because we don't mask the IOAPIC for fast
:handlers.  Unfortunately, moving the entire OS to this scheme is
:quite labor-intensive.  It would make just as much sense to implement
:MSI infrastructure and convert a number of drivers to that.  And again,
:Linux seems immune to this problem, so it's very intriguing to find out
:why.
:
:Scott

    See kernel/io_apic.c around line 1829 (linux 2.6.9), and the whole
    file in general.

    It appears that they simply do not EOI the APIC when handling a
    level triggered interrupt until after the interrupt handler has
    run.  It looks like they may still be vulnerable due to the way they
    shut down an interrupt (but by then the device is presumably not
    asserting interrupts any more).  But for normal interrupt operation
    they simply hold off the EOI until the handler is done.

    I love the last sentence of this comment... OMG!  They have got to be
    kidding.

[ comment from linux source ]:

/*
 * Level triggered interrupts can just be masked,
 * and shutting down and starting up the interrupt
 * is the same as enabling and disabling them -- except
 * with a startup need to return a "was pending" value.
 *
 * Level triggered interrupts are special because we
 * do not touch any IO-APIC register while handling
 * them. We ack the APIC in the end-IRQ handler, not
 * in the start-IRQ-handler. Protection against reentrance
 * from the same interrupt is still provided, both by the
 * generic IRQ layer and by the fact that an unacked local
 * APIC does not accept IRQs.
 */

    They also have a workaround for a chip erratum which is even
    nastier... it's just after that comment.  It looks like when they
    detect the erratum's condition (a level interrupt that got
    delivered as an edge trigger) they change the trigger mode to edge
    triggered and then change it back to level.

					-Matt
					Matthew Dillon 
					<dillon_at_backplane.com>

[ more comments from the linux code ]:
/*
 * It appears there is an erratum which affects at least version 0x11
 * of I/O APIC (that's the 82093AA and cores integrated into various
 * chipsets).  Under certain conditions a level-triggered interrupt is
 * erroneously delivered as edge-triggered one but the respective IRR
 * bit gets set nevertheless.  As a result the I/O unit expects an EOI
 * message but it will never arrive and further interrupts are blocked
 * from the source.  The exact reason is so far unknown, but the
 * phenomenon was observed when two consecutive interrupt requests
 * from a given source get delivered to the same CPU and the source is
 * temporarily disabled in between.
 *
 * A workaround is to simulate an EOI message manually.  We achieve it
 * by setting the trigger mode to edge and then to level when the edge
 * trigger mode gets detected in the TMR of a local APIC for a
 * level-triggered interrupt.  We mask the source for the time of the
 * operation to prevent an edge-triggered interrupt escaping meanwhile.
 * The idea is from Manfred Spraul.  --macro
 */