Re: 7-CURRENT-SNAP009-i386-bootonly.iso on Shuttle XPC w/ AMD X2 (was Re: Side note on Shuttle XPC)

From: Matthew Dillon <dillon_at_apollo.backplane.com>
Date: Sun, 20 Nov 2005 00:06:28 -0800 (PST)
:...
:> a spurious ICU interrupt.  I have part of peter's hack expanded to do a full 
:> reset of the ICUs, and I'll update it for Monday to adjust the base interrupt 
:> such that the spurious ICU vectors get sent to the APIC spurious interrupt 
:> vector.  That should fix your issue as well as the same issue reported by 
:> someone else on the amd64_at_ list recently.
:> 
:
:Does this imply that the 'correct' fix involves catching the stray ICU 
:interrupt via a trap handler?  How often do these interrupts happen,
:and therefore what is the performance consequence to having to handle
:them?
:
:Scott

    I think John has the right fix in mind.  You have to catch the stray
    interrupt vector for every interrupt controller in the system.  This
    means the 8259 stray vector AND the LAPIC stray vector, even if one or
    both devices are completely disabled.

    Whether this represents a performance problem depends on the situation.
    If any interrupts are routed through the 8259 at all then the BIOS
    misprogramming bug I mentioned earlier will result in each real 
    interrupt also causing a stray interrupt (due to the double INT A cycle).
    Clearly this is not desirable.  If the 8259 is 100% disabled I think
    the duplicate stray interrupts will go away.

    Even under perfect conditions a stray interrupt can occur during
    programming or reprogramming of the 8259.  This would not cause a
    performance issue, just result in an occasional stray.  For example,
    if the 8259 issues an IRQ to the cpu and the IRQ source is masked
    while the cpu is doing an INT A cycle, the 8259 will return the stray
    interrupt vector.
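
    A toy model of that race, as a sketch only: the struct and function
    names below are hypothetical, and the model ignores edge/level mode,
    the slave, and EOI handling.  It shows how masking a pending IRQ
    before the INT A cycle leaves the 8259 with nothing to report except
    BASE+7, with no ISR bit set.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified model of a single 8259. */
struct icu_model {
    uint8_t base;   /* vector base programmed via ICW2 */
    uint8_t irr;    /* interrupt request register */
    uint8_t imr;    /* interrupt mask register */
    uint8_t isr;    /* in-service register */
};

/* Vector the model hands back for one INT A cycle. */
static uint8_t
icu_model_inta(struct icu_model *icu)
{
    uint8_t pending = (uint8_t)(icu->irr & ~icu->imr);
    int irq;

    for (irq = 0; irq < 8; irq++) {         /* IRQ0 has highest priority */
        if (pending & (1u << irq)) {
            icu->irr &= (uint8_t)~(1u << irq);
            icu->isr |= (uint8_t)(1u << irq);
            return (uint8_t)(icu->base + irq);
        }
    }
    /* The request vanished (e.g. it was masked): stray vector, ISR untouched. */
    return (uint8_t)(icu->base + 7);
}
```

    With base 0x20 and IRQ1 pending, the model returns 0x21 and sets ISR
    bit 1.  If IRQ1 is masked first, it returns 0x27 and the ISR stays zero.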

    With regard to the LAPICs the story is slightly better.  The stray
    interrupt vector can be programmed into the LAPIC and the interrupt
    service routine basically doesn't have to do a thing, not even EOI.
    A stray LAPIC interrupt can occur in a number of situations but I
    do not believe any of them would result in the same braindamage that
    you get from broken 8259 routing.   One example of stray generation here
    would be if you changed the TPR while the LAPIC is responding to the
    cpu's INT A cycle.
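
    A minimal sketch of the LAPIC side, assuming the usual layout of the
    spurious interrupt vector register (vector in bits 0-7, software
    enable in bit 8).  The function names and the choice of 0xff are
    assumptions for illustration, not the code from any particular kernel.
    0xff is a common choice because some older processors hardwire the
    low four bits of the spurious vector to 1.

```c
#include <stdint.h>

#define APIC_SVR_VECTOR_MASK 0x000000ffu  /* bits 0-7: spurious vector */
#define APIC_SVR_ENABLE      0x00000100u  /* bit 8: APIC software enable */
#define SPURIOUS_VECTOR      0xffu        /* assumed choice of vector */

/* Compute the new SVR contents, preserving the bits we do not own. */
static uint32_t
lapic_svr_value(uint32_t old_svr, uint8_t vector)
{
    return (old_svr & ~APIC_SVR_VECTOR_MASK) | vector | APIC_SVR_ENABLE;
}

static unsigned long lapic_stray_count;

/* Hypothetical handler body: count the event and return, with no EOI. */
static void
lapic_spurious_handler(void)
{
    lapic_stray_count++;
}
```

    The caller would write the result of lapic_svr_value() back to the SVR
    and point the IDT entry for SPURIOUS_VECTOR at a stub that calls the
    handler and then returns from the interrupt.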

    One thing this does imply is that we should never, ever overlap the
    8259 interrupt vector space with the LAPIC vector space.  I wonder if
    the LAPIC EOI lockup issue might be explained by an 8259 returning its
    stray vector that is misinterpreted as an LAPIC interrupt.  Since there
    is no way to determine what IRQ an LAPIC EOI is actually servicing
    (except by checking the ISR to see what bit actually got cleared), any
    sort of misinterpretation will result in disaster.  That means I have
    some work to do in DragonFly, which is still using the separate FAST/SLOW
    vector code with the LAPIC 'SLOW' interrupts overlapping the 8259
    vector space.
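
    The no-overlap rule can be checked mechanically.  The sketch below
    assumes the two 8259s occupy 16 contiguous vectors starting at their
    base; the helper name and the example ranges are hypothetical.

```c
#include <stdbool.h>

#define ICU_VECTOR_BASE  0x20u  /* assumed 8259 base, master + slave contiguous */
#define ICU_VECTOR_COUNT 16u

/* True if [a_first, a_first + a_count) intersects [b_first, b_first + b_count). */
static bool
vectors_overlap(unsigned a_first, unsigned a_count,
                unsigned b_first, unsigned b_count)
{
    return a_first < b_first + b_count && b_first < a_first + a_count;
}
```

    For example, an LAPIC range starting at 0x20 collides with the 8259
    range above, while one starting at 0x30 does not.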

    The 8259's stray interrupt vector is BASE+7 (usually 0x20 + 7).  I
    suspect that BASE+15 might also occur sometimes.  The only way to
    completely avoid getting stray 8259 vectors would be to *NEVER* mess
    with the interrupt masks.  I don't think that CLI/STI would work here;
    the INT A cycle is almost guaranteed to be decoupled from the
    instruction stream.  In fact, at least on the AMD, the HyperTransport
    layer will do the cycle and queue a pending vector until it can be
    delivered to the cpu (from my read).
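
    One conventional way to tell a stray BASE+7 or BASE+15 from a real
    IRQ 7 or IRQ 15 is to look at bit 7 of the corresponding in-service
    register.  The sketch below assumes the slave sits at BASE+8 and
    takes the ISR values as parameters so the logic stands on its own;
    the names are hypothetical.

```c
#include <stdint.h>

#define ICU_MASTER_BASE 0x20u                   /* assumed, per the text */
#define ICU_SLAVE_BASE  (ICU_MASTER_BASE + 8u)  /* assumed contiguous */

enum stray_kind { NOT_STRAY, STRAY_MASTER, STRAY_SLAVE };

/* Classify a delivered vector given the current ISR of each 8259. */
static enum stray_kind
classify_8259_vector(uint8_t vector, uint8_t master_isr, uint8_t slave_isr)
{
    /* A real IRQ 7 leaves ISR bit 7 set on the master; a stray does not. */
    if (vector == ICU_MASTER_BASE + 7u && (master_isr & 0x80u) == 0)
        return STRAY_MASTER;
    /* Same test for IRQ 15 on the slave. */
    if (vector == ICU_SLAVE_BASE + 7u && (slave_isr & 0x80u) == 0)
        return STRAY_SLAVE;
    return NOT_STRAY;
}
```

    A real handler would obtain the ISR values by issuing OCW3 (0x0b) to
    each 8259 command port and reading it back.  A stray on the master
    needs no EOI.  A stray on the slave normally still needs an EOI sent
    to the master, since the master has the cascade input in service.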

    That is clearly a problem since we pretty much have to mess with the
    masks to deal with level interrupt sources.  The alternative is to
    disable the 8259 completely, which is the solution John mentioned to me.

					-Matt
					Matthew Dillon 
					<dillon_at_backplane.com>
Received on Sun Nov 20 2005 - 07:06:55 UTC
