patch for "Previous IPI is stuck" - please test

From: Stephan Uphoff <ups_at_tree.com>
Date: Mon, 15 Nov 2004 16:27:00 -0500
There have been several complaints about "Previous IPI is stuck" panics
on i386-based multiprocessor systems. In general, all affected systems
seem to have four or more real (non-HTT) processors.

Probable cause:
The local APIC used for IPIs can only queue two interrupts per interrupt
priority class (interrupt vector / 16). Since all IPIs share the same
interrupt priority class, more than two IPIs pending to the same
processor will fill the interrupt fifo for the IPI priority class.
I believe the "Previous IPI is stuck" panic is a deadlock: one CPU sends
an AST IPI with sched lock held to a CPU that is trying to acquire the
sched lock with interrupts disabled and with a full interrupt fifo.

Unfortunately, I cannot reproduce the problem on my dual Xeon with HTT
enabled :-(

To test the theory, I wrote a patch that replaces the multiple IPI
interrupt handlers with a single handler and uses a bitmap to avoid
queuing redundant IPI interrupt requests in the interrupt fifo.
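
The idea can be sketched in portable C11 roughly as follows. This is an
illustration of the bitmap technique only, not the code in the patch:
the IPI type names, struct cpu_ipi_state, ipi_post() and ipi_dispatch()
are all hypothetical, and the counters stand in for the real per-type
handlers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical IPI types, one bit each in the per-CPU bitmap. */
enum { IPI_AST = 0, IPI_TLB = 1, IPI_STOP = 2, IPI_COUNT };

struct cpu_ipi_state {
	atomic_uint pending;         /* bit n set => IPI type n requested */
	unsigned handled[IPI_COUNT]; /* stand-in for the real handlers */
};

/*
 * Sender side: record the request in the target's bitmap. Returns true
 * only if the bitmap was empty before, i.e. only then does the caller
 * need to raise the (single) hardware IPI vector.
 */
static bool
ipi_post(struct cpu_ipi_state *cpu, unsigned type)
{
	unsigned old;

	assert(type < IPI_COUNT);
	old = atomic_fetch_or(&cpu->pending, 1u << type);
	return (old == 0);
}

/*
 * Receiver side: the single handler grabs and clears all requests at
 * once, then dispatches each one. Returns the number dispatched.
 */
static unsigned
ipi_dispatch(struct cpu_ipi_state *cpu)
{
	unsigned bits = atomic_exchange(&cpu->pending, 0);
	unsigned n = 0;

	for (unsigned t = 0; t < IPI_COUNT; t++) {
		if (bits & (1u << t)) {
			cpu->handled[t]++;
			n++;
		}
	}
	return (n);
}
```

Because a sender raises the hardware interrupt only when it turns the
bitmap from empty to non-empty, at most one IPI can be in service and
one more pending per target CPU in this model, which stays within the
two-entry limit.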

The patch is a proof of concept and therefore not optimized (to put it
mildly ;-). It probably increases the cost for TLB shootdown IPIs
substantially.

You can download the patch at:
	http://people.freebsd.org/~ups/ipi4_patch
Please make sure that exception.o is rebuilt.
(The Makefile seems to be missing the dependency on apic_vector.s.)

Any feedback is appreciated.

	Stephan
Received on Mon Nov 15 2004 - 20:27:06 UTC
