SIO Interrupt storms and unhandled interrupts

From: Mike Tancsa <mike_at_sentex.net> Date: Wed, 08 Sep 2004 19:25:48 -0400

I think we have been bouncing around this issue for the past few months 
both on RELENG_4 and now RELENG_5.  In the past it has been somewhat 
difficult to reproduce, but now we can do it reliably.  I don't think it's
a hardware issue, as I can take the exact same 2 boxes with the exact same
IRQ assignments, boot OpenBSD, and not run into an interrupt storm
or freeze up the box.  Swap back the RELENG_4 or RELENG_5 HD and again, I 
can produce an interrupt storm at will.

I can also reproduce it on 2 different chipsets (VIA and
Intel).  The problem seems to be in how a PUC device (either a PCI
modem or a PCI serial card) shares an interrupt with another device
(usually a USB controller, but not always).

On RELENG_4, the box just locks up, apparently in a race trying to service
an interrupt on IRQ 12 that remains unhandled.

On RELENG_5, I actually catch an interrupt storm, e.g. I attach to sio4
(PUC modem) and get

Interrupt storm detected on "irq12: uhci1"; throttling interrupt source

Looking at vmstat -i does indeed show the rate getting throttled

releng-5-pioneer# vmstat -i
interrupt                          total       rate
irq0: clk                         596719         99
irq1: atkbd0                           2          0
irq4: sio0                          1079          0
irq6: fdc0                             1          0
irq8: rtc                         763812        127
irq12: uhci1                        5825          0
irq13: npx0                            1          0
irq14: ata0                        38727          6
irq15: vr0 ata1                     1984          0
Total                            1408150        235
releng-5-pioneer#

where irq12 is the IRQ shared by the modem and the USB port.  However,
because everything on IRQ 12 gets throttled, the modem is unusable, e.g.
running cu -l /dev/cuaa4 and typing atz takes about 5 seconds.

Is there some way to safely tell the kernel that the PUC device's
interrupt is shareable?  We did this, perhaps very ugly, hack on RELENG_4

@@ -1431,15 +1431,19 @@

         rid = 0;
         com->irqres = bus_alloc_resource(dev, SYS_RES_IRQ, &rid, 0ul, ~0ul, 1,
-           RF_ACTIVE);
+/*         RF_ACTIVE); */
+           RF_SHAREABLE);

to /usr/src/sys/isa/sio.c

and at least we can talk to the sio device.  However, on RELENG_5 there 
does not seem to be the same fix.
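
One detail of the hack above is that it swaps RF_ACTIVE for RF_SHAREABLE
rather than passing both.  The flags argument is a bitmask, so the two can be
OR'ed together.  The following is only a minimal userland sketch of that
bit arithmetic: the two #define values are stand-ins for illustration (the
real ones come from <sys/rman.h>), and has_flags(), describe() and demo()
are hypothetical helpers, not kernel APIs.  It does not show whether passing
both flags changes the storm or the lockup described here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Stand-in values for illustration only (assumption); a real kernel
 * build takes these definitions from <sys/rman.h>.
 */
#define RF_ACTIVE    0x0002u
#define RF_SHAREABLE 0x0004u

/* Hypothetical helper: nonzero if every bit in `want` is set in `flags`. */
static int has_flags(unsigned int flags, unsigned int want)
{
    return (flags & want) == want;
}

/* Hypothetical helper: print which of the two flags a mask carries. */
static void describe(const char *label, unsigned int flags)
{
    printf("%-22s active=%d shareable=%d\n", label,
        has_flags(flags, RF_ACTIVE), has_flags(flags, RF_SHAREABLE));
}

/* Hypothetical demo: the three flag masks discussed in the text. */
static void demo(void)
{
    /* The combined mask must carry both bits. */
    assert(has_flags(RF_ACTIVE | RF_SHAREABLE, RF_ACTIVE));
    assert(has_flags(RF_ACTIVE | RF_SHAREABLE, RF_SHAREABLE));

    describe("original (RF_ACTIVE)", RF_ACTIVE);
    describe("hack (RF_SHAREABLE)", RF_SHAREABLE);
    describe("both OR'ed together", RF_ACTIVE | RF_SHAREABLE);
}
```

Calling demo() from a small main() prints one line per mask.  The point is
only that the mask from the hack no longer carries the RF_ACTIVE bit, while
the OR'ed mask carries both.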

My question is this: is the root cause the same on RELENG_4 and
RELENG_5?  Are we going about fixing the problem the best way?  Or is
the underlying problem something else?

Attached is a dmesg and acpidump

         ---Mike

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike_at_sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike