Re: Interrupt storm on cardbus device

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 2 Mar 2009 11:15:20 -0500
On Friday 27 February 2009 8:35:59 pm Robert Noland wrote:
> On Fri, 2009-02-27 at 15:02 -0500, John Baldwin wrote:
> > On Friday 27 February 2009 2:11:04 pm Robert Noland wrote:
> > > On Fri, 2009-02-27 at 14:03 -0500, John Baldwin wrote:
> > > > On Friday 27 February 2009 1:50:28 pm Robert Noland wrote:
> > > > > On Fri, 2009-02-27 at 12:08 -0500, John Baldwin wrote:
> > > > > > On Friday 27 February 2009 9:30:06 am Sergey G Nasonov wrote:
> > > > > > > Hello all,
> > > > > > > > I ran into an issue after a recent kernel recompile.
> > > > > > > > The problem appears after switching from X to a text console
> > > > > > > > and back to X11.
> > > > > > > > After that, vmstat -i shows an interrupt storm on the cardbus
> > > > > > > > device:
> > > > > > > 
> > > > > > > > vmstat -i
> > > > > > > interrupt                          total       rate
> > > > > > > irq1: atkbd0                        6483          3
> > > > > > > irq9: acpi0                         3236          1
> > > > > > > irq12: psm0                       347988        167
> > > > > > > irq14: ata0                        16431          7
> > > > > > > irq16: cbb0 uhci2+              13624982       6556
> > > > > > > irq20: uhci0                          14          0
> > > > > > > irq22: ehci0                           2          0
> > > > > > > cpu0: timer                      4154687       1999
> > > > > > > irq256: em0                        53736         25
> > > > > > > irq257: hdac0                       5797          2
> > > > > > > cpu1: timer                      4153683       1998
> > > > > > > irq258: vgapci0                   235585        113
> > > > > > > Total                           22602624      10877
> > > > > > > 
> > > > > > > > I suppose the issue is related to the latest MSI interrupt
> > > > > > > > handler changes for the Intel graphics chipset. My laptop has
> > > > > > > > an i965GM.
> > > > > > > pciconf -lv:
> > > > > > > 
> > > > > > > > vgapci0@pci0:0:2:0:     class=0x030000 card=0x20b517aa chip=0x2a028086 rev=0x0c hdr=0x00
> > > > > > > >     vendor     = 'Intel Corporation'
> > > > > > > >     device     = 'Mobile 965 Express Integrated Graphics Controller'
> > > > > > >     class      = display
> > > > > > >     subclass   = VGA
> > > > > > > 
> > > > > > > > When I added my device to drm_msi_blacklist and recompiled the
> > > > > > > > drm modules, the problem disappeared.
> > > > > > > > Is it possible to resolve this problem without adding the device
> > > > > > > > to drm_msi_blacklist?
> > > > > > > > I can test any patches or provide additional detail if
> > > > > > > > required.
> > > > > > > > Thanks.
> > > > > > 
> > > > > > > It seems the device is still interrupting on its INTx line,
> > > > > > > perhaps in addition to the MSI interrupts.
> > > > > 
> > > > > > Hrm, I did almost all of that development on a 965gm.  When you VT
> > > > > > switch, the irq handler gets uninstalled, then reinstalled when you
> > > > > > return to X.  There was an erratum on the 965gm suggesting that msi
> > > > > > didn't work right, but I was never able to reproduce the issue.
> > > > > > Intel was having major issues with this on Linux and I finally
> > > > > > convinced them to turn msi back on.  My irq handler and Eric's are
> > > > > > very similar, so I'm not sure what could be going on here.
> > > > > 
> > > > > > There is however an issue with vblanks that might be related.
> > > > > > Could you try
> > > > > > http://people.freebsd.org/~rnoland/drm-move_vblank_init.patch
> > > > > > and see if that helps?
> > > > 
> > > > > In this case I think the issue isn't that MSI isn't working, but
> > > > > that the hardware is sending interrupts via both routes (MSI and
> > > > > INTx).  If that happens, then you will see an interrupt storm on the
> > > > > INTx line, but FreeBSD will only notice if another device is sharing
> > > > > the same IRQ line.  So if your test machine has vgapci0 on irq 22 and
> > > > > you have no other devices on IRQ 22, then the storm would go
> > > > > unnoticed.  This is most likely a chip bug (unless the driver has to
> > > > > explicitly disable INTx interrupts when using MSI).  It would
> > > > > probably be a good idea to add a hw.drm.msi_enable tunable (or
> > > > > hw.drm.msi) that people can use to disable MSI if needed.
> > > 
> > > Ok, I do have docs on the 965, so I'll look at this.  The Linux version
> > > does not do this, unless the OS does it in the background somewhere.
> 
> Ok, so I looked over the 965 docs again and noticed PCIR_COMMAND bit 10.
> Then I pulled up the AMD docs on their PCIE cards and they also have
> this bit.  I made a test patch for just the i915 driver to ensure that
> this fixes the issue, but it seems like a more general fix is in order.
> I'm proposing to disable INTx when we set up MSI/MSI-X interrupts.  I
> talked with scottl_at_ about this a bit last night and this seems like the
> right thing to do, or at least it shouldn't hurt much...
> 
> John, what do you think of the attached patch?

Looks good and is something that was on my low-priority todo list. :)
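
For anyone following along: PCIR_COMMAND bit 10 is the PCI command
register's Interrupt Disable bit, which FreeBSD's pcireg.h names
PCIM_CMD_INTxDIS.  Below is a minimal userland sketch of the
read-modify-write that disabling INTx amounts to.  It is not the attached
patch.  struct fake_pci_dev, cfg_read16(), cfg_write16() and disable_intx()
are hypothetical stand-ins invented for illustration; a real driver would
presumably use pci_read_config()/pci_write_config() on its device_t
instead.

```c
#include <stdint.h>

/* Register offset and bit value; names mirror FreeBSD's <dev/pci/pcireg.h>. */
#define PCIR_COMMAND      0x04     /* 16-bit command register */
#define PCIM_CMD_INTxDIS  0x0400   /* bit 10: disable legacy INTx assertion */

/* Hypothetical stand-in for a device's 256-byte PCI config space. */
struct fake_pci_dev {
    uint8_t cfg[256];
};

/* Read a little-endian 16-bit config register (illustrative helper). */
static uint16_t cfg_read16(const struct fake_pci_dev *d, int reg)
{
    return (uint16_t)(d->cfg[reg] | (d->cfg[reg + 1] << 8));
}

/* Write a little-endian 16-bit config register (illustrative helper). */
static void cfg_write16(struct fake_pci_dev *d, int reg, uint16_t val)
{
    d->cfg[reg] = (uint8_t)(val & 0xff);
    d->cfg[reg + 1] = (uint8_t)(val >> 8);
}

/*
 * Set the INTx disable bit while leaving every other command bit alone.
 * Assumption: this would be called once MSI/MSI-X has been set up, so the
 * device stops signalling on its legacy interrupt line as well.
 */
static void disable_intx(struct fake_pci_dev *d)
{
    uint16_t cmd = cfg_read16(d, PCIR_COMMAND);

    cmd |= PCIM_CMD_INTxDIS;
    cfg_write16(d, PCIR_COMMAND, cmd);
}
```

With a command register that starts out as 0x0007 (I/O, memory and bus
master enabled), calling disable_intx() leaves it at 0x0407, and calling it
a second time changes nothing.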

-- 
John Baldwin
Received on Mon Mar 02 2009 - 17:31:32 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:43 UTC