Re: Interrupt storm with MSI in combination with em1

From: Jack Vogel <jfvogel_at_gmail.com>
Date: Wed, 4 May 2011 15:15:43 -0700
This all looks completely kosher. What IRQ is the storm on?

Jack


On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken <Daan_at_vehosting.nl> wrote:

> Hi,
>
> On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
> > Will you please set it back to a default and then boot and capture the
> > message for me?
>
> No problem. Here's the output with MSI/MSIX enabled :
>
> http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt
>
> I've also added the output of "vmstat -i" a couple of minutes after a
> reboot with MSI enabled :
>        http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt
>
> Note that in the above "vmstat -i" dump the interrupt storm hasn't started
> yet. For some reason the storm doesn't always start directly at boot. I
> haven't been able (yet) to pinpoint what's triggering it to start.
>
>
> > On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken <Daan_at_vehosting.nl> wrote:
> > > Hi Jack,
> > >
> > > Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
> > > > Who makes your motherboard? The problem you are having is that MSIX AND
> > > > MSI are both failing as em0 comes up, so it falls back to Legacy
> > > > interrupt mode, and must be having some issue with sharing the line,
> > > > causing the storm.
> > >
> > > The motherboard is an Asus "P7H55-M".
> > >
> > > Sorry, I should have mentioned that the dmesg output is from booting
> > > with :
> > > > >        hw.pci.enable_msix="0"
> > > > >        hw.pci.enable_msi="0"
> > >
> > > .. in "loader.conf".
> > >
> > > With those lines in "loader.conf", MSI and MSIX are disabled, both cards
> > > work like they should and there is no interrupt storm.
> > >
> > > With MSI/MSIX enabled, both cards work like they should and I see the
> > > counters of the MSI interrupts increase (in small amounts, like they
> > > should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.
> > >
> > > Because the only difference between disabling/enabling MSI/MSIX seems to
> > > be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according
> > > to the dmesg, I'm suspecting 'em1' is causing the storm.
> > > (But please correct me if I'm wrong :)
> > >
> > > What can I do to help track this problem down?
> > >
> > > > > According to "dmesg" the following devices share IRQ 16 :
> > > > >
> > > > > >        pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> > > > > >        em0: <Intel(R) PRO/1000 Network Connection 7.2.3>
> > > > > >           port 0xcc00-0xcc1f
> > > > > >           mem 0xf7de0000-0xf7dfffff,0xf7d00000-0xf7d7ffff,0xf7ddc000-0xf7ddffff
> > > > > >           irq 16 at device 0.0 on pci1
> > > > > >        vgapci0: <VGA-compatible display>
> > > > > >           port 0xbc00-0xbc07
> > > > > >           mem 0xf7800000-0xf7bfffff,0xe0000000-0xefffffff
> > > > > >           irq 16 at device 2.0 on pci0
> > > > > >        ehci0: <Intel PCH USB 2.0 controller USB-B>
> > > > > >           mem 0xf7cfa000-0xf7cfa3ff
> > > > > >           irq 16 at device 26.0 on pci0
> > > > > >        em1: <Intel(R) PRO/1000 Network Connection 7.2.3>
> > > > > >           port 0xec00-0xec1f
> > > > > >           mem 0xf7fe0000-0xf7ffffff,0xf7f00000-0xf7f7ffff,0xf7fdc000-0xf7fdffff
> > > > > >           irq 16 at device 0.0 on pci4
> > > > > >        pcib4: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
> > > > >
> > > > > > During a storm "vmstat -i" shows a rate of about 220.000
> > > > > > interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1'
> > > > > > seems to work correctly during a storm, as I see their counters
> > > > > > increase normally in the "vmstat -i" output.
> > > > > >
> > > > > > As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is
> > > > > > that the e1000 driver is causing this problem. Could it be that the
> > > > > > driver forgets to clear/mask legacy interrupts when attaching the
> > > > > > MSI interrupts perhaps?
> > > > >
> > > > > Any tips on how to debug and/or fix this?
> > > > >
> > > > >
> > > > > The full output of "dmesg" can be found here :
> > > > >        http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
> > > > >
> > > > > > And the full output of "pciconf -lv" is here :
> > > > > >        http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt
> > >
> > > Regards,
> > > --
> > > Daan Vreeken
> > > VEHosting
> > > http://VEHosting.nl
> > > tel: +31-(0)40-7113050 / +31-(0)6-46210825
> > > KvK nr: 17174380
>
>
> Regards,
> --
> Daan Vreeken
> VEHosting
> http://VEHosting.nl
> tel: +31-(0)40-7113050 / +31-(0)6-46210825
> KvK nr: 17174380
>
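
One way to see which IRQ a storm is on is to sort the "vmstat -i"
output by its rate column. What follows is only a rough sketch, not
anything taken from the driver or from the files linked above. It assumes
the usual three-column FreeBSD layout (interrupt, total, rate). The
function names, the 100000/sec threshold and the SAMPLE text are all made
up for illustration; only the ~220000/sec figure echoes the rate reported
in this thread.

```python
def parse_vmstat_i(text):
    """Parse 'vmstat -i' style text into (name, total, rate) tuples.

    Assumes three columns: interrupt name (may contain spaces),
    total count, and rate per second.
    """
    rows = []
    for line in text.splitlines():
        # Split from the right so names like "irq16: ehci0" stay intact.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # skips the header line
        name = name.strip()
        if name == "Total":
            continue  # skips the summary line
        rows.append((name, int(total), int(rate)))
    return rows


def storm_candidates(rows, threshold=100000):
    """Return rows whose rate is at or above threshold, highest first.

    The threshold is an arbitrary illustrative value, not a kernel limit.
    """
    hot = [row for row in rows if row[2] >= threshold]
    return sorted(hot, key=lambda row: row[2], reverse=True)


# Invented sample text, NOT the real output from the machine in this thread.
SAMPLE = """\
interrupt                          total       rate
irq16: ehci0                    26400000     220000
irq256: em0                         5120         42
irq257: em1                         3100         25
Total                           26408220     220067
"""

if __name__ == "__main__":
    for name, total, rate in storm_candidates(parse_vmstat_i(SAMPLE)):
        print(f"{name}: {rate}/sec (total {total})")
```

Feeding it the real output (for example by reading the text from a saved
file) would list only the lines whose rate is at or above the threshold.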
Received on Wed May 04 2011 - 20:15:45 UTC
