On Thursday 05 May 2011 22:22:15 Jack Vogel wrote:
> On Thu, May 5, 2011 at 1:17 PM, Daan Vreeken <Daan_at_vehosting.nl> wrote:
> > Hi Peter,
> >
> > On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
> > > On 2011-May-05 13:22:59 +0200, Daan Vreeken <Daan_at_vehosting.nl> wrote:
> > > >Not yet. I'll reboot the machine later today when I have physical
> > > >access to it to check the BIOS version. I'll keep you informed as
> > > >soon as I get another storm going.
> > >
> > > Depending on the quality of your BIOS (competence of the vendor), you
> > > might find that kenv(8) reports the BIOS version without needing a
> > > reboot.
> > > (Look at smbios.bios.* in the output).
> > ...
> > smbios.bios.version="0303 "
> > ...
> > Version "0402" is the latest and greatest, so it's time to upgrade.
> > According to Asus it "Improves system stability", so let's see if this
> > 'cures' IRQ 16.
>
> Cool, thanks for the update! Good luck.

I've updated the BIOS and let the machine run for a couple of hours with
MSI/MSIX enabled. After 3 hours of uptime I see the storm again. Here are the
first couple of lines of output of "top -S":

last pid: 33218;  load averages:  0.47,  0.35,  0.33   up 0+03:52:10  16:42:52
317 processes: 6 running, 289 sleeping, 22 waiting
CPU:  0.4% user,  0.0% nice,  0.5% system, 11.6% interrupt, 87.5% idle
Mem: 280M Active, 176M Inact, 1797M Wired, 8572K Cache, 32M Buf, 5545M Free
Swap: 500M Total, 500M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root        4 171 ki31     0K    64K CPU0    0 893:17 351.95% idle
   12 root       23 -80    -     0K   368K WAIT    2  18:37  50.39% intr

One core is spending half its time handling interrupts. /var/log/messages
doesn't show any new messages since the storm started.
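
As a quick sanity check on the "half its time" figure: the CPU: line in top
is averaged over all CPUs, while WCPU is expressed relative to a single core.
The sketch below converts one into the other. It assumes 4 cores (inferred
from the 4 threads of the idle process, not confirmed elsewhere), and the
function name is only illustrative.

```python
def one_core_equivalent(aggregate_pct: float, ncpu: int) -> float:
    """Convert an all-CPU-average percentage into percent of a single core."""
    return aggregate_pct * ncpu


NCPU = 4              # assumption: inferred from the 4 threads of the idle process
INTERRUPT_PCT = 11.6  # "interrupt" value from the CPU: line of top -S

share = one_core_equivalent(INTERRUPT_PCT, NCPU)
print(f"interrupt load is roughly {share:.1f}% of one core")
```

That gives about 46% of one core, which is in the same range as the 50.39%
WCPU reported for the intr process.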
"vmstat -i" now shows : # vmstat -i interrupt total rate irq3: uart1 917384 63 --> irq16: ehci0 809547235 55608 irq23: ehci1 1751385 120 cpu0:timer 16380717 1125 irq256: em0:rx 0 1651907 113 irq257: em0:tx 0 1495708 102 irq258: em0:link 3 0 irq259: em1:rx 0 397227 27 irq260: em1:tx 0 257865 17 irq261: em1:link 6 0 irq262: re0 10549 0 irq263: ahci0 290926 19 cpu1:timer 1160008 79 cpu3:timer 763939 52 cpu2:timer 4120133 283 irq272: hdac0 819282 56 Total 839564274 57670 Apart from spending far too much time handling interrupts, the machine works fine, so I'll let it run in case anyone wants me to try something on it. As a next step to try to isolate the problem I could create a kernel with MSI/MSIX enabled, but with a modified 'em' driver so it doesn't try to attach the MSI/MSIX interrupts to see if the problem is really related to the network cards or not. If anyone has a better idea, I'm all ears :) Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380Received on Fri May 06 2011 - 13:02:49 UTC