iflib_timer hits hung label; never recovers

From: Eric van Gyzen <eric_at_vangyzen.net>
Date: Fri, 12 Oct 2018 10:01:33 -0500

My firewall is running head at r338402 (30 Aug).  It has three I211 NICs 
(PCI dev 0x1539).  About 24 hours ago, it said:

Oct 11 22:29:03 asbestos kernel: igb1: TX(1) desc avail = 42, pidx = 524
Oct 11 22:29:03 asbestos kernel: Link state changed to down
Oct 11 22:29:03 asbestos kernel: core: link state changed to DOWN

It keeps saying this periodically:

Oct 12 09:46:05 asbestos kernel: igb1: TX(1) desc avail = 1024, pidx = 0

$ dmesg | uniq -c
2455 igb1: TX(1) desc avail = 1024, pidx = 0
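
For anyone who wants to tally these messages per interface and queue from a saved log, here is a minimal sketch. It assumes only the message format shown in the excerpts above; the function name `tally_hung_messages` and the regular expression are illustrative, not part of any FreeBSD tool. Unlike `uniq -c`, it also counts syslog lines whose timestamps differ.

```python
import re
from collections import Counter

# Matches lines such as "igb1: TX(1) desc avail = 1024, pidx = 0",
# with or without a leading syslog timestamp and hostname.
# The pattern is an assumption based on the log excerpts in this message.
HUNG_RE = re.compile(
    r"(?P<ifname>[a-z]+\d+): TX\((?P<queue>\d+)\) "
    r"desc avail = (?P<avail>\d+), pidx = (?P<pidx>\d+)"
)


def tally_hung_messages(lines):
    """Count matching messages per (interface, queue, avail, pidx)."""
    counts = Counter()
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            key = (m["ifname"], int(m["queue"]), int(m["avail"]), int(m["pidx"]))
            counts[key] += 1
    return counts


# Sample input copied from the log excerpts above.
sample = [
    "Oct 11 22:29:03 asbestos kernel: igb1: TX(1) desc avail = 42, pidx = 524",
    "Oct 11 22:29:03 asbestos kernel: Link state changed to down",
    "Oct 12 09:46:05 asbestos kernel: igb1: TX(1) desc avail = 1024, pidx = 0",
    "igb1: TX(1) desc avail = 1024, pidx = 0",
]

for key, n in tally_hung_messages(sample).items():
    print(n, key)
```

Feeding it the contents of /var/log/messages instead of `sample` would show whether any queue other than TX(1) ever reported a hang.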

I can panic the box and get a vmcore, but what other information should 
I get before then?  I tried to attach kgdb to the running kernel, but it 
failed.  :(

I grabbed sysctl dev.igb.1 and dropped it here:

http://vangyzen.net/FreeBSD/igb.hang/

I haven't tried manually recovering with ifconfig because I want to 
diagnose why the driver couldn't do it automatically.  I imagine it's 
hard to test this code path.  :)

Eric
Received on Fri Oct 12 2018 - 13:01:37 UTC
