Re: [kern/68351] bge0 watchdog timeout on 5.2.1 and -current, 5.1 is ok

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Tue, 29 Jun 2004 11:58:19 -0400

On Monday 28 June 2004 01:32 pm, Vadim Mikhailov wrote:
> Hi,
>
> I have a Dell PowerEdge 1750 server with 2 Xeon 3.0 GHz CPUs, 4 GB RAM and
> 2 onboard gigabit ethernet ports:
>
> bge0: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem
> 0xfcd20000-0xfcd2ffff,0xfcd30000-0xfcd3ffff irq 17 at device 0.0 on pci2
> bge1: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem
> 0xfcd00000-0xfcd0ffff,0xfcd10000-0xfcd1ffff irq 18 at device 0.1 on pci2
>
> Only bge0 is used, with jumbo frames (my gigabit switch, a PowerConnect
> 5224, supports them):
>
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 9000
>     options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
>     inet 172.xx.xx.xx netmask 0xfffff800 broadcast 172.xx.xx.255
>     ether 00:06:5b:ef:63:e6
>     media: Ethernet autoselect (1000baseTX <full-duplex>)
>     status: active
>
> This box has two dual-port SCSI adapters:
>
> mpt0: <LSILogic 1030 Ultra4 Adapter> port 0xbc00-0xbcff mem
> 0xfcb20000-0xfcb2ffff,0xfcb30000-0xfcb3ffff irq 13 at device 5.0 on pci4
> mpt1: <LSILogic 1030 Ultra4 Adapter> port 0xb800-0xb8ff mem
> 0xfcb00000-0xfcb0ffff,0xfcb10000-0xfcb1ffff irq 16 at device 5.1 on pci4
> ahc0: <Adaptec 3960D Ultra160 SCSI adapter> port 0xdc00-0xdcff mem
> 0xfcf01000-0xfcf01fff irq 19 at device 4.0 on pci1
> ahc1: <Adaptec 3960D Ultra160 SCSI adapter> port 0xd800-0xd8ff mem
> 0xfcf00000-0xfcf00fff irq 20 at device 4.1 on pci1
>
> Each adapter has disks attached to it. Firmware on the motherboard and all
> peripheral devices is upgraded to the very latest versions from Dell.
> This setup works more or less OK under FreeBSD 5.1-RELEASE-p8 (GENERIC
> kernel with SMP enabled), but once every month or two the machine reboots
> under load, so I want to upgrade it to 5.2.1-RELEASE.
> But when I boot a 5.2.1-RELEASE or later kernel (-current) on this box,
> the network adapter locks up.
> I see these messages on the console and in the logs:
>
> Jun 25 15:25:22 vortex kernel: bge0: watchdog timeout -- resetting
>
> If I do "ifconfig bge0 down up", the network becomes available for a few
> seconds and then the machine is not pingable again. I ran "systat -v" and
> noticed that ping stops working exactly when I see any interrupt coming to
> mpt or ahc (i.e. on any disk activity).
>
> One visible difference between 5.1 (where it works) and 5.2.1/current
> (where it doesn't) is that interrupts to PCI devices are assigned
> differently:
>
> IRQ map under 5.1: mpt0 13, mpt1 16, bge0 17, bge1 18, ahc0 19, ahc1 20,
>   and under 5.2.1: mpt0 18, mpt1 19, bge0 16, bge1 17, ahc0 20, ahc1 21.

The numbers mean different things under 5.1 and 5.2.1.  Can you try booting a 
kernel from a recent snapshot of current to see if current works better?  
There have been various APIC and ACPI fixes since 5.2.1.
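
For anyone collecting this information from several boots, the device-to-IRQ
assignments can be pulled out of the dmesg attach lines mechanically. The
following is only a rough Python sketch: it assumes attach lines of the form
quoted above with wrapped lines already joined, and the names irq_map and
shared_irqs are made up for illustration. Because the numbers mean different
things under 5.1 and 5.2.1, it is meant for spotting devices that share an
interrupt within one boot, not for comparing numbers across releases.

```python
import re
from collections import defaultdict

# Matches attach lines such as:
#   bge0: <Broadcom ...> mem 0x... irq 17 at device 0.0 on pci2
# Assumption: each attach message is on a single line (wrapped dmesg
# output has been joined beforehand).
ATTACH_RE = re.compile(r"^([a-z]+\d+): <[^>]*> .*\birq (\d+) at device")


def irq_map(dmesg_text):
    """Return {device_name: irq} for every attach line found."""
    result = {}
    for line in dmesg_text.splitlines():
        m = ATTACH_RE.match(line.strip())
        if m:
            result[m.group(1)] = int(m.group(2))
    return result


def shared_irqs(mapping):
    """Return {irq: [devices]} for IRQs used by more than one device."""
    by_irq = defaultdict(list)
    for dev, irq in mapping.items():
        by_irq[irq].append(dev)
    return {irq: sorted(devs) for irq, devs in by_irq.items() if len(devs) > 1}


# Sample input: four of the 5.1 attach lines from the report, joined.
SAMPLE = "\n".join([
    "bge0: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem "
    "0xfcd20000-0xfcd2ffff,0xfcd30000-0xfcd3ffff irq 17 at device 0.0 on pci2",
    "bge1: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem "
    "0xfcd00000-0xfcd0ffff,0xfcd10000-0xfcd1ffff irq 18 at device 0.1 on pci2",
    "mpt0: <LSILogic 1030 Ultra4 Adapter> port 0xbc00-0xbcff mem "
    "0xfcb20000-0xfcb2ffff,0xfcb30000-0xfcb3ffff irq 13 at device 5.0 on pci4",
    "ahc0: <Adaptec 3960D Ultra160 SCSI adapter> port 0xdc00-0xdcff mem "
    "0xfcf01000-0xfcf01fff irq 19 at device 4.0 on pci1",
])

if __name__ == "__main__":
    mapping = irq_map(SAMPLE)
    print(mapping)
    print(shared_irqs(mapping))
```

Running the same functions over the dmesg from each kernel would show whether
bge0 ends up on the same interrupt as mpt or ahc on the kernels that fail.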

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Tue Jun 29 2004 - 14:08:12 UTC
