Re: msk watchdog timeout

From: Koen Martens <gmc_at_sonologic.nl>
Date: Wed, 15 Oct 2008 14:22:34 +0200
On Thu, Oct 04, 2007 at 10:13:48AM +0900, Pyun YongHyeon wrote:
> On Wed, Oct 03, 2007 at 01:31:32PM +0800, Kudo Chien wrote:
>  > > Thanks for testing. Would you submit a PR for the issue and assign
>  > > it to me? I'll let you know when I manage to find a clue.
>  > >
>  > > OK. I've submitted a PR at
>  > http://www.freebsd.org/cgi/query-pr.cgi?pr=116853.
>  > Thank you.
>  > 
> 
> I've grabbed it. Thanks.

For what it's worth, I've been having instability issues with msk0
too. Included are dmesg, vmstat -i, and pciconf output.

The problem occurs under load (rsyncing tens of gigabytes over a
gigabit link, for example). I tried configuring the switch port
down to 100 Mbit/s, in the hope that msk0 would be more stable. It
is, but it still goes down after a while with watchdog timeouts.

I am now running it with MSI disabled, and it appears to last longer
than before. But judging by what others have already said on this
subject, it might still go wrong after as much as a month.
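
For anyone wanting to try the same thing, here is a minimal sketch of
how MSI is typically disabled for msk(4). It assumes the
hw.msk.msi_disable loader tunable exists on your release; check the
msk(4) manual page first, since the tunable name is an assumption here
and not something confirmed earlier in this thread.

```conf
# /boot/loader.conf
# Assumption: the msk(4) driver on this release honours this tunable.
# A reboot is needed for it to take effect.
hw.msk.msi_disable="1"
```

With MSI off, the controller falls back to a legacy interrupt line,
which it may have to share with other devices.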

Also, I never had these problems when the machine was still
on 6.x with the myk driver. The trouble started only after I upgraded
it to RELENG_7 this Tuesday.

This is a server that I need to put back into production. I could
give you some time on it before I do that, but that would have to be
*right now*, so I guess that won't really work out.

I'll probably install a NIC to use instead of the built-in
Yukon interface, to get back the required stability.

Best,

Koen



postel# dmesg
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #0: Tue Oct 14 11:55:05 CEST 2008
    root@postel.issuecrawler.net:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.00GHz (2992.52-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf4a  Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNXT-ID,CX16,xTPR>
  AMD Features=0x20000000<LM>
  AMD Features2=0x1<LAHF>
real memory  = 3757965312 (3583 MB)
avail memory = 3673931776 (3503 MB)
ACPI APIC Table: <A M I  OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <A M I OEMRSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, dff00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <unknown> at device 0.1 (no driver attached)
pci0: <base peripheral> at device 1.0 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
pci3: <ACPI PCI bus> on pcib3
3ware device driver for 9000 series storage controllers, version: 3.70.05.001
twa0: <3ware 9000 series Storage Controller> port 0xcc80-0xccbf mem 0xfa000000-0xfbffffff,0xfceff000-0xfcefffff irq 48 at device 1.0 on pci3
twa0: [ITHREAD]
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-4LP, 4 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 5.0 on pci0
pci5: <ACPI PCI bus> on pcib5
mskc0: <Marvell Yukon 88E8050 Gigabit Ethernet> port 0xdc00-0xdcff mem 0xfcffc000-0xfcffffff irq 16 at device 0.0 on pci5
mskc0: Unexpected number of MSI messages : 0
msk0: <Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02> on mskc0
msk0: Ethernet address: 00:04:23:d1:4a:fd
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E1111 Gigabit PHY> PHY 0 on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [FILTER]
pcib6: <ACPI PCI-PCI bridge> irq 16 at device 6.0 on pci0
pci6: <ACPI PCI bus> on pcib6
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xb880-0xb89f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xbc00-0xbc1f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xbc80-0xbc9f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xfccfec00-0xfccfefff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 6 ports with 6 removable, self powered
umass0: <TOSHIBA TOSHIBA USB 3.5"-HDD, class 0/0, rev 2.00/1.03, addr 2> on uhub3
pcib7: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci7: <ACPI PCI bus> on pcib7
em0: <Intel(R) PRO/1000 Network Connection 6.9.5> port 0xec80-0xecbf mem 0xfebe0000-0xfebfffff irq 16 at device 4.0 on pci7
em0: [FILTER]
em0: Ethernet address: 00:04:23:d1:4a:fc
vgapci0: <VGA-compatible display> port 0xe800-0xe8ff mem 0xfd000000-0xfdffffff,0xfebdb000-0xfebdbfff irq 17 at device 12.0 on pci7
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
cpu0: <ACPI CPU> on acpi0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
p4tcc1: <CPU Frequency Thermal Control> on cpu1
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xca7ff,0xca800-0xcb7ff,0xcb800-0xcc7ff,0xcc800-0xcdfff pnpid ORM0000 on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-N/1.AA> at ata0-master UDMA33
da0 at twa0 bus 0 target 0 lun 0
da0: <AMCC 9550SX-4LP DISK 3.04> Fixed Direct Access SCSI-3 device 
da0: 100.000MB/s transfers
da0: 2097151MB (4294967295 512 byte sectors: 255H 63S/T 267349C)
da1 at twa0 bus 0 target 0 lun 1
da1: <AMCC 9550SX-4LP DISK 3.04> Fixed Direct Access SCSI-3 device 
da1: 100.000MB/s transfers
da1: 763840MB (1564344321 512 byte sectors: 255H 63S/T 97375C)
SMP: AP CPU #1 Launched!
da2 at umass-sim0 bus 0 target 0 lun 0
da2: <WDC WD75 00AAKS-00RBA0 > Fixed Direct Access SCSI-2 device 
da2: 40.000MB/s transfers
da2: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
GEOM_LABEL: Label for provider da2s1 is msdosfs/USB-HDD.
Trying to mount root from ufs:/dev/da0s1a
em0: link state changed to UP
msk0: link state changed to UP
twa0: INFO: (0x04: 0x000C): Initialize started: unit=0



postel# vmstat -i
interrupt                          total       rate
irq14: ata0                           58          0
irq16: mskc0 em0+                3217531       1586
irq23: ehci0                         178          0
irq48: twa0                       448723        221
cpu0: timer                      4036589       1990
cpu1: timer                      4032605       1988
Total                           11735684       5786
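
The irq16 line above lists more than one handler (mskc0 and em0, with
the trailing "+" indicating further names that did not fit), so msk0
is sharing a legacy interrupt line. A small sketch for picking such
lines out of vmstat -i output follows; the function name shared_irqs
is made up for illustration and is not an existing tool.

```sh
# Print vmstat -i lines whose interrupt source lists more than one handler.
# Reads vmstat -i output on stdin.
shared_irqs() {
    awk '/^irq[0-9]+:/ {
        name = $0
        # Drop the trailing "total" and "rate" columns.
        sub(/[ \t]+[0-9]+[ \t]+[0-9]+[ \t]*$/, "", name)
        n = split(name, parts, /[ \t]+/)
        # parts[1] is the "irqN:" label; a second handler name or a
        # trailing "+" (truncated name list) indicates a shared line.
        if (n > 2 || name ~ /\+$/) print name
    }'
}
```

Usage would be along the lines of: vmstat -i | shared_irqs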



postel# pciconf -lc
hostb0@pci0:0:0:0:      class=0x060000 card=0x34398086 chip=0x35908086 rev=0x0c hdr=0x00
    cap 09[40] = vendor (length 5) Intel cap 4 version 1
none0@pci0:0:0:1:       class=0xff0000 card=0x34398086 chip=0x35918086 rev=0x0c hdr=0x00
none1@pci0:0:1:0:       class=0x088000 card=0x34398086 chip=0x35948086 rev=0x0c hdr=0x00
    cap 05[b0] = MSI supports 2 messages 
pcib1@pci0:0:2:0:       class=0x060400 card=0x00000000 chip=0x35958086 rev=0x0c hdr=0x01
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 05[58] = MSI supports 2 messages 
    cap 10[64] = PCI-Express 1 root port
pcib4@pci0:0:4:0:       class=0x060400 card=0x00000000 chip=0x35978086 rev=0x0c hdr=0x01
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 05[58] = MSI supports 2 messages 
    cap 10[64] = PCI-Express 1 root port
pcib5@pci0:0:5:0:       class=0x060400 card=0x00000000 chip=0x35988086 rev=0x0c hdr=0x01
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 05[58] = MSI supports 2 messages 
    cap 10[64] = PCI-Express 1 root port
pcib6@pci0:0:6:0:       class=0x060400 card=0x00000000 chip=0x35998086 rev=0x0c hdr=0x01
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 05[58] = MSI supports 2 messages 
    cap 10[64] = PCI-Express 1 root port
uhci0@pci0:0:29:0:      class=0x0c0300 card=0x34398086 chip=0x24d28086 rev=0x02 hdr=0x00
uhci1@pci0:0:29:1:      class=0x0c0300 card=0x34398086 chip=0x24d48086 rev=0x02 hdr=0x00
uhci2@pci0:0:29:2:      class=0x0c0300 card=0x34398086 chip=0x24d78086 rev=0x02 hdr=0x00
ehci0@pci0:0:29:7:      class=0x0c0320 card=0x34398086 chip=0x24dd8086 rev=0x02 hdr=0x00
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
pcib7@pci0:0:30:0:      class=0x060400 card=0x00000000 chip=0x244e8086 rev=0xc2 hdr=0x01
isab0@pci0:0:31:0:      class=0x060100 card=0x00000000 chip=0x24d08086 rev=0x02 hdr=0x00
atapci0@pci0:0:31:1:    class=0x01018a card=0x34398086 chip=0x24db8086 rev=0x02 hdr=0x00
none2@pci0:0:31:3:      class=0x0c0500 card=0x34398086 chip=0x24d38086 rev=0x02 hdr=0x00
pcib2@pci0:1:0:0:       class=0x060400 card=0x00000000 chip=0x03298086 rev=0x09 hdr=0x01
    cap 10[44] = PCI-Express 1 PCI bridge
    cap 05[5c] = MSI supports 1 message, 64 bit 
    cap 01[6c] = powerspec 2  supports D0 D3  current D0
    cap 07[d8] = PCI-X bridge supports
ioapic0@pci0:1:0:1:     class=0x080020 card=0x34398086 chip=0x03268086 rev=0x09 hdr=0x00
    cap 10[44] = PCI-Express 1 endpoint
    cap 01[6c] = powerspec 2  supports D0 D3  current D0
pcib3@pci0:1:0:2:       class=0x060400 card=0x00000000 chip=0x032a8086 rev=0x09 hdr=0x01
    cap 10[44] = PCI-Express 1 PCI bridge
    cap 05[5c] = MSI supports 1 message, 64 bit 
    cap 01[6c] = powerspec 2  supports D0 D3  current D0
    cap 07[d8] = PCI-X bridge supports
ioapic1@pci0:1:0:3:     class=0x080020 card=0x34398086 chip=0x03278086 rev=0x09 hdr=0x00
    cap 10[44] = PCI-Express 1 endpoint
    cap 01[6c] = powerspec 2  supports D0 D3  current D0
twa0@pci0:3:1:0:        class=0x010400 card=0x100313c1 chip=0x100313c1 rev=0x00 hdr=0x00
    cap 07[e0] = PCI-X 64-bit supports 133MHz, 512 burst read, 3 split transactions
    cap 01[e8] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[f0] = MSI supports 8 messages, 64 bit 
mskc0@pci0:5:0:0:       class=0x020000 card=0x34398086 chip=0x436111ab rev=0x18 hdr=0x00
    cap 01[48] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 03[50] = VPD
    cap 05[5c] = MSI supports 2 messages, 64 bit 
    cap 10[e0] = PCI-Express 1 legacy endpoint
em0@pci0:7:4:0:         class=0x020000 card=0x34398086 chip=0x10768086 rev=0x05 hdr=0x00
    cap 01[dc] = powerspec 2  supports D0 D3  current D0
    cap 07[e4] = PCI-X supports 2048 burst read, 1 split transaction
vgapci0@pci0:7:12:0:    class=0x030000 card=0x34398086 chip=0x47521002 rev=0x27 hdr=0x00
    cap 01[5c] = powerspec 2  supports D0 D1 D2 D3  current D0

-- 
K.F.J. Martens, Sonologic, http://www.sonologic.nl/
Databases, wiki-expertise, hosting, server- en infrabeheer.
Public PGP key: http://www.metro.cx/pubkey-gmc.asc
Received on Wed Oct 15 2008 - 11:02:30 UTC
