Re: NFS corruption on p4 machines (please test)

From: Lars Eggert <larse_at_ISI.EDU>
Date: Fri, 03 Oct 2003 11:55:42 -0700

Kris Kennaway wrote:

> On Fri, Oct 03, 2003 at 10:10:20AM -0700, Lars Eggert wrote:
> 
>>Kris,
>>
>>Kris Kennaway wrote:
>>
>>
>>>For some months now I have been experiencing NFS corruption on the
>>>three machines in the dosirak.kr package cluster - these are SMP
>>>Pentium 4 machines that run -CURRENT.  Setting DISABLE_PSE and
>>>DISABLE_PG_G does not fix these problems.  I am able to easily
>>>reproduce these problems using /usr/src/tools/regression/fsx on a
>>>loopback NFS mount - the failures are not deterministic, but a run
>>>blows up within about 8000 operations (less than a minute of
>>>operation).  In fact sometimes the corruption even manages to make
>>>fsx segfault, which is fairly impressive :)
>>>
>>>Just mount something rw via loopback NFS, and run 'fsx foo' on the NFS
>>>filesystem for a few minutes.
>>
>>I just ran an fsx cycle on my desktop machine over a TCP mount, and it
>>seemed to work fine:
> 
> 
> Thanks.  What hardware specs?

Attached.

Lars
-- 
Lars Eggert <larse_at_isi.edu>           USC Information Sciences Institute


cam: using minimum scsi_delay (100ms)
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #0: Tue Sep 30 10:11:59 PDT 2003
    root_at_nik.isi.edu:/usr/obj/usr/src/sys/KERNEL-1.31
Preloaded elf kernel "/boot/kernel/kernel" at 0xc06ed000.
Preloaded elf module "/boot/kernel/vesa.ko" at 0xc06ed21c.
Preloaded elf module "/boot/kernel/md.ko" at 0xc06ed2c8.
Preloaded elf module "/boot/kernel/linux.ko" at 0xc06ed370.
Preloaded elf module "/boot/kernel/if_gif.ko" at 0xc06ed41c.
Preloaded elf module "/boot/kernel/if_tun.ko" at 0xc06ed4c8.
Preloaded elf module "/boot/kernel/ipfw.ko" at 0xc06ed574.
Preloaded elf module "/boot/kernel/if_an.ko" at 0xc06ed620.
Preloaded elf module "/boot/kernel/wlan.ko" at 0xc06ed6cc.
Preloaded elf module "/boot/kernel/rc4.ko" at 0xc06ed778.
Preloaded elf module "/boot/kernel/pccard.ko" at 0xc06ed820.
Preloaded elf module "/boot/kernel/if_em.ko" at 0xc06ed8cc.
Preloaded elf module "/boot/kernel/if_fxp.ko" at 0xc06ed978.
Preloaded elf module "/boot/kernel/miibus.ko" at 0xc06eda24.
Preloaded elf module "/boot/kernel/if_lnc.ko" at 0xc06edad0.
Preloaded elf module "/boot/kernel/if_wi.ko" at 0xc06edb7c.
Preloaded elf module "/boot/kernel/if_xl.ko" at 0xc06edc28.
Preloaded elf module "/boot/kernel/snd_emu10k1.ko" at 0xc06edcd4.
Preloaded elf module "/boot/kernel/snd_pcm.ko" at 0xc06edd84.
Preloaded elf module "/boot/kernel/snd_es137x.ko" at 0xc06ede30.
Preloaded elf module "/boot/kernel/snd_ich.ko" at 0xc06edee0.
Preloaded elf module "/boot/kernel/snd_maestro3.ko" at 0xc06edf8c.
Preloaded elf module "/boot/kernel/ugen.ko" at 0xc06ee040.
Preloaded elf module "/boot/kernel/usb.ko" at 0xc06ee0ec.
Preloaded elf module "/boot/kernel/uhid.ko" at 0xc06ee194.
Preloaded elf module "/boot/kernel/ukbd.ko" at 0xc06ee240.
Preloaded elf module "/boot/kernel/ulpt.ko" at 0xc06ee2ec.
Preloaded elf module "/boot/kernel/ums.ko" at 0xc06ee398.
Preloaded elf module "/boot/kernel/umass.ko" at 0xc06ee440.
Preloaded elf module "/boot/kernel/umodem.ko" at 0xc06ee4ec.
Preloaded elf module "/boot/kernel/ucom.ko" at 0xc06ee598.
Preloaded elf module "/boot/kernel/bktr.ko" at 0xc06ee644.
Preloaded elf module "/boot/kernel/bktr_mem.ko" at 0xc06ee6f0.
Preloaded elf module "/boot/kernel/agp.ko" at 0xc06ee7a0.
Preloaded elf module "/boot/kernel/random.ko" at 0xc06ee848.
Preloaded elf module "/boot/kernel/ip_mroute.ko" at 0xc06ee8f4.
Preloaded elf module "/boot/kernel/ip6fw.ko" at 0xc06ee9a4.
Preloaded elf module "/boot/kernel/netgraph.ko" at 0xc06eea50.
Preloaded elf module "/boot/kernel/dummynet.ko" at 0xc06eeb00.
Preloaded elf module "/boot/kernel/radeon.ko" at 0xc06eebb0.
Preloaded elf module "/boot/kernel/r128.ko" at 0xc06eec5c.
Preloaded elf module "/boot/kernel/ahc.ko" at 0xc06eed08.
Preloaded elf module "/boot/kernel/mpt.ko" at 0xc06eedb0.
Preloaded elf module "/boot/kernel/fdc.ko" at 0xc06eee58.
Preloaded elf module "/boot/kernel/cbb.ko" at 0xc06eef00.
Preloaded elf module "/boot/kernel/exca.ko" at 0xc06eefa8.
Preloaded elf module "/boot/kernel/cardbus.ko" at 0xc06ef054.
Preloaded elf module "/boot/kernel/lpt.ko" at 0xc06ef100.
Preloaded elf module "/boot/kernel/ubsa.ko" at 0xc06ef1a8.
Preloaded elf module "/boot/kernel/firewire.ko" at 0xc06ef254.
Preloaded elf module "/boot/kernel/sbp.ko" at 0xc06ef304.
Preloaded elf module "/boot/kernel/smbus.ko" at 0xc06ef3ac.
Preloaded elf module "/boot/kernel/intpm.ko" at 0xc06ef458.
Preloaded elf module "/boot/kernel/smb.ko" at 0xc06ef504.
Preloaded elf module "/boot/kernel/iicbus.ko" at 0xc06ef5ac.
Preloaded elf module "/boot/kernel/iic.ko" at 0xc06ef658.
Preloaded elf module "/boot/kernel/iicsmb.ko" at 0xc06ef700.
Preloaded elf module "/boot/kernel/uart.ko" at 0xc06ef7ac.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc06ef858.
Timecounter "i8254" frequency 1193121 Hz quality 0
CPU: Intel(R) XEON(TM) CPU 2.40GHz (2372.81-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf24  Stepping = 4
  Features=0x3febfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM>
  Hyperthreading: 2 logical CPUs
real memory  = 1073180672 (1023 MB)
avail memory = 1034924032 (986 MB)
Changing APIC ID for IO APIC #0 from 0 to 4 on chip
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00050014, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00050014, at 0xfee00000
 cpu2 (AP):  apic id:  2, version: 0x00050014, at 0xfee00000
 cpu3 (AP):  apic id:  3, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id:  4, version: 0x00178020, at 0xfec00000
bktr_mem: memory holder loaded
Pentium Pro MTRR support enabled
VESA: v2.0, 65536k memory, flags:0x1, mode table:0xc0410a22 (1000022)
VESA: ATI RADEON RV250
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <DELL   WS 530 > on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 12 entries at 0xc00fb9a0
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_cpu0: <CPU> on acpi0
acpi_cpu1: <CPU> on acpi0
acpi_cpu2: <CPU> on acpi0
acpi_cpu3: <CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
IOAPIC #0 intpin 19 -> irq 2
IOAPIC #0 intpin 17 -> irq 13
IOAPIC #0 intpin 23 -> irq 16
agp0: <Intel 82860 host to AGP bridge> mem 0xe8000000-0xefffffff at device 0.0 on pci0
pcib1: <PCIBIOS PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
IOAPIC #0 intpin 16 -> irq 17
drm0: <ATI Radeon If R250 9000> port 0xec00-0xecff mem 0xff8f0000-0xff8fffff,0xf4000000-0xf7ffffff irq 17 at device 0.0 on pci1
info: [drm] AGP at 0xe8000000 128MB
info: [drm] Initialized radeon 1.9.0 20020828 on minor 0
pci1: <display> at device 0.1 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 31.0 on pci2
pci3: <ACPI PCI bus> on pcib3
IOAPIC #0 intpin 20 -> irq 18
IOAPIC #0 intpin 21 -> irq 19
pci3: <base peripheral, interrupt controller> at device 0.0 (no driver attached)
mpt0: <LSILogic 1030 Ultra4 Adapter> port 0xdc00-0xdcff mem 0xff6a0000-0xff6bffff,0xff6c0000-0xff6dffff irq 18 at device 12.0 on pci3
mpt1: <LSILogic 1030 Ultra4 Adapter> port 0xd800-0xd8ff mem 0xff660000-0xff67ffff,0xff680000-0xff69ffff irq 19 at device 12.1 on pci3
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.16> mem 0xff6e0000-0xff6effff,0xff640000-0xff65ffff irq 19 at device 13.0 on pci3
em0:  Speed:1000 Mbps  Duplex:Full
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci4: <ACPI PCI bus> on pcib4
IOAPIC #0 intpin 18 -> irq 20
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xcc80-0xccff mem 0xff3ffc00-0xff3ffc7f irq 16 at device 11.0 on pci4
xl0: Ethernet address: 00:06:5b:bd:ee:48
miibus0: <MII bus> on xl0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fwohci0: <Texas Instruments TSB12LV26> mem 0xff3f8000-0xff3fbfff,0xff3ff000-0xff3ff7ff irq 17 at device 12.0 on pci4
fwohci0: [MPSAFE]
fwohci0: OHCI version 1.0 (ROM=0)
fwohci0: No. of Isochronous channel is 4.
fwohci0: EUI64 80:5b:06:00:48:ee:bd:00
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
sbp0: <SBP2/SCSI over firewire> on firewire0
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
em1: <Intel(R) PRO/1000 Network Connection, Version - 1.7.16> port 0xcc40-0xcc7f mem 0xff3a0000-0xff3bffff,0xff3c0000-0xff3dffff irq 13 at device 13.0 on pci4
em1:  Speed:N/A  Duplex:N/A
pcm0: <Creative EMU10K1> port 0xcc20-0xcc3f irq 20 at device 14.0 on pci4
pcm0: <Cirrus Logic CS4297A AC97 Codec>
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH2 UDMA100 controller> port 0xffa0-0xffaf at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
uhci0: <Intel 82801BA/BAM (ICH2) USB controller USB-A> port 0xff80-0xff9f irq 2 at device 31.2 on pci0
usb0: <Intel 82801BA/BAM (ICH2) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1: NMB Dell USB Keyboard Hub, class 9/0, rev 1.10/0.01, addr 2
uhub1: 3 ports with 2 removable, bus powered
ukbd0: NMB Dell USB 7HK Keyboard, rev 1.10/0.01, addr 3, iclass 3/1
kbd0 at ukbd0
ums0: Logitech USB Receiver, rev 1.10/9.10, addr 4, iclass 3/1
ums0: 5 buttons and Z dir.
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
uhci1: <Intel 82801BA/BAM (ICH2) USB controller USB-B> port 0xff60-0xff7f irq 16 at device 31.4 on pci0
usb1: <Intel 82801BA/BAM (ICH2) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
ppc0 port 0x778-0x77f,0x378-0x37f irq 7 drq 1 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
orm0: <Option ROMs> at iomem 0xd2800-0xd3fff,0xd1000-0xd27ff,0xcd000-0xd0fff,0xc0000-0xccfff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2

Timecounters tick every 10.000 msec
ipfw2 initialized, divert disabled, rule-based forwarding enabled, default to deny, logging disabled
IPv6 packet filtering initialized, logging disabled
IPsec: Initialized Security Association Processing.
GEOM: create disk ad0 dp=0xc64a8770
ad0: 78167MB <Maxtor 4D080H4> [158816/16/63] at ata0-master UDMA100
acd0: CDRW <PHILIPS DVD+RW-D28> at ata1-master UDMA33
GEOM: create disk da0 dp=0xc64bbc50
GEOM: create disk da1 dp=0xc64bb450
da0 at mpt1 bus 0 target 0 lun 0
da0: <SEAGATE ST336732LW 2223> Fixed Direct Access SCSI-3 device 
da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da1 at mpt1 bus 0 target 1 lun 0
da1: <SEAGATE ST336732LW 2223> Fixed Direct Access SCSI-3 device 
da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
(cd0:ata1:0:0:0): Recovered Sense
(cd0:ata1:0:0:0): READ CD RECORDED CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0 
(cd0:ata1:0:0:0): CAM Status: SCSI Status Error
(cd0:ata1:0:0:0): SCSI Status: Check Condition
(cd0:ata1:0:0:0): NOT READY asc:3a,0
(cd0:ata1:0:0:0): Medium not present
cd0 at ata1 bus 0 target 0 lun 0
cd0: <PHILIPS DVD+RW-D28 1.62> Removable CD-ROM SCSI-0 device 
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
Mounting root from ufs:/dev/da0s1a
em1: Link is up 1000 Mbps Full Duplex
em1: Link is up 1000 Mbps Full Duplex
em1: Link is up 1000 Mbps Full Duplex
em1: Link is up 1000 Mbps Full Duplex
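
Since the question in the thread was about hardware specs, and the
original report concerned SMP Pentium 4 machines, the CPU-related lines
of a boot log like the one above can be pulled out with a small script
for side-by-side comparison. This is only a rough sketch: the function
name summarize_cpu, the dictionary keys, and the SAMPLE excerpt (with
the Features flag list shortened) are illustrative assumptions, not
part of any FreeBSD tool, and the patterns are matched only against the
message formats that appear in this particular log.

```python
import re

# Excerpt of the boot log above; the Features flag list is shortened here.
SAMPLE = """\
CPU: Intel(R) XEON(TM) CPU 2.40GHz (2372.81-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf24  Stepping = 4
  Features=0x3febfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,SSE,SSE2,SS,HTT,TM>
  Hyperthreading: 2 logical CPUs
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
acpi_cpu0: <CPU> on acpi0
"""


def summarize_cpu(dmesg: str) -> dict:
    """Collect CPU-related facts from FreeBSD boot messages (hypothetical helper)."""
    summary = {"model": None, "htt": False, "ht_logical_cpus": None, "cpus": None}
    for raw in dmesg.splitlines():
        line = raw.strip()
        if line.startswith("CPU:"):
            # Everything after the "CPU:" tag is the model string.
            summary["model"] = line[len("CPU:"):].strip()
        elif line.startswith("Features="):
            # Feature flags are the comma-separated names inside <...>.
            match = re.search(r"<([^>]*)>", line)
            flags = match.group(1).split(",") if match else []
            summary["htt"] = "HTT" in flags
        elif line.startswith("Hyperthreading:"):
            match = re.search(r"(\d+) logical CPUs", line)
            if match:
                summary["ht_logical_cpus"] = int(match.group(1))
        elif "Multiprocessor System Detected" in line:
            match = re.search(r"(\d+) CPUs", line)
            if match:
                summary["cpus"] = int(match.group(1))
    return summary


if __name__ == "__main__":
    for key, value in summarize_cpu(SAMPLE).items():
        print(f"{key}: {value}")
```

Running the same function over the dmesg of an affected machine and of
an unaffected one gives a quick way to see whether they differ in CPU
model, hyperthreading, or CPU count.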


Received on Fri Oct 03 2003 - 09:55:50 UTC
