I built world after cvsup'ing -CURRENT this morning and am still having the same ATA READ_DMA hangs that started in early October on my system. I can repeat the hangs at will; the machine serves as an Amanda server, and launching a backup for itself plus 3 client machines is guaranteed to trigger it:

ad0: TIMEOUT - READ_DMA retrying (2 retries left)
ata0: resetting devices ..
ad0: FAILURE - already active DMA on this device
ad0: setting up DMA failed

When this happens, the system is effectively dead until I reset it. I can run for days on end by booting with DMA disabled, but that's not really my ideal long-term solution as it slows the system to a crawl.

The drive in question is a Western Digital WD1200JB-00DUA3 (Caviar 120GB special edition) attached to an Asus P3V4X (Via chipset) motherboard. The combination has worked perfectly from the server's 4.8-STABLE days, through 5.0, and up until the last two months when I started experiencing this immediately after an upgrade.

Kernel config is essentially "GENERIC" with the older CPU types and WITNESS* and INVARIANT* options commented out, and with the SYS-V IPC settings recommended by PostgreSQL added. Build flags are very conservative: "CFLAGS= -O -pipe".

sysutils/smartctl reports:

SMART overall-health self-assessment test result: PASSED

Basically, I'm about 99% sure that this hardware is OK. It worked right up to a big ATAng commit, then stopped working right immediately afterward. Does anybody have any suggestions of how I can run my machine in UDMA33/66 mode for more than a couple of hours without freezing?

Below is the dmesg. I didn't want to stick it in the middle of my post:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 5.2-CURRENT #1: Thu Dec 11 14:13:32 CST 2003
    root@kanga.honeypot.net:/usr/obj/usr/src/sys/KANGA
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a7d000.
Preloaded elf module "/boot/kernel/linprocfs.ko" at 0xc0a7d1f4.
Preloaded elf module "/boot/kernel/linux.ko" at 0xc0a7d2a4.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0a7d350.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (936.74-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x683  Stepping = 3
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 805289984 (767 MB)
avail memory = 772505600 (736 MB)
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <ASUS P3V_4X > on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 8 entries at 0xc00f0e60
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
acpi_cpu0: <CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib0: slot 4 INTD is routed to irq 9
pcib0: slot 9 INTA is routed to irq 9
pcib0: slot 10 INTA is routed to irq 9
pcib0: slot 11 INTA is routed to irq 10
pcib0: slot 12 INTA is routed to irq 11
agp0: <VIA 82C691 (Apollo Pro) host to PCI bridge> mem 0xe4000000-0xe7ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
isab0: <PCI-ISA bridge> at device 4.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 82C596B UDMA66 controller> port 0xd800-0xd80f at device 4.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
uhci0: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 9 at device 4.2 on pci0
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ulpt0: HewLett Packard HP LaserJet 1200, rev 1.10/1.00, addr 2, iclass 7/1
ulpt0: using bi-directional mode
ukbd0: Belkin Components USB-PS2 Adapter, rev 1.10/1.20, addr 3, iclass 3/1
kbd0 at ukbd0
ums0: Belkin Components USB-PS2 Adapter, rev 1.10/1.20, addr 3, iclass 3/1
ums0: 5 buttons and Z dir.
viapropm0: SMBus I/O base at 0xe800
viapropm0: <VIA VT82C596A Power Management Unit> port 0xe800-0xe80f at device 4.3 on pci0
viapropm0: SMBus revision code 0x0
smbus0: <System Management Bus> on viapropm0
smb0: <SMBus generic I/O> on smbus0
fxp0: <Intel 82559 Pro/100 Ethernet> port 0xd000-0xd03f mem 0xd6800000-0xd68fffff,0xd7000000-0xd7000fff irq 9 at device 9.0 on pci0
fxp0: Ethernet address 00:d0:b7:0e:3a:4a
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1: <Intel 82559 Pro/100 Ethernet> port 0xb800-0xb83f mem 0xd5800000-0xd58fffff,0xd6000000-0xd6000fff irq 9 at device 10.0 on pci0
fxp1: Ethernet address 00:d0:b7:9e:bb:dd
miibus1: <MII bus> on fxp1
inphy1: <i82555 10/100 media interface> on miibus1
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sym0: <875> port 0xb400-0xb4ff mem 0xd4800000-0xd4800fff,0xd5000000-0xd50000ff irq 10 at device 11.0 on pci0
sym0: Tekram NVRAM, ID 7, Fast-20, SE, parity checking
pci0: <display, VGA> at device 12.0 (no driver attached)
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f2-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
orm0: <Option ROMs> at iomem 0xd4000-0xd4fff,0xd0000-0xd0fff,0xcc000-0xcffff,0xc0000-0xcafff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio2 at port 0x3e8-0x3ef irq 5 on isa0
sio2: type 16450
sio3: configured irq 9 not in bitmap of probed irqs 0
sio3: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 936743135 Hz quality 800
Timecounters tick every 10.000 msec
acpi_cpu: throttling enabled, 16 steps (100% to 6.2%), currently 100.0%
GEOM: create disk ad0 dp=0xc639f760
ad0: 114473MB <WDC WD1200JB-00DUA3> [232581/16/63] at ata0-master UDMA66
Waiting 15 seconds for SCSI devices to settle
(probe1:sym0:0:1:0): phase change 6-7 6@002fe78c resid=4.
(probe2:sym0:0:2:0): phase change 6-2 6@0035f98c resid=5.
sa0 at sym0 bus 0 target 6 lun 0
sa0: <SEAGATE DAT 9SP40-000 9100> Removable Sequential Access SCSI-3 device
sa0: 40.000MB/s transfers (20.000MHz, offset 16, 16bit)
GEOM: create disk cd0 dp=0xc6394600
GEOM: create disk da0 dp=0xc63aec50
(cd0:sym0:0:2:0): phase change 6-2 6@0035f98c resid=5.
cd0 at sym0 bus 0 target 2 lun 0
cd0: <RICOH RO-1420C 1.61> Removable CD-ROM SCSI-2 device
cd0: 3.300MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
da0 at sym0 bus 0 target 1 lun 0
da0: <IOMEGA ZIP 100 J.03> Removable Direct Access SCSI-2 device
da0: 3.300MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
(cd0:sym0:0:2:0): phase change 6-2 6@0035f98c resid=5.
(cd0:sym0:0:2:0): phase change 6-2 6@0035f98c resid=5.
(cd0:sym0:0:2:0): phase change 6-2 6@0035f98c resid=5.
(cd0:sym0:0:2:0): phase change 6-2 6@0035f98c resid=5.
(da0:sym0:0:1:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:sym0:0:1:0): CAM Status: SCSI Status Error
(da0:sym0:0:1:0): SCSI Status: Check Condition
(da0:sym0:0:1:0): NOT READY asc:3a,0
(da0:sym0:0:1:0): Medium not present
(da0:sym0:0:1:0): Unretryable error
Opened disk da0 -> 6
(da0:sym0:0:1:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:sym0:0:1:0): CAM Status: SCSI Status Error
(da0:sym0:0:1:0): SCSI Status: Check Condition
(da0:sym0:0:1:0): NOT READY asc:3a,0
(da0:sym0:0:1:0): Medium not present
(da0:sym0:0:1:0): Unretryable error
Opened disk da0 -> 6
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
WARNING: /home was not properly dismounted
/home: mount pending error: blocks 4 files 1
/home: superblock summary recomputed
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
/usr: mount pending error: blocks 24 files 2
WARNING: /usr/export was not properly dismounted
WARNING: /usr/share was not properly dismounted
WARNING: /var was not properly dismounted
/var: mount pending error: blocks 360 files 7
/var: superblock summary recomputed
WARNING: /var/amanda was not properly dismounted
/var/amanda: superblock summary recomputed
WARNING: /var/jail was not properly dismounted
/var/jail: mount pending error: blocks 5420 files 3
/var/jail: superblock summary recomputed

-- 
Kirk Strauser

"94 outdated ports on the box, 94 outdated ports.
 Portupgrade one, an hour 'til done, 82 outdated ports on the box."