About a month ago, I bought a new SATA controller and a 160 GB Seagate SATA drive for my -CURRENT machine. All was working fine until about a week ago. Then, the drive started experiencing hard, unrecoverable DMA errors. I RMA'd the drive, then bought a new Maxtor 80 GB SATA drive (just yesterday). I started a buildworld on this drive, and it religiously fails about half-way through all the time (never at exactly the same place twice, however).

The kernels I had when the failures occurred were:

FreeBSD fugu.marcuscom.com 5.2-BETA FreeBSD 5.2-BETA #0: Mon Nov 24 23:14:49 EST 2003 gnome_at_fugu.marcuscom.com:/space/obj/usr/src/sys/FUGU i386

FreeBSD fugu.marcuscom.com 5.1-CURRENT FreeBSD 5.1-CURRENT #0: Mon Nov 17 21:23:07 EST 2003 gnome_at_fugu.marcuscom.com:/space/obj/usr/src/sys/FUGU i386

Kernels before that did not experience the problem.

The buildworld fails with an Input/Output error, then I see the following on the console:

Nov 26 02:35:12 fugu kernel: ad4: WARNING - WRITE_DMA recovered from missing interrupt
Nov 26 02:35:12 fugu kernel: ad4: FAILURE - WRITE_DMA status=ff<BUSY,READY,DMA_READY,DSC,DRQ,CORRECTABLE,INDEX,ERROR> error=0
Nov 26 02:35:22 fugu kernel: ad4: WARNING - READ_DMA recovered from missing interrupt
Nov 26 02:35:22 fugu kernel: ad4: FAILURE - READ_DMA status=ff<BUSY,READY,DMA_READY,DSC,DRQ,CORRECTABLE,INDEX,ERROR> error=0
...
Nov 26 04:37:24 fugu kernel: ad4: timeout sending command=ca
Nov 26 04:37:24 fugu kernel: ad4: error issuing DMA command

At this point, the machine is unusable, and the above two lines scroll by continuously until the machine is rebooted.
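As a side note on reading the status=ff value in the FAILURE lines: the small sketch below decodes a status byte into the flag names the kernel printed. It assumes the names in the log are listed from bit 7 down to bit 0, which is inferred only from the order in which they appear for status=ff; the names STATUS_BITS and decode_status are made up for this illustration and are not taken from the ATA driver. A status with every bit set may simply mean the device was not answering at all, but that is a guess and not something these logs prove.

```python
# Flag names in the order the kernel printed them for status=ff.
# Assumption: that order runs from bit 7 (0x80) down to bit 0 (0x01).
STATUS_BITS = [
    "BUSY", "READY", "DMA_READY", "DSC",
    "DRQ", "CORRECTABLE", "INDEX", "ERROR",
]


def decode_status(status):
    """Return the names of the flags set in an 8-bit status value."""
    names = []
    for i, name in enumerate(STATUS_BITS):
        mask = 0x80 >> i  # walk from the top bit down to the bottom bit
        if status & mask:
            names.append(name)
    return names


if __name__ == "__main__":
    # Prints the same flag list the console showed for status=ff.
    print(",".join(decode_status(0xFF)))
```

With 0xff every flag is reported at once, including BUSY together with READY and ERROR, which is why the value looks more like a bus-level read failure than a meaningful drive status.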
Here are the dmesg specifics for the controller and drive:

atapci1: <SiI 3112 SATA150 controller> port 0x14b0-0x14bf,0x14c0-0x14c3,0x14c8-0x14cf,0x14c4-0x14c7,0x14d0-0x14d7 mem 0xe800a000-0xe800a1ff irq 9 at device 16.0 on pci0
GEOM: create disk ad4 dp=0xc5246460
ad4: 78167MB <Maxtor 6Y080M0> [158816/16/63] at ata2-master UDMA133

Nothing else was changed in the machine except the specific version of -CURRENT since the time things worked and now. In addition to replacing the drive, I have replaced the SATA cable as well.

My plan is to revert the ATA drivers to two weeks ago, and see if the problem persists. Failing that, I will test to see if this is a cooling problem. Failing that, I will replace the SATA controller. However, I wanted to know if I'm barking up the wrong tree, and perhaps this is a software issue.

Thanks.

Joe

--
PGP Key : http://www.marcuscom.com/pgp.asc