In a different thread, I (garance) wrote:
>
>At 10:30 PM +0100 11/18/04, Søren Schmidt wrote:
>>Garance A Drosihn wrote:
>>
>>>I am trying to pin down problems "FAILURE - WRITE_DMA timed out"
>>>in a new PC that I have. I had some local shop build this for me,
>>>and apparently there were "a few" miscommunications in what I
>>>thought I asked for, and what they actually built.
>>>
>>>The machine ended up with two SATA controllers:
>>>  atapci0: <SiI 3112 SATA150 controller> -- on the motherboard
>>>  atapci1: <VIA 6420 SATA150 controller> -- on a PCI card
>>
>>I think its the other way around, the VIA chip is part of the
>>motherboard chipset, the SiI is a "loose" PCI compatible chip.
>
>Ugh. You are correct. Somewhere along the line I got the two
>mixed up. So now have I removed the PCI-based SATA card, and
>connected the Western Digital hard disk to the on-board SATA.
>I have just done a complete buildworld/installworld cycle for
>5.3-STABLE. I did not see a single WRITE_DMA time-out message.

So far so good.

>But looking around the web for awhile, it looks like this model of
>Western Digital is not a native SATA drive. So I think I will
>replace it just to avoid any further hassles, even though I did not
>get any errors with this drive once I was using the right controller.

I have now switched from that Western Digital drive to a Seagate
Barracuda 7200.7 120-gig (ST3120026AS). The drive seems to be
working fairly well, but now I sometimes see some combination
like the following three lines:

Dec 2 20:29:50 kernel: Interrupt storm detected on "irq20: atapci0"; throttling interrupt source
Dec 2 20:29:54 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=20627679
Dec 2 20:29:54 kernel: ad4: FAILURE - WRITE_DMA timed out

Where atapci0: <VIA 6420 SATA150 controller>
And   ad4: 114473MB <ST3120026AS/3.56> [232581/16/63] at ata2-master SATA150

This does not come up often, and it usually doesn't cause any
noticeable problem.
As luck would have it, the one time it has caused problems is
during installworlds. I just did 18 buildworlds in a row without
any problem. I built and installed the new kernel, rebooted into
single-user, and the system panicked early in the installworld. I
rebooted into single-user again, and this time it was *almost*
finished with installworld when the system simply hung after an
"ad4: FAILURE - WRITE_DMA timed out" message.

Now I'm back up in multi-user mode, and I just completed another
buildworld without any problem. I did get the above set of
messages, but nothing after that. (I did see several sets of
WRITE_DMA error messages during the installworlds.)

This is on a recent snapshot of 5.3-STABLE. Should I just switch
back to the Western Digital? Or is it that the new disk is fast
enough that the kernel *thinks* something is wrong with it, and
starts throttling it? Or maybe I have a bad SATA cable? If it
wasn't for the panics/hangs during installworld, I would think that
everything was working quite well. Of course, that is about the
worst time to be getting system panics!

I tried getting a core dump of the panic, but 'call doadump'
complained that no dump device had been set. I'm now looking at
/etc/rc.d/dumpon so I should know how to set that up the next time
I'm in single-user mode.

-- 
Garance Alistair Drosehn            =   gad_at_gilead.netel.rpi.edu
Senior Systems Programmer           or  gad_at_freebsd.org
Rensselaer Polytechnic Institute    or  drosih_at_rpi.edu

Received on Fri Dec 03 2004 - 01:31:41 UTC