I have a box that until Friday night was running a Nov '05 -CURRENT
solidly. After an upgrade, it started spewing out

  kernel: ad4: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=38617823

style warnings at the slightest provocation. A
"find / -xdev -print | xargs cat >> /dev/null" could bring it about in
a second or two; not uncommonly, the arduous effort of spawning off
'sh' for single-user mode was enough to put it over the cliff.

The system runs an ataraid RAID-1 across ad4 and ad6; which got the
first errors was pretty much luck of the draw on any given boot.
They're on a Promise TX2200 card:

  atapci0: <Promise PDC20571 SATA150 controller> port 0xc000-0xc07f,0xc400-0xc4ff mem 0xeb420000-0xeb420fff,0xeb400000-0xeb41ffff irq 15 at device 13.0 on pci0

The card/drives were tried in 3 very different motherboards, all of
which failed identically. BIOSen were scoured for "make PCI edgy"
options, which were all turned off (though none exhibited an "enable
bus master" option, as one seemingly-related mail thread ended with).
I tried using the loader variable to force the drives to PIO mode to
jam the brakes on, but it didn't seem to work at all (maybe it doesn't
affect SATA?). I tried splitting the RAID so it only dealt with one
drive; made no difference.

The -CURRENT build was from identical sources to those currently
sitting on this machine, so I can supply $Id$'s if it'll help. Sadly,
the system needed to be running, so it's not available for further
experimentation. It ran flawlessly with that Nov '05 -CURRENT, and is
now running flawlessly on RELENG_6.

-- 
Matthew Fuller     (MF4839)   |  fullermd_at_over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

Received on Mon Mar 12 2007 - 09:17:08 UTC