Hi all,

A big mysterious bug is lingering somewhere.
(Machine: C3, 256MB, 2x 30GB 2.5" IDE, SIL0680 controller)

One of my drives failed with the following, recovered from messages:

Sep 16 01:47:44 tek kernel: ad4: WRITE command timeout tag=0 serv=0 - resetting
Sep 16 01:47:45 tek kernel: ata2: resetting devices ..
Sep 16 01:47:45 tek kernel: ad4: removed from configuration
Sep 16 01:47:45 tek kernel: ar0: WARNING - mirror lost
Sep 16 01:47:45 tek kernel: ad4: deleted from ar0 disk0
Sep 16 01:47:45 tek kernel: done

This was at 1:47, but the machine ran until about 5:30. Then it died (no message!).

When I tried to reboot, the BIOS complained about a missing MBR. And indeed, when I opened the server and connected the drives to another box, BOTH drives had no partition table! I got a correct bsdlabel from both, ad6 and ad6s1.

How can this happen? A bug in ata? A bug in GEOM? Nobody was logged in, and nobody else can log in, so the machine itself must have deleted it. I'm really sure of that!

My fix was to boot the fixit CD and write a new partition table with:

fdisk -I -B -b /boot/boot1 ar0
fdisk -u ar0 (to change the starting sector from 63 to 0)

fsck found a few errors, but the server is up and running again.

Søren: I remember you're planning better RAID management support. Will it be possible to control ar0 from the controller's BIOS in the future? When I rebuilt the array with the BIOS (which took 6 hours!), FreeBSD still reported a degraded RAID1. This was really annoying.

Thanks,

-Harry