On Mon, May 25, 2009 at 9:12 AM, Thomas Backman <serenity_at_exscape.org> wrote:

> On May 25, 2009, at 05:39 PM, Freddie Cash wrote:
>> On Mon, May 25, 2009 at 2:13 AM, Thomas Backman <serenity_at_exscape.org>
>> wrote:
>>> On May 24, 2009, at 09:02 PM, Thomas Backman wrote:
>>>
>>>> So, I was playing around with RAID-Z and self-healing...
>>>
>>> Yet another follow-up to this.
>>> It appears that all traces of errors vanish after a reboot. So, say you
>>> have a dying disk; ZFS repairs the data for you, and you don't notice
>>> (unless you check zpool status). Then you reboot, and there's NO (easy?)
>>> way, as far as I can tell, to find out that something is wrong with your
>>> hardware!
>>
>> On our storage server, which was initially configured using one large
>> 24-drive raidz2 vdev (don't do that, by the way), we had one drive go
>> south. "zpool status" was full of errors, and the error counts survived
>> reboots. Either that, or the drive was so bad that the error counts
>> started increasing right away after every boot. After a week of fighting
>> with it to get the new drive to resilver and get added to the vdev, we
>> nuked the pool and re-created it using 3 raidz2 vdevs, each comprised of
>> 8 drives.
>>
>> (Un)fortunately, that was the only failure we've had so far, so we can't
>> really confirm or deny the "error counts reset after reboot" behaviour.
>
> Was this on FreeBSD?

64-bit FreeBSD 7.1 using ZFS v6. SATA drives connected to 3Ware RAID
controllers, but configured as "Single Drive" arrays, not using hardware
RAID in any way.

> I have another unfortunate thing to note regarding this: after a reboot,
> it's even impossible to tell *which disk* has gone bad, even if the pool
> is "uncleared" but otherwise "healed". It simply says that a device has
> failed, with no clue as to which one, since they're all "ONLINE"!

Even when using -v?

zpool status -v

--
Freddie Cash
fjwcash_at_gmail.com

Received on Mon May 25 2009 - 14:19:25 UTC
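
For reference, a minimal sketch of the commands being discussed. The pool
name "tank" and the da0..da24 device names are made up for illustration and
are not taken from the server described above:

    # Create a pool from three 8-drive raidz2 vdevs instead of one wide
    # 24-drive raidz2 vdev (pool name and device names are hypothetical):
    zpool create tank \
        raidz2 da0  da1  da2  da3  da4  da5  da6  da7  \
        raidz2 da8  da9  da10 da11 da12 da13 da14 da15 \
        raidz2 da16 da17 da18 da19 da20 da21 da22 da23

    # Show per-device read/write/checksum error counts, plus any files
    # with permanent (unrecoverable) errors:
    zpool status -v tank

    # Replace a failed drive (da5 here) with a spare and let it resilver:
    zpool replace tank da5 da24

    # Note: explicitly clearing the counters also discards the record of
    # which device was reporting errors:
    zpool clear tank

Splitting the drives into narrower raidz2 vdevs is generally done for faster
resilvers and better random I/O, at the cost of spending more disks on parity.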