Re: Apparently spurious ZFS CRC errors (was Re: ZFS data error without reasons)

From: Pieter de Goeje <pieter_at_degoeje.nl>
Date: Wed, 25 Mar 2009 22:51:22 +0100
On Wednesday 25 March 2009 21:46:27 army.of.root wrote:
> Alexey Shuvaev wrote:
> > On Wed, Mar 25, 2009 at 07:38:32PM +0100, Bernd Walter wrote:
> >> On Wed, Mar 25, 2009 at 06:04:08PM +0000, Mark Powell wrote:
> >>> On Wed, 25 Mar 2009, Bernd Walter wrote:
> >>>> On Wed, Mar 25, 2009 at 03:21:28PM +0100, Alexander Leidinger wrote:
> >>>> I wouldn't be surprised if the problem is in the drive firmware.
> >>>> Preread and WC both have the potential to put a lot of load on the drives
> >>>> and can trigger bugs that otherwise wouldn't matter.
> >>>
> >>> I've emailed WD support for more info. Not expecting much though.
> >>> From reading other threads on these Green Power drives they seem rather
> >>> crap. This is my model and firmware:
> >>>
> >>> http://www.datacent.com/datarecovery/hdd/western_digital/WD10EADS-00L5B1
> >>>
> >>> There's some head park problem too, but with 5s ZFS sync I don't think
> >>> it applies in this case:
> >>>
> >>> http://www.silentpcreview.com/forums/viewtopic.php?t=51401&postdays=0&postorder=asc&start=120&sid=a1caf68d80ef8fecc5d9e86defde4c19
> >>> http://kerneltrap.org/mailarchive/linux-kernel/2008/4/9/1386304
> >>>
> >>>> I also have a system running WD drives and ECC RAM which show CRC
> >>>> errors from time to time, while all other systems have no CRC problem
> >>>> at all.
> >>>
> >>> Interesting. Are those CRC problems with WC on or off?
> >>
> >> WC is on, prefetch is off, but only because it had bad performance with
> >> MySQL.
> >> Drives are <WDC WD3200AAKS-00SBA0/12.01B01> Serial ATA II
> >> I don't know if it is with the drives, but other reasons are less
> >> likely in my opinion.
> >> The system is located in a data center and since I only get a few errors
> >> I decided to live with it and not to debug it further.
> >
> > Hello!
> >
> > Me too...
> >
> > I don't use zfs, just ufs2 + soft updates, but I see sometimes rather
> > heavy data corruption (most often on / filesystem).
> > No kernel messages, I can shut down the system successfully just
> > to find the remnants of filesystems on the next boot.
> > It doesn't happen often, I think compiling ports in a jail + some
> > activity in the host increase the probability of a failure.
> >
> > The drive is:
> > ATA channel 3:
> >     Master:  ad6 <WDC WD5000AAKS-00C8A0/12.01C02> SATA revision 2.x
> >
> > hw.ata.wc=1 (default)
> >
> > FWIW,
> > Alexey.
>
> Hi :)
>
> Damn f**k ! - I just bought WD harddrives for my Workstation...

I find this all very odd because I have a pair of WDC WD6400AAKS-00A7B0 drives
using UFS2 + SU on the latest -CURRENT (i386) and no corruption at all (as
far as I can tell, of course). The drives are pretty fast btw. Just for fun I
gstriped both of them and got speeds in excess of 200MB/sec.

On another PC I've got a WDC WD10EACS-00ZJB0 1TB "Green Power" disk, also 
using UFS2+SU and no problems at all. That PC is running 7-STABLE however.

>
> is there any way to detect silent data corruption without ZFS ?

Perhaps if you mirror a couple of drives you can manually verify that they 
have the same md5sum.
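
A minimal sketch of that check in Python, assuming the two copies are
readable as ordinary files or device nodes and are not being written to
while you read them. The function names and the paths in the usage note are
hypothetical, and MD5 is used only because it matches the md5sum idea above.
Raw mirror members may legitimately differ in on-disk mirror metadata, so
comparing the same file on two independent copies is probably the safer test.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file or device node, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both paths hash to the same MD5 digest (hypothetical helper)."""
    return md5_of(path_a) == md5_of(path_b)
```

For example, same_content("/mnt/a/file", "/mnt/b/file") returns True when
both copies hash identically. A mismatch only tells you that one copy is
wrong, not which one.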
>
> best regards
>
> PS: Thanks for all your work, I'm looking so forward to 8.0 I can't even
> tell you how much :)

Me too :-)

--
Pieter de Goeje
Received on Wed Mar 25 2009 - 20:51:30 UTC
