Re: ZFS: Silent/hidden errors, nothing logged anywhere

From: Thomas Backman <serenity_at_exscape.org>
Date: Sat, 13 Jun 2009 17:13:00 +0200
On Jun 13, 2009, at 05:06 PM, Pawel Jakub Dawidek wrote:

> On Fri, Jun 12, 2009 at 02:01:57PM -0700, Kip Macy wrote:
>> On Fri, Jun 12, 2009 at 10:32 AM, Thomas Backman <serenity_at_exscape.org> wrote:
>>> OK, so I filed a PR late May (kern/135050):
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=135050 .
>>> I don't know if this is a "feature" or a bug, but it really should be
>>> considered the latter. The data could be repaired in the background without
>>> the user ever knowing - until the disk dies completely. I'd prefer to have
>>> warning signs (i.e. checksum errors) so that I can buy a replacement drive
>>> *before* that.
>>>
>>> Not only does this mean that errors can go unnoticed, but also that it's
>>> impossible to figure out which disk is broken, if ZFS has *temporarily*
>>> repaired the broken data! THAT is REALLY bad!
>>> Is this something that we can expect to see changed before 8.0-RELEASE?
>>
>>
>> I'm fairly certain that we've discussed this already. Solaris uses FMA
>> - I don't think that I'll get to a "real fix" any time soon. The time
>> that I do have will go to addressing stability problems (memory
>> over-allocation, NFS interaction, control directory mounts) all of
>> which cause panics. Maintaining them persistently in the label doesn't
>> make sense - when do you drop them? Would a simple log message about
>> the number of checksum errors suffice?
>
> We do log such errors. Solaris uses FMA and for FreeBSD I use devd. You
> can find the following entry in /etc/devd.conf:
> ...
> If you see nothing in your logs, there must be a bug with reporting the
> problem somewhere or devd is not running (it should be enabled by
> default).

Awesome! After checking further I did indeed find a bunch of such
messages in messages.0.bz2.
One thing less to worry about, I guess. :)
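
For anyone else who wants to check the same thing: the messages had already been rotated into a compressed file, so a plain grep of /var/log/messages would miss them. Below is a minimal sketch that searches current and rotated logs together. The function names, the /var/log/messages* glob and the "ZFS" search string are all assumptions of mine; the actual message text depends on the devd.conf entry, so adjust the pattern to whatever your entry logs.

```python
import bz2
import glob
import gzip
import re


def open_log(path):
    """Open a plain, .bz2, or .gz log file in text mode."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", errors="replace")
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def find_zfs_messages(pattern="/var/log/messages*", needle=r"ZFS"):
    """Return (path, line) pairs for every log line matching `needle`.

    Both defaults are assumptions: the log location and the text that
    devd's action writes may differ on your system.
    """
    regex = re.compile(needle)
    hits = []
    for path in sorted(glob.glob(pattern)):
        with open_log(path) as handle:
            for line in handle:
                if regex.search(line):
                    hits.append((path, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Print each match prefixed with the file it came from.
    for path, line in find_zfs_messages():
        print(f"{path}: {line}")
```

Reading the log files normally requires sufficient privileges; run it as a user that can read /var/log.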

Regards,
Thomas
Received on Sat Jun 13 2009 - 13:13:13 UTC
