Re: ZFS kernel panic

From: Pawel Jakub Dawidek <pjd_at_FreeBSD.org>
Date: Tue, 28 Aug 2007 22:55:55 +0200
On Tue, Aug 28, 2007 at 01:48:34PM -0700, Bakul Shah wrote:
> Pawel Jakub Dawidek <pjd_at_FreeBSD.org> wrote:
> > On Tue, Aug 28, 2007 at 10:02:42AM -0700, Bakul Shah wrote:
> > > > When you don't use redundant configuration (no mirror, no raidz, no
> > > > copies>1) then ZFS is going to panic on a write failure. It looks like
> > > > ZFS found a bad block on your disk.
> > >
> > > Does SUN really say this about ZFS?  Is this acceptable in a
> > > production environment?  What if one of your mirrored disk
> > > fails and in the "degraded" environment (before you have had
> > > a chance to replace the bad disk) ZFS discovers that a write
> > > fails?  Why can't it find an alternative block to write to?
> > 
> > There were many complaints on zfs-discuss_at_, you may want to look into
> > the archive. The short version is that many users don't like that, and it
> > should change in the future - because of the COW model it should be quite
> > easy to just mark a block as bad and take the next one, but it's not
> > currently implemented. It's much less of a problem when one uses redundancy.
> 
> Good to know others are complaining too :-)
> 
> My real concern is the panic.  This situation may be rare if
> using redundancy + regular scrubbing, but it can definitely
> occur.  And as long as non redundant ZFS is *allowed*, you
> pretty much have to deal with it without any panicking.
> 
> Originally panic() was used to indicate that some *system
> invariant* has been violated.  That either meant a hardware
> error or an unknown software error but in any case some data
> structure was likely corrupted and continuing can make
> matters worse.  But that is not the case here (in general).
> zfs does not have the appropriate information to be able to
> decide whether the write error is fatal.
> 
> The simplest thing to do in case of a write error is to
> simply ignore it.  You *will* catch this problem when you try
> to read this block.  One step better is to do what you
> suggest.

You can't ignore a write error, because the application has already assumed
the write succeeded, which can lead to misbehaviour later. ZFS cannot yet
handle write errors, so it panics to preserve data consistency. This is
the right reaction on the ZFS side until skipping bad blocks is
implemented.
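
To make "skipping bad blocks" concrete, here is a toy sketch of the idea,
not ZFS code. ToyDisk, ToyCowAllocator and write_new are hypothetical names
invented for illustration. Because a copy-on-write filesystem never
overwrites live data in place, a failed write to a freshly allocated block
can in principle be retried on the next free block:

```python
class WriteError(Exception):
    """Raised when a block write fails (hypothetical, for illustration)."""


class ToyDisk:
    """In-memory stand-in for a disk; 'failing' lists blocks that reject writes."""

    def __init__(self, nblocks, failing=()):
        self.blocks = [None] * nblocks
        self.failing = set(failing)

    def write(self, blkno, data):
        if blkno in self.failing:
            raise WriteError(f"write to block {blkno} failed")
        self.blocks[blkno] = data


class ToyCowAllocator:
    """Always writes new data to a fresh block, skipping blocks that fail."""

    def __init__(self, disk):
        self.disk = disk
        self.free = list(range(len(disk.blocks)))
        self.bad = set()

    def write_new(self, data):
        # Try free blocks in order; remember the bad ones and move on.
        while self.free:
            blkno = self.free.pop(0)
            try:
                self.disk.write(blkno, data)
            except WriteError:
                self.bad.add(blkno)
                continue
            return blkno
        # Only give up when no usable block is left at all.
        raise WriteError("no usable blocks left")


if __name__ == "__main__":
    disk = ToyDisk(4, failing={0, 1})
    alloc = ToyCowAllocator(disk)
    print(alloc.write_new(b"payload"), sorted(alloc.bad))
```

The caller only sees an error once every free block has been tried, which
is the behaviour being asked for in place of a panic.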

> What happens now when you do use redundancy and there is a
> write error while writing one of the copies?  Does the system
> panic or is this error ignored?

I don't remember offhand, but the component is probably marked as bad and
the vdev group goes to a degraded state. You can simulate this easily with
gnop(8).
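
As a toy model of the behaviour guessed at above (an assumption, not
verified against the ZFS source): a mirror marks a component as bad when a
write to it fails, and keeps going as long as one copy lands. ToyComponent,
ToyMirror and the state strings are hypothetical names used for
illustration only.

```python
class ToyComponent:
    """One side of a mirror; fail_writes simulates a disk that rejects writes."""

    def __init__(self, name, fail_writes=False):
        self.name = name
        self.fail_writes = fail_writes
        self.faulted = False
        self.data = {}

    def write(self, blkno, data):
        if self.fail_writes:
            raise OSError(f"{self.name}: write error")
        self.data[blkno] = data


class ToyMirror:
    """Writes every block to all healthy components."""

    def __init__(self, components):
        self.components = list(components)

    def state(self):
        healthy = [c for c in self.components if not c.faulted]
        if len(healthy) == len(self.components):
            return "ONLINE"
        return "DEGRADED" if healthy else "FAULTED"

    def write(self, blkno, data):
        succeeded = 0
        for comp in self.components:
            if comp.faulted:
                continue
            try:
                comp.write(blkno, data)
                succeeded += 1
            except OSError:
                # Mark the component as bad instead of failing the whole write.
                comp.faulted = True
        if succeeded == 0:
            # No copy reached stable storage: this is the case that cannot be ignored.
            raise OSError("write failed on every component")


if __name__ == "__main__":
    mirror = ToyMirror([ToyComponent("disk0"), ToyComponent("disk1", fail_writes=True)])
    mirror.write(7, b"payload")
    print(mirror.state())
```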

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_at_FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

Received on Tue Aug 28 2007 - 18:57:09 UTC
