Re: ZFS checksum errors on umass(4) insertion

From: Damian Gerow <dgerow_at_afflictions.org>
Date: Thu, 16 Apr 2009 10:42:51 -0400

John Baldwin wrote:
: I have no idea how this would break what you are seeing.  The 
: zfs_get_xattrdir() function is only called from zfs_lookup() when 
: LOOKUP_XATTR is specified, and that only happens from the extended attribute 
: VOP routines.  Are you using extended attributes at all?  Also, have you 
: tried running with INVARIANTS and DEBUG_VFS_LOCKS to catch missing locks?

I've spent most of the past week running tests, in various combinations,
against sources from about two weeks ago.  I've been using a standard
GENERIC kernel with two modifications: I've removed umass, and added
DEBUG_VFS_LOCKS.  I also set vfs.zfs.debug=1, while debug.vfs_* and
debug.mpsafevfs are all left at their defaults of 1.

What I've found:

1) Reverting the extended attribute locking change (r189967) does not change
the situation for me.  I still experience checksum issues and data loss.
(Unsurprisingly.)

2) Without umass loaded, I have been completely unable to trigger the issue.

3) Once umass is loaded, and the symptoms start cropping up, unloading umass
does not make them go away (again, unsurprisingly).  What I haven't yet
tested, but am currently working towards, is whether removing umass stops
further checksum errors from occurring.

4) r189967 does remove some LORs for me, even though I don't use extended
attributes (that I know of).

5) It seems that so long as umass is used at all, the symptoms will
eventually show up.  I've been able to trigger the symptoms by inserting
then removing a umass device immediately after boot, then ramping up the
workload.

6) The only difference made by vfs.zfs.debug=1 is that zfs reclaims are
logged.

I'm at a bit of a loss as to what to test next, other than checking for an
increased number of checksum errors after unloading umass.  However, I'm not
convinced this is going to highlight the actual problem.  I'm all ears as to
what to test for at this point, as I'm running out of ideas.
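
The before/after comparison of checksum counts could be scripted rather
than eyeballed.  What follows is only a rough sketch: it assumes the usual
"NAME STATE READ WRITE CKSUM" table layout of `zpool status` with plain
integer counters, and the pool and device names in the sample text are made
up for illustration, not taken from the affected machine.  The helper names
checksum_errors() and new_errors() are likewise hypothetical.

```python
def checksum_errors(status_text):
    """Map each device name in `zpool status` output to its CKSUM count.

    Assumes plain integer counters; rows whose READ/WRITE/CKSUM fields are
    not all digits are skipped.  Repeated names (e.g. several "mirror"
    vdevs) are summed.
    """
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True            # header row starts the config table
        elif not fields:
            in_table = False           # a blank line ends the table
        elif in_table and len(fields) >= 5 and all(
            f.isdigit() for f in fields[2:5]
        ):
            counts[fields[0]] = counts.get(fields[0], 0) + int(fields[4])
    return counts


def new_errors(before, after):
    """Return devices whose CKSUM count grew, with the size of the increase."""
    return {
        dev: n - before.get(dev, 0)
        for dev, n in after.items()
        if n > before.get(dev, 0)
    }


# Made-up sample output, for illustration only.
SAMPLE_BEFORE = """\
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ad4s1d    ONLINE       0     0     0

errors: No known data errors
"""

SAMPLE_AFTER = SAMPLE_BEFORE.replace("0     0     0", "0     0     3")

if __name__ == "__main__":
    grown = new_errors(checksum_errors(SAMPLE_BEFORE),
                       checksum_errors(SAMPLE_AFTER))
    for dev, delta in sorted(grown.items()):
        print(f"{dev}: +{delta} checksum errors")
```

In practice the two text snapshots would come from running `zpool status`
once right after unloading umass and again after the workload has run for a
while; an empty result from new_errors() would suggest the errors stopped
accumulating.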

A little less wordy: help?

  - Damian
Received on Thu Apr 16 2009 - 12:42:56 UTC
