Re: repeatable ZFS panic: share->excl

From: Anonymous <swell.k_at_gmail.com>
Date: Fri, 13 Mar 2009 18:52:38 +0300
John Baldwin <john_at_baldwin.cx> writes:

> On Mar 12, 2009, at 11:35 PM, Attilio Rao <attilio_at_freebsd.org> wrote:
>
>> 2009/3/12, Anonymous <swell.k_at_gmail.com>:
>>> Tim Kientzle <kientzle_at_freebsd.org> writes:
>>>
>>>> Peter Schuller wrote:
>>>>>>  First problem I notice is a panic, which seems to occur every
>>>>>> time 'tar' is used. Might be unrelated to tar, but it definately
>>>>>> provokes it. Simply the tar command or a pkg_add causes it.
>>>>>
>>>>> See the "ZFS/extattr lockup"/"bsdtar lockup on current" threads
>>>>> from
>>>>> the past handful of days.
>>>>
>>>> I think this may be a different problem.
>>>> The earlier thread involved a ZFS bug that causes
>>>> it to lock up if it receives a request to enumerate
>>>> extended attributes on a file (via extattr_list_link
>>>> system call).  Tar recently added support for
>>>> backing up extended attributes.  I've disabled
>>>> that support until this particular ZFS problem can
>>>> be fixed.
>>>
>>>
>>> I guess you're wrong, it's same issue. Here is output from unmodified
>>> kernel (r189728) under qemu which looks exactly as on that
>>> screenshot. Perhaps, the panic is triggered by INVARIANTS.
>>>
>>> # lsextattr -h user foo
>>> shared lock of (lockmgr) zfs _at_ /usr/src/sys/kern/vfs_lookup.c:477
>>> while exclusively locked from /usr/src/sys/modules/zfs/../../cddl/
>>> contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:152
>>> panic: share->excl
>>> cpuid = 0
>>> KDB: enter: panic
>>> [thread pid 105 tid 100078 ]
>>> Stopped at      kdb_enter+0x3d: movq    $0,0x662538(%rip)
>>>
>>> db> bt
>>> Tracing pid 105 tid 100078 td 0xffffff00015bea80
>>> kdb_enter() at kdb_enter+0x3d
>>> panic() at panic+0x17b
>>> witness_checkorder() at witness_checkorder+0x16e
>>> __lockmgr_args() at __lockmgr_args+0xd1b
>>> vop_stdlock() at vop_stdlock+0x39
>>> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
>>> _vn_lock() at _vn_lock+0x57
>>> lookup() at lookup+0xf4
>>> namei() at namei+0x545
>>> zfs_listextattr() at zfs_listextattr+0x18c
>>> VOP_LISTEXTATTR_APV() at VOP_LISTEXTATTR_APV+0xb5
>>> extattr_list_vp() at extattr_list_vp+0x22a
>>> extattr_list_link() at extattr_list_link+0xc3
>>> syscall() at syscall+0x1e7
>>> Xfast_syscall() at Xfast_syscall+0xab
>>> --- syscall (439, FreeBSD ELF64, extattr_list_link), rip =
>>> 0x800692e4c, rsp = 0x7fffffffed08, rbp = 0x7fffffffede0 ---
>>>
>>> db> show all locks
>>> Process 105 (lsextattr) thread 0xffffff00015bea80 (100078)
>>> exclusive lockmgr zfs (zfs) r = 0 (0xffffff0001581578) locked _at_ /
>>> usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/
>>> fs/zfs/zfs_znode.c:152
>>> exclusive lockmgr zfs (zfs) r = 0 (0xffffff0001581a58) locked _at_ /
>>> usr/src/sys/kern/vfs_extattr.c:668
>>>
>>> db> show lockedvnods
>>> Locked vnodes
>>>
>>> 0xffffff00015819c0: tag zfs, type VREG
>>>    usecount 1, writecount 0, refcount 1 mountedhere 0
>>>    flags ()
>>>    v_object 0xffffff0001574898 ref 0 pages 0
>>>    lock type zfs: EXCL by thread 0xffffff00015bea80 (pid 105)
>>> #0 0xffffffff80530568 at __lockmgr_args+0x758
>>> #1 0xffffffff805c0fd9 at vop_stdlock+0x39
>>> #2 0xffffffff808557db at VOP_LOCK1_APV+0x9b
>>> #3 0xffffffff805dd6a7 at _vn_lock+0x57
>>> #4 0xffffffff805c2f20 at extattr_list_vp+0xb0
>>> #5 0xffffffff805c3173 at extattr_list_link+0xc3
>>> #6 0xffffffff8080f227 at syscall+0x1e7
>>> #7 0xffffffff807ec39b at Xfast_syscall+0xab
>>>
>>> 0xffffff00015814e0: tag zfs, type VDIR
>>>    usecount 1, writecount 0, refcount 1 mountedhere 0
>>>    flags ()
>>>    lock type zfs: EXCL by thread 0xffffff00015bea80 (pid 105)
>>> #0 0xffffffff80530568 at __lockmgr_args+0x758
>>> #1 0xffffffff805c0fd9 at vop_stdlock+0x39
>>> #2 0xffffffff808557db at VOP_LOCK1_APV+0x9b
>>> #3 0xffffffff805dd6a7 at _vn_lock+0x57
>>> #4 0xffffffff8108405e at zfs_znode_cache_constructor+0x4e
>>> #5 0xffffffff81085029 at zfs_znode_alloc+0x39
>>> #6 0xffffffff81085915 at zfs_mknode+0x205
>>> #7 0xffffffff810948f5 at zfs_make_xattrdir+0x155
>>> #8 0xffffffff81095bc3 at zfs_get_xattrdir+0xd3
>>> #9 0xffffffff810a4dbf at zfs_lookup+0x11f
>>> #10 0xffffffff810a51d8 at zfs_listextattr+0x128
>>> #11 0xffffffff80853255 at VOP_LISTEXTATTR_APV+0xb5
>>> #12 0xffffffff805c309a at extattr_list_vp+0x22a
>>> #13 0xffffffff805c3173 at extattr_list_link+0xc3
>>> #14 0xffffffff8080f227 at syscall+0x1e7
>>> #15 0xffffffff807ec39b at Xfast_syscall+0xab
>>
>> Could you please re-enable the extended attributes on bsdtar, try this
>> patch and report to me?:
>> http://www.freebsd.org/~attilio/zfs_vnops.diff

Before that I'd like lsextattr(8) to at least not panic.. Because I get
same panic with tar(1). Here is output from r189756 + your patch

# lsextattr -h user foo
vrele: negative ref count
0xffffff0001501270: tag zfs, type VDIR
    usecount 0, writecount 0, refcount 0 mountedhere 0
    flags (VI_FREE)
 VI_LOCKed    lock type zfs: UNLOCKED
#0 0xffffffff80531d8a at __lockmgr_args+0x51a
#1 0xffffffff805c2909 at vop_stdlock+0x39
#2 0xffffffff808571eb at VOP_LOCK1_APV+0x9b
#3 0xffffffff805defd7 at _vn_lock+0x57
#4 0xffffffff805c6e94 at lookup+0xf4
#5 0xffffffff805c81e5 at namei+0x545
#6 0xffffffff810a5d6d at zfs_listextattr+0x1fd
#7 0xffffffff80854c65 at VOP_LISTEXTATTR_APV+0xb5
#8 0xffffffff805c49ca at extattr_list_vp+0x22a
#9 0xffffffff805c4aa3 at extattr_list_link+0xc3
#10 0xffffffff80810c37 at syscall+0x1e7
#11 0xffffffff807eddab at Xfast_syscall+0xab
panic: vrele: negative ref cnt
cpuid = 0
KDB: enter: panic
[thread pid 84 tid 100077 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x667d88(%rip)
db> bt
Tracing pid 84 tid 100077 td 0xffffff0001572000
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
vrele() at vrele+0x11a
zfs_listextattr() at zfs_listextattr+0x464
VOP_LISTEXTATTR_APV() at VOP_LISTEXTATTR_APV+0xb5
extattr_list_vp() at extattr_list_vp+0x22a
extattr_list_link() at extattr_list_link+0xc3
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (439, FreeBSD ELF64, extattr_list_link), rip = 0x800692e9c, rsp = 0x7fffffffed08, rbp = 0x7fffffffede0 ---
db> show all locks
Process 84 (lsextattr) thread 0xffffff0001572000 (100077)
exclusive lockmgr zfs (zfs) r = 0 (0xffffff00015017e8) locked _at_ /usr/src/sys/kern/vfs_extattr.c:668
db> show lockedvnods
Locked vnodes

0xffffff0001501750: tag zfs, type VREG
    usecount 1, writecount 0, refcount 1 mountedhere 0
    flags ()
    v_object 0xffffff00014bd578 ref 0 pages 0
    lock type zfs: EXCL by thread 0xffffff0001572000 (pid 84)
#0 0xffffffff80531fc8 at __lockmgr_args+0x758
#1 0xffffffff805c2909 at vop_stdlock+0x39
#2 0xffffffff808571eb at VOP_LOCK1_APV+0x9b
#3 0xffffffff805defd7 at _vn_lock+0x57
#4 0xffffffff805c4850 at extattr_list_vp+0xb0
#5 0xffffffff805c4aa3 at extattr_list_link+0xc3
#6 0xffffffff80810c37 at syscall+0x1e7
#7 0xffffffff807eddab at Xfast_syscall+0xab

>>
>> (sorry if it is untested but I just don't have any ZFS machine).
>> Actually, I think the problem is that zfs_lookup() doesn't get the
>> correct handover once the extended attributes directory vnode is
>> retrieved.
>>
>> Thanks,
>> Attilio
>
> This is similar to the patch I've asked lulf_at_ to test except that it
> is longer and I fix a bug where zfs_lookup() can leak a vnode lock if
> the access check fails. :-)  The last one I sent to lulf_at_ is at
> www.FreeBSD.org/~jhb/patches/zfs_ea.patch .

Unless I'm doing smth stupid the panic from your patch looks same as one
from patch by attilio_at_.

%%%
_at__at_ -1,6 +1,6 _at__at_
 # lsextattr -h user foo
 vrele: negative ref count
-0xffffff0001501270: tag zfs, type VDIR
+0xffffff00016f3000: tag zfs, type VDIR
     usecount 0, writecount 0, refcount 0 mountedhere 0
     flags (VI_FREE)
  VI_LOCKed    lock type zfs: UNLOCKED
_at__at_ -10,7 +10,7 _at__at_
 #3 0xffffffff805defd7 at _vn_lock+0x57
 #4 0xffffffff805c6e94 at lookup+0xf4
 #5 0xffffffff805c81e5 at namei+0x545
-#6 0xffffffff810a5d6d at zfs_listextattr+0x1fd
+#6 0xffffffff810a527c at zfs_listextattr+0x18c
 #7 0xffffffff80854c65 at VOP_LISTEXTATTR_APV+0xb5
 #8 0xffffffff805c49ca at extattr_list_vp+0x22a
 #9 0xffffffff805c4aa3 at extattr_list_link+0xc3
_at__at_ -19,14 +19,14 _at__at_
 panic: vrele: negative ref cnt
 cpuid = 0
 KDB: enter: panic
-[thread pid 84 tid 100077 ]
+[thread pid 84 tid 100075 ]
 Stopped at      kdb_enter+0x3d: movq    $0,0x667d88(%rip)
 db> bt
-Tracing pid 84 tid 100077 td 0xffffff0001572000
+Tracing pid 84 tid 100075 td 0xffffff0001578700
 kdb_enter() at kdb_enter+0x3d
 panic() at panic+0x17b
 vrele() at vrele+0x11a
-zfs_listextattr() at zfs_listextattr+0x464
+zfs_listextattr() at zfs_listextattr+0x3f4
 VOP_LISTEXTATTR_APV() at VOP_LISTEXTATTR_APV+0xb5
 extattr_list_vp() at extattr_list_vp+0x22a
 extattr_list_link() at extattr_list_link+0xc3
_at__at_ -34,16 +34,16 _at__at_
 Xfast_syscall() at Xfast_syscall+0xab
 --- syscall (439, FreeBSD ELF64, extattr_list_link), rip = 0x800692e9c, rsp = 0x7fffffffed08, rbp = 0x7fffffffede0 ---
 db> show all locks
-Process 84 (lsextattr) thread 0xffffff0001572000 (100077)
-exclusive lockmgr zfs (zfs) r = 0 (0xffffff00015017e8) locked _at_ /usr/src/sys/kern/vfs_extattr.c:668
+Process 84 (lsextattr) thread 0xffffff0001578700 (100075)
+exclusive lockmgr zfs (zfs) r = 0 (0xffffff00016f3578) locked _at_ /usr/src/sys/kern/vfs_extattr.c:668
 db> show lockedvnods
 Locked vnodes
 
-0xffffff0001501750: tag zfs, type VREG
+0xffffff00016f34e0: tag zfs, type VREG
     usecount 1, writecount 0, refcount 1 mountedhere 0
     flags ()
-    v_object 0xffffff00014bd578 ref 0 pages 0
-    lock type zfs: EXCL by thread 0xffffff0001572000 (pid 84)
+    v_object 0xffffff0001563640 ref 0 pages 0
+    lock type zfs: EXCL by thread 0xffffff0001578700 (pid 84)
 #0 0xffffffff80531fc8 at __lockmgr_args+0x758
 #1 0xffffffff805c2909 at vop_stdlock+0x39
 #2 0xffffffff808571eb at VOP_LOCK1_APV+0x9b
%%%

>
> --
> John Baldwin
Received on Fri Mar 13 2009 - 14:52:48 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:43 UTC