Re: panic: LK_RETRY set with incompatible flags (0x200400) or an error occured (11)

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Tue, 18 Feb 2014 15:31:53 +0200
on 18/02/2014 15:18 Jeremie Le Hen said the following:
> On Sat, Feb 15, 2014 at 02:12:40PM +0200, Andriy Gapon wrote:
>> on 14/02/2014 21:18 Jeremie Le Hen said the following:
>>> I've just got another occurrence of the exact same panic.  Any clue how
>>> to debug this?
>>
>> Could you please obtain *vp from frame 12?
> 
> Sure:
> 
> $1 = {v_tag = 0xffffffff815019a5 "zfs", v_op = 0xffffffff815164a0, 
>   v_data = 0xfffff80010dcb2e0, v_mount = 0xfffff80010dcd660, 
>   v_nmntvnodes = {tqe_next = 0xfffff80010dc7ce8, 
>     tqe_prev = 0xfffff80010dcd6c0}, v_un = {vu_mount = 0x0, 
>     vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, 
>   v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_cache_src = {
>     lh_first = 0xfffff8005aeefcb0}, v_cache_dst = {tqh_first = 0x0, 
>     tqh_last = 0xfffff80010dc8050}, v_cache_dd = 0x0, v_lock = {
>     lock_object = {lo_name = 0xffffffff815019a5 "zfs", 
>       lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, 
>     lk_lock = 18446735277920538624, lk_exslpfail = 0, lk_timo = 51, 
>     lk_pri = 96}, v_interlock = {lock_object = {
>       lo_name = 0xffffffff80b46085 "vnode interlock", 
>       lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, 
>     mtx_lock = 4}, v_vnlock = 0xfffff80010dc8068, v_actfreelist = {
>     tqe_next = 0x0, tqe_prev = 0xfffff80010dc7da8}, v_bufobj = {
>     bo_lock = {lock_object = {
>         lo_name = 0xffffffff80b4e613 "bufobj interlock", 
>         lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, 
>       rw_lock = 1}, bo_ops = 0xffffffff80e2d440, 
>     bo_object = 0xfffff800a30bbd00, bo_synclist = {le_next = 0x0, 
>       le_prev = 0x0}, bo_private = 0xfffff80010dc8000, 
>     __bo_vnode = 0xfffff80010dc8000, bo_clean = {bv_hd = {
>         tqh_first = 0x0, tqh_last = 0xfffff80010dc8120}, bv_root = {
>         pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {
>         tqh_first = 0x0, tqh_last = 0xfffff80010dc8140}, bv_root = {
>         pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, 
>     bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, 
>   v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, 
>       tqh_last = 0xfffff80010dc8188}, rl_currdep = 0x0}, 
>   v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 7, 
>   v_usecount = 6, v_iflag = 512, v_vflag = 1, v_writecount = 0, 
>   v_hash = 3, v_type = VDIR}

So, VV_ROOT (0x0001) is indeed set in v_vflag, and v_type is VDIR, so this
is indeed the root vnode of its filesystem.
Thank you.
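
If the vmcore is still around, the rest of the theory (quoted below) can
be verified directly from the zfs_lookup() frame.  A short kgdb sketch;
the frame number is a placeholder and the flag values are assumptions
from my tree (VV_ROOT is 0x0001 in sys/vnode.h, ISDOTDOT is 0x2000 in
sys/namei.h, LK_EXCLUSIVE is 0x080000 in sys/lockmgr.h):

    (kgdb) frame 13
    (kgdb) print dvp == *vpp
    (kgdb) print/x dvp->v_vflag
    (kgdb) print/x cnp->cn_flags
    (kgdb) print/x cnp->cn_lkflags

If the theory holds, dvp == *vpp evaluates to 1, v_vflag has the VV_ROOT
bit set, cn_flags has the ISDOTDOT bit set, and cn_lkflags is
LK_EXCLUSIVE.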

>> The problem seems to be happening in this piece of ZFS code:
>>                 if (cnp->cn_flags & ISDOTDOT) {
>>                         ltype = VOP_ISLOCKED(dvp);
>>                         VOP_UNLOCK(dvp, 0);
>>                 }
>>                 ZFS_EXIT(zfsvfs);
>>                 error = vn_lock(*vpp, cnp->cn_lkflags);
>>                 if (cnp->cn_flags & ISDOTDOT)
>>                         vn_lock(dvp, ltype | LK_RETRY);
>>
>> ltype is apparently LK_SHARED, and the assertion is apparently triggered
>> by an EDEADLK error; the values in the panic message are consistent with
>> that (0x200400 is LK_SHARED | LK_RETRY, and errno 11 is EDEADLK).
>> EDEADLK can occur here only if a thread tries to obtain a lock in shared
>> mode while it already holds that lock exclusively.
>> My only explanation of how this could happen is that dvp == *vpp and
>> cn_lkflags is LK_EXCLUSIVE.  In other words, this is a dot-dot lookup
>> that resolves to the same vnode: vn_lock(*vpp, LK_EXCLUSIVE) takes the
>> lock exclusively, and the subsequent vn_lock(dvp, ltype | LK_RETRY) then
>> fails with EDEADLK.  I think this is only possible if dvp is the root
>> vnode.  I am not sure my theory is correct, though.
>> I am also not sure whether zfs_lookup() should be prepared to handle
>> such a lookup, or whether this kind of lookup should be handled by
>> upper/other layers; in this case those would be the VFS lookup code and
>> nullfs.
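
For reference, the assertion that fires is the KASSERT in _vn_lock()
(sys/kern/vfs_vnops.c), and the numbers in the panic message line up
with the theory above.  Assuming the dvp == *vpp explanation is right,
a minimal sketch of a guard in zfs_lookup() could look like the
following.  This is only a sketch of the idea, not a tested fix: it
leaves the vnode locked exclusively where the caller asked for ltype
(shared), which the callers would have to tolerate.

                if (cnp->cn_flags & ISDOTDOT) {
                        ltype = VOP_ISLOCKED(dvp);
                        VOP_UNLOCK(dvp, 0);
                }
                ZFS_EXIT(zfsvfs);
                error = vn_lock(*vpp, cnp->cn_lkflags);
                if (cnp->cn_flags & ISDOTDOT) {
                        /*
                         * If ".." resolved to dvp itself (a dot-dot
                         * lookup on the root vnode), the vn_lock()
                         * above already locked the vnode with
                         * cn_lkflags.  Re-locking dvp with ltype
                         * while this thread now holds the lock
                         * exclusively is exactly the EDEADLK that
                         * trips the LK_RETRY assertion, so skip the
                         * relock in that case.
                         */
                        if (*vpp != dvp)
                                vn_lock(dvp, ltype | LK_RETRY);
                }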


-- 
Andriy Gapon