Re: panic: LK_RETRY set with incompatible flags

From: Sergey Kandaurov <pluknet_at_gmail.com>
Date: Mon, 4 Feb 2013 15:11:49 +0300

On 4 February 2013 05:50, Rick Macklem <rmacklem_at_uoguelph.ca> wrote:
> Konstantin Belousov wrote:
>> On Sat, Feb 02, 2013 at 09:30:39PM -0500, Rick Macklem wrote:
>> > Andriy Gapon wrote:
>> > > on 31/01/2013 15:29 Sergey Kandaurov said the following:
>> > > > Hi.
>> > > >
>> > > > Got this assertion on idle NFS server while `ls -la
>> > > > /.zfs/shares/'
>> > > > issued on NFS client.
>> > > > kern/vfs_vnops.c:_vn_lock()
>> > > >                 KASSERT((flags & LK_RETRY) == 0 || error == 0,
>> > > >                     ("LK_RETRY set with incompatible flags
>> > > >                     (0x%x) or
>> > > > an error occured (%d)",
>> > > >
>> > > > panic: LK_RETRY set with incompatible flags (0x200400) or an error
>> > > > occured (11)
>> > > >
>> > > > What does that mean and how is it possible? As you can see, both parts
>> > > > of the assertion failed.
>> > > > 11 is EDEADLK
>> > > > 0x200400: LK_RETRY & LK_UPGRADE
>> > >
>> > > LK_SHARED, not LK_UPGRADE.
>> > > Apparently the thread already holds an exclusive lock on the vnode,
>> > > which you
>> > > confirm below.
>> > >
>> > >
>> > > > Tracing pid 2943 tid 101532 td 0xfffffe004f5b7000
>> > > > kdb_enter() at kdb_enter+0x3e/frame 0xffffff848e45ef50
>> > > > vpanic() at vpanic+0x147/frame 0xffffff848e45ef90
>> > > > kassert_panic() at kassert_panic+0x136/frame 0xffffff848e45f000
>> > > > _vn_lock() at _vn_lock+0x70/frame 0xffffff848e45f070
>> > > > zfs_lookup() at zfs_lookup+0x392/frame 0xffffff848e45f100
>> > > > zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6d/frame
>> > > > 0xffffff848e45f240
>> > > > VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xc2/frame
>> > > > 0xffffff848e45f260
>> > > > vfs_cache_lookup() at vfs_cache_lookup+0xcf/frame
>> > > > 0xffffff848e45f2b0
>> > > > VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xc2/frame 0xffffff848e45f2d0
>> > > > lookup() at lookup+0x548/frame 0xffffff848e45f350
>> > > > nfsvno_namei() at nfsvno_namei+0x1a5/frame 0xffffff848e45f400
>> > > > nfsrvd_lookup() at nfsrvd_lookup+0x13a/frame 0xffffff848e45f6b0
>> > > > nfsrvd_dorpc() at nfsrvd_dorpc+0xca5/frame 0xffffff848e45f8a0
>> > > > nfssvc_program() at nfssvc_program+0x482/frame
>> > > > 0xffffff848e45fa00
>> > > > svc_run_internal() at svc_run_internal+0x1e9/frame
>> > > > 0xffffff848e45fba0
>> > > > svc_thread_start() at svc_thread_start+0xb/frame
>> > > > 0xffffff848e45fbb0
>> > > > fork_exit() at fork_exit+0x84/frame 0xffffff848e45fbf0
>> > > > fork_trampoline() at fork_trampoline+0xe/frame
>> > > > 0xffffff848e45fbf0
>> > > > --- trap 0xc, rip = 0x800883e9a, rsp = 0x7fffffffd488, rbp =
>> > > > 0x7fffffffd730 ---
>> > > >
>> > > > db> show lockedvnods
>> > > > Locked vnodes
>> > > >
>> > > > 0xfffffe02e21b11d8: tag zfs, type VDIR
>> > > >     usecount 4, writecount 0, refcount 4 mountedhere 0
>> > > >     flags (VI_ACTIVE)
>> > > >     v_object 0xfffffe02d9f2eb40 ref 0 pages 0
>> > > >     lock type zfs: EXCL by thread 0xfffffe004f5b7000 (pid 2943,
>> > > >     nfsd,
>> > > > tid 101532)
>> > > >
>> > > >
>> > > >
>> > I just took a look at zfs_vnops.c and I am probably missing something,
>> > but I can't see how this ever worked for a lookup of ".." when at the
>> > root (unless ZFS doesn't treat ".." as the current directory when at
>> > the root).
>> >
>> > Here's the code snippet:
>> > 1442 if (error == 0 && (nm[0] != '.' || nm[1] != '\0')) {
>> > 1443 int ltype = 0;
>> > 1444
>> > 1445 if (cnp->cn_flags & ISDOTDOT) {
>> > 1446       ltype = VOP_ISLOCKED(dvp);
>> > 1447       VOP_UNLOCK(dvp, 0);
>> > 1448 }
>> > 1449 ZFS_EXIT(zfsvfs);
>> > 1450 error = zfs_vnode_lock(*vpp, cnp->cn_lkflags);
>> > 1451 if (cnp->cn_flags & ISDOTDOT)
>> > 1452       vn_lock(dvp, ltype | LK_RETRY);
>> > 1453 if (error != 0) {
>> > 1454       VN_RELE(*vpp);
>> > 1455       *vpp = NULL;
>> > 1456       return (error);
>> > 1457 }
>> >
>> > Maybe line# 1451 should be changed to:
>> >         if ((cnp->cn_flags & ISDOTDOT) && *vpp != dvp)
>> >
>> > I'm not at all familiar with ZFS, so I'm probably way
>> > off the mark on this, rick
>> > ps: I hope kib and jhb don't mind being added as cc's, since
>> >     they are familiar with this stuff, although maybe not ZFS
>> >     specifics.
>>
>> VFS (should) never call VOP_LOOKUP for the dotdot and root vnode.
>> The logic in the lookup() prevents it.
>>
> Correcting my previous posts, as usual. If you look at the above snippet of
> code, it seems that zfs_vnode_lock() must be locking the same vnode as
> *dvp. Notice that vn_lock() is only called when ISDOTDOT is set and the
> code does a VOP_UNLOCK(dvp, 0); for this case, just before the
> zfs_vnode_lock().
>
> This assumes that the vn_lock() call at #1452 causes the panic.
> This is the only vn_lock() call in zfs_lookup(), so unless the compiler
> has done something weird, that seems to be the case.
>
> Sergey, do you have this kernel handy? If so, maybe you could check the
> line# for zfs_lookup+0x392. (If you haven't done this before, just email
> and I'll give you the steps. You just need the kernel.symbols file for
> the kernel that panic'd.)

Yep, kgdb returned zfs_vnops.c:1453.
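
Since 1453 is the statement right after the vn_lock() at 1452, that fits
Rick's reading. For anyone following along, here is a userland toy of the
ISDOTDOT path in the quoted snippet. It is only a sketch, not kernel code:
toy_vnode, toy_lock(), toy_unlock() and toy_lookup_dotdot() are made-up
names, the model is single-threaded, and it assumes (as the lockedvnods
output suggests) that the *vpp lock is taken exclusive. The one behaviour
borrowed from this thread is that a lock request on a vnode the same thread
already holds exclusively fails with EDEADLK (11).

```c
#define TOY_EDEADLK 11  /* error value quoted in the panic message */

enum toy_state { TOY_UNLOCKED, TOY_SHARED, TOY_EXCL };

struct toy_vnode {
	enum toy_state state;
	int owner;              /* tid of the exclusive holder, -1 if none */
};

/* Take the lock; refuse if this thread already holds it exclusively. */
static int
toy_lock(struct toy_vnode *vp, enum toy_state want, int tid)
{
	if (vp->state == TOY_EXCL && vp->owner == tid)
		return (TOY_EDEADLK);
	vp->state = want;
	vp->owner = (want == TOY_EXCL) ? tid : -1;
	return (0);
}

static void
toy_unlock(struct toy_vnode *vp)
{
	vp->state = TOY_UNLOCKED;
	vp->owner = -1;
}

/*
 * Mirrors lines 1445-1452 of the quoted snippet for the ISDOTDOT case.
 * A non-zero guard applies the suggested "*vpp != dvp" check.
 * Returns the result of relocking dvp (0 if the relock is skipped).
 */
static int
toy_lookup_dotdot(struct toy_vnode *dvp, struct toy_vnode *vp, int tid,
    int guard)
{
	enum toy_state ltype = dvp->state;      /* ltype = VOP_ISLOCKED(dvp) */
	int error;

	toy_unlock(dvp);                        /* VOP_UNLOCK(dvp, 0) */
	error = toy_lock(vp, TOY_EXCL, tid);    /* zfs_vnode_lock(*vpp, ...) */
	if (error != 0)
		return (error);
	if (guard && vp == dvp)
		return (0);                     /* skip the relock */
	return (toy_lock(dvp, ltype, tid));     /* vn_lock(dvp, ltype | LK_RETRY) */
}
```

With a shared-locked dvp passed as both arguments and guard set to 0, the
toy returns 11, which is the combination the KASSERT complains about. With
guard set to 1, or with two distinct vnodes, it returns 0. Note that in the
guarded case the vnode is left exclusively locked rather than shared, so
this says nothing about whether the guard is the right fix.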

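On the flags: a small decoder makes Andriy's correction (LK_SHARED, not
LK_UPGRADE) easy to check. The sketch below is standalone userland C. The
three constants are the values as I recall them from sys/lockmgr.h, so
please verify them against your own tree, and lk_decode() is a made-up
helper, not anything in the kernel.

```c
#include <string.h>

/* Assumed values, recalled from sys/lockmgr.h; verify against your tree. */
#define LK_RETRY        0x000400
#define LK_SHARED       0x200000
#define LK_UPGRADE      0x400000

struct flag_name {
	unsigned int bit;
	const char *name;
};

static const struct flag_name lk_names[] = {
	{ LK_RETRY,   "LK_RETRY" },
	{ LK_SHARED,  "LK_SHARED" },
	{ LK_UPGRADE, "LK_UPGRADE" },
};

/*
 * Write the names of the known bits set in flags into buf, joined by '|'.
 * buf must hold at least one byte. Returns the bits that were not
 * recognised, so 0 means every set bit was named.
 */
static unsigned int
lk_decode(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(lk_names) / sizeof(lk_names[0]); i++) {
		if ((flags & lk_names[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, lk_names[i].name, len - strlen(buf) - 1);
		flags &= ~lk_names[i].bit;
	}
	return (flags);
}
```

Calling lk_decode(0x200400, buf, sizeof(buf)) with a 64-byte buf leaves
"LK_RETRY|LK_SHARED" in buf and returns 0, i.e. no leftover bits and no
LK_UPGRADE.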

-- 
wbr,
pluknet
Received on Mon Feb 04 2013 - 11:11:51 UTC
