Re: lock order reversals w/ backtrace

From: Thomas Hoffmann <trh411_at_gmail.com> Date: Sat, 25 Jan 2014 09:54:01 -0500 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:46 UTC

On Fri, Jan 24, 2014 at 11:47 PM, Thomas Hoffmann <trh411_at_gmail.com> wrote:

> On Fri, Jan 24, 2014 at 9:52 AM, Thomas Hoffmann <trh411_at_gmail.com> wrote:
>
>> On Thu, Jan 23, 2014 at 11:49 AM, Hans Petter Selasky <
>> hans.petter.selasky_at_bitfrost.no> wrote:
>>
>>>  Hi,
>>>
>>> Can you see if you can snap some keywords of the backtraces, like usb_xxx usbdx_xxx cam scsi or something like that.
>>>
>>> Else I believe there are some sysctl options to prevent the final reboot somehow so that you can write down the messages.
>>>
>>>
>>> --HPS
>>>
>>>
>>> -----Original message-----
>>> > From:Thomas Hoffmann <trh411_at_gmail.com>
>>> > Sent: Thursday 23rd January 2014 17:15
>>>
>>> > To: freebsd-current <freebsd-current_at_freebsd.org>
>>> > Subject: lock order reversals w/ backtrace
>>> >
>>> > A few days ago I started running 11.0-CURRENT at r260971 for the first time.
>>>
>>>
>>> >
>>> > The last couple of times I shutdown my system I noticed 2 or 3 short "lock
>>> > order reversal" messages with accompanying backtraces scroll by. Do these
>>> > messages represent a problem that I should report or can I ignore them as
>>>
>>>
>>> > debug in nature? If I should report them, how or where do these messages
>>> > get logged? I can find no reference to them in syslog or anywhere else upon
>>> > my subsequent reboot.
>>> >
>>> > I also had a couple of these messages pop up the other day while
>>>
>>>
>>> > mounting/umounting USB thumb drives. I did not think anything of them at
>>> > the time as the mounts/umounts completed successfully.
>>> >
>>> > Please advise. Thanks.
>>> >
>>> > -Tom
>>>
>>>  I managed to snap a photo of my screen during shutdown.
>> Here is the full text of the first of two lock order reversals I got last
>> night during system shutdown, both of which are zfs-related to (it
>> appears?) unmounts. A few lines got chopped as they were out-of-frame when
>> I took the photo:
>>
>> shutting down local daemons:
>> lock order reversal:
>>   1st 0xfffff801194f67c8 zfs (zfs) _at_ /usr/src/sys/kern/vfs_mount.c:1237
>>   2nd 0xfffff801194f6420 syncer (syncer) _at_
>> /usr/src/sys/kern/vfs_subr.c:22[..chopped...]
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_delf_wrapper+0x2b/frame
>> 0xfffffe01ac78[...chopped...]
>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe01ac784650
>> witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe01ac7846e0
>> __lockmgr_args() __lockmgr_args+0x878/frame 0xfffffe01ac784810
>> vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe01ac784830
>> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xf5/frame 0xfffffe01ac784860
>> _vn_lock() at _vn_lock+0xab/frame 0xfffffe01ac7848d0
>> vputx() at vputx+0x240/frame 0xfffffe01ac784930
>> dounmount at dounmount+0x327/frame 0xfffffe01ac7849b0
>> sys_unmount() at sys_unmount+0x356/frame 0xfffffe01ac784ac0
>> amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe01ac784bf0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe01ac784bf0
>> --- syscall (22, FreeBSD ELF64, sys_unmount, rip = 0x80191c72a,
>> rsp[...chopped...]
>>
>> I have a zpool on an external USB HDD that I export at shutdown via
>> rc.shutdown.local. Don't know if that is contributing to this or not. When
>> I get a chance to bring down the system I will manually export it before
>> shutdown to see if that makes any difference.
>>
>
> Was able to capture the full text of the lock order reversal I've been
> experiencing at shutdown. Turns out to be associated with the export of one
> of my zpools that is hosted on an external USB HDD. Normally I export the
> zpool from rc.shutdown.local, but tonight I exported it manually before I
> shutdown and got the following.
>
> lock order reversal:
> : 1st 0xfffff8011952b5f0 zfs (zfs) _at_ /usr/src/sys/kern/vfs_mount.c:1237
> : 2nd 0xfffff8011952b068 syncer (syncer) _at_
> /usr/src/sys/kern/vfs_subr.c:2212
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfffffe01add6e5a0
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe01add6e650
> witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe01add6e6e0
> __lockmgr_args() at __lockmgr_args+0x878/frame 0xfffffe01add6e810
> vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe01add6e830
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xf5/frame 0xfffffe01add6e860
> _vn_lock() at _vn_lock+0xab/frame 0xfffffe01add6e8d0
> vputx() at vputx+0x240/frame 0xfffffe01add6e930
> dounmount() at dounmount+0x327/frame 0xfffffe01add6e9b0
> sys_unmount() at sys_unmount+0x356/frame 0xfffffe01add6eae0
> amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe01add6ebf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe01add6ebf0
> --- syscall (22, FreeBSD ELF64, sys_unmount), rip = 0x80191c72a, rsp =
> 0x7fffffffc4c8, rbp = 0x7fffffffc960 ---
>
> When imported and mounted, the zpool reports no errors and I have not
> experienced any anomalies reading or writing to the zpool. Should I be
> concerned?
>
> Hmm, seems like I'm the only one interested in this.

I think I have resolved the problem with the zpool on the external USB HDD.
I copied off the data, destroyed the datasets and zpool, recreated the
zpool and datasets and restored the data. I have executed several exports
on the zpool and have received no "lock order reversal" messages.

I am still getting lock order reversal messages for my bootpool and zroot
zpools at shutdown. Recreating bootpool would be relatively easy, but if
I'm going to recreate zroot, I might as well do a reinistall, am I am not
prepared to do at this time. So I'm still in search of an alternative way
to resolve this issue. My zpools all show "no known data errors", scrub
cleanly, zpool upgrade reports that all zpools support all features, and
I've not experienced any other issues related to my zpools.

Based on the above lock order reversal data, the error is in this block of
code (, specifically /usr/src/sys/kern/vfs_mount.c:1237)

 if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
                mnt_gen_r = mp->mnt_gen;
                VI_LOCK(coveredvp);
                vholdl(coveredvp);
                vn_lock(coveredvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_RETRY);
   /* LINE 1237 */
                vdrop(coveredvp);
                /*
                 * Check for mp being unmounted while waiting for the
                 * covered vnode lock.
                 */
                if (coveredvp->v_mountedhere != mp ||
                    coveredvp->v_mountedhere->mnt_gen != mnt_gen_r) {
                        VOP_UNLOCK(coveredvp, 0);
                        return (EBUSY);
                }
}

-Tom