Re: Is a high witness refcount indicative of a missing unlock?

From: Lars <lars_at_odin-corporation.com>
Date: Tue, 31 Mar 2015 16:17:19 -0500
Hi Shane,
While our configs look much the same (ignoring 10.1-STABLE vs. CURRENT) and I had the nvidia driver loaded, my lockups did not involve the nvidia driver. Of course, that does not necessarily mean anything if the issue lies squarely in zfs somewhere.

You can see the reference counts from the ddb kernel debugger (man 4 ddb) using the “show witness” command.
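
For the reversal output you quoted below, a short script can tally which lock pairs witness complained about, which makes it easier to compare reports between machines. This is only a rough sketch: the function name and the regular expression are my own, and it assumes syslog-style lines like yours, with the "1st"/"2nd" lines appearing in pairs. It accepts both "@" and the "_at_" spelling that shows up in archived copies.

```python
import re
from collections import Counter

# Matches the "1st"/"2nd" lines of a lock order reversal report, e.g.
#   "... kernel: 1st 0xfffff800224555f0 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1229"
# Assumption: archived copies may show "_at_" instead of "@", so accept both.
LOR_LINE = re.compile(
    r"\b(1st|2nd)\s+0x[0-9a-fA-F]+\s+(.+?)\s+\((.+?)\)\s+(?:@|_at_)\s+(\S+):(\d+)"
)


def tally_reversals(lines):
    """Count (first lock name, second lock name) pairs across reversal reports."""
    pairs = Counter()
    first = None
    for line in lines:
        match = LOR_LINE.search(line)
        if not match:
            continue  # backtrace frames and other kernel messages are skipped
        order, name = match.group(1), match.group(2)
        if order == "1st":
            first = name
        elif first is not None:
            pairs[(first, name)] += 1
            first = None  # wait for the next "1st" line
    return pairs
```

Calling tally_reversals(open("/var/log/messages")) (the path is an assumption about where your syslog lands) returns a Counter keyed by lock-name pairs. It only summarizes reversal reports and says nothing about the reference counts that “show witness” prints.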

Lars
> On Mar 29, 2015, at 19:14, Shane Ambler <FreeBSD_at_ShaneWare.Biz> wrote:
> 
> On 30/03/2015 05:59, Lars wrote:
>> Hi, I am poking around for a cause for my repeating deadlock issues
>> on my system based on r279869. ddb's show witness shows the “vnode
>> interlock” and the “zfs” locks both with reference counts over
>> 200K. Obviously they are related, and there is a find running (all
>> the filesystems on this machine are zfs, minus the specialty ones
>> like devfs).
>> 
>> I don’t see any other witness entry with reference counts even in
>> the ballpark of these two, so does this indicate that we have a
>> vnode/zfs path where we don’t unlock?
>> 
> 
> I am running 10.1-STABLE and have bad locking issues. Running a witness
> kernel I got a duplicate lock from nvidia and lock order reversals
> involving zfs. Any chance your issue is related to mine?
> 
> What command can give me the witness lock counts?
> 
> The debug data I have collected so far is at -
> http://shaneware.biz/freebsddebugdata/
> 
> The lock reversal output I had was (after uptime of about 12 mins) -
> 
> Mar 24 00:24:25 leader kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
> Mar 24 00:24:25 leader kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
> Mar 24 00:24:25 leader kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
> Mar 24 00:24:25 leader kernel: Syncing disks, vnodes remaining...0 0 0 0 0 0 0 0 done
> Mar 24 00:24:25 leader kernel: All buffers synced.
> Mar 24 00:24:25 leader kernel: lock order reversal:
> Mar 24 00:24:25 leader kernel: 1st 0xfffff800224555f0 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1229
> Mar 24 00:24:25 leader kernel: 2nd 0xfffff800222d67c8 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:2268
> Mar 24 00:24:25 leader kernel: KDB: stack backtrace:
> Mar 24 00:24:25 leader kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe022df6e4c0
> Mar 24 00:24:25 leader kernel: kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe022df6e570
> Mar 24 00:24:25 leader kernel: witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe022df6e600
> Mar 24 00:24:25 leader kernel: __lockmgr_args() at __lockmgr_args+0x9ea/frame 0xfffffe022df6e740
> Mar 24 00:24:25 leader kernel: vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe022df6e760
> Mar 24 00:24:25 leader kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfc/frame 0xfffffe022df6e790
> Mar 24 00:24:25 leader kernel: _vn_lock() at _vn_lock+0xaa/frame 0xfffffe022df6e800
> Mar 24 00:24:25 leader kernel: vputx() at vputx+0x232/frame 0xfffffe022df6e860
> Mar 24 00:24:25 leader kernel: dounmount() at dounmount+0x301/frame 0xfffffe022df6e8e0
> Mar 24 00:24:25 leader kernel: vfs_unmountall() at vfs_unmountall+0x61/frame 0xfffffe022df6e910
> Mar 24 00:24:25 leader kernel: kern_reboot() at kern_reboot+0x540/frame 0xfffffe022df6e980
> Mar 24 00:24:25 leader kernel: sys_reboot() at sys_reboot+0x5a/frame 0xfffffe022df6e9a0
> Mar 24 00:24:25 leader kernel: amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe022df6eab0
> Mar 24 00:24:25 leader kernel: Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe022df6eab0
> Mar 24 00:24:25 leader kernel: --- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40f1bc, rsp = 0x7fffffffe6d8, rbp = 0x7fffffffe7d0 ---
> Mar 24 00:24:25 leader kernel: lock order reversal:
> Mar 24 00:24:25 leader kernel: 1st 0xfffff800222d6b78 zfs (zfs) @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1814
> Mar 24 00:24:25 leader kernel: 2nd 0xffffffff818514a8 allproc (allproc) @ /usr/src/sys/kern/kern_descrip.c:2872
> Mar 24 00:24:25 leader kernel: KDB: stack backtrace:
> Mar 24 00:24:25 leader kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe022df6e690
> Mar 24 00:24:25 leader kernel: kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe022df6e740
> Mar 24 00:24:25 leader kernel: witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe022df6e7d0
> Mar 24 00:24:25 leader kernel: _sx_slock() at _sx_slock+0x76/frame 0xfffffe022df6e810
> Mar 24 00:24:25 leader kernel: mountcheckdirs() at mountcheckdirs+0x47/frame 0xfffffe022df6e860
> Mar 24 00:24:25 leader kernel: dounmount() at dounmount+0x36f/frame 0xfffffe022df6e8e0
> Mar 24 00:24:25 leader kernel: vfs_unmountall() at vfs_unmountall+0x61/frame 0xfffffe022df6e910
> Mar 24 00:24:25 leader kernel: kern_reboot() at kern_reboot+0x540/frame 0xfffffe022df6e980
> Mar 24 00:24:25 leader kernel: sys_reboot() at sys_reboot+0x5a/frame 0xfffffe022df6e9a0
> Mar 24 00:24:25 leader kernel: amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe022df6eab0
> Mar 24 00:24:25 leader kernel: Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe022df6eab0
> Mar 24 00:24:25 leader kernel: --- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40f1bc, rsp = 0x7fffffffe6d8, rbp = 0x7fffffffe7d0 ---
> Mar 24 00:24:25 leader kernel: lock order reversal:
> Mar 24 00:24:25 leader kernel: 1st 0xfffff8001ca8e240 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1229
> Mar 24 00:24:25 leader kernel: 2nd 0xfffff8001ca8e5f0 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2157
> Mar 24 00:24:25 leader kernel: KDB: stack backtrace:
> Mar 24 00:24:25 leader kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe022df6e460
> Mar 24 00:24:25 leader kernel: kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe022df6e510
> Mar 24 00:24:25 leader kernel: witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe022df6e5a0
> Mar 24 00:24:25 leader kernel: __lockmgr_args() at __lockmgr_args+0x9ea/frame 0xfffffe022df6e6e0
> Mar 24 00:24:25 leader kernel: vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe022df6e700
> Mar 24 00:24:25 leader kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfc/frame 0xfffffe022df6e730
> Mar 24 00:24:25 leader kernel: _vn_lock() at _vn_lock+0xaa/frame 0xfffffe022df6e7a0
> Mar 24 00:24:25 leader kernel: vget() at vget+0x67/frame 0xfffffe022df6e7e0
> Mar 24 00:24:25 leader kernel: devfs_allocv() at devfs_allocv+0xfd/frame 0xfffffe022df6e830
> Mar 24 00:24:25 leader kernel: devfs_root() at devfs_root+0x43/frame 0xfffffe022df6e860
> Mar 24 00:24:25 leader kernel: dounmount() at dounmount+0x345/frame 0xfffffe022df6e8e0
> Mar 24 00:24:25 leader kernel: vfs_unmountall() at vfs_unmountall+0x61/frame 0xfffffe022df6e910
> Mar 24 00:24:25 leader kernel: kern_reboot() at kern_reboot+0x540/frame 0xfffffe022df6e980
> Mar 24 00:24:25 leader kernel: sys_reboot() at sys_reboot+0x5a/frame 0xfffffe022df6e9a0
> Mar 24 00:24:25 leader kernel: amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe022df6eab0
> Mar 24 00:24:25 leader kernel: Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe022df6eab0
> Mar 24 00:24:25 leader kernel: --- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40f1bc, rsp = 0x7fffffffe6d8, rbp = 0x7fffffffe7d0 ---
> Mar 24 00:24:25 leader kernel: Uptime: 12m42s
> 
> 
> -- 
> FreeBSD - the place to B...Software Developing
> 
> Shane Ambler
> 
Received on Tue Mar 31 2015 - 19:17:35 UTC