Re: -current lockup (how to diagnose?)

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Tue, 2 Dec 2003 01:45:43 -0500 (EST)

On Tue, 2 Dec 2003, Jun Kuriyama wrote:

> At Mon, 1 Dec 2003 09:23:21 -0500 (EST),
> Robert Watson wrote:
> > This could be a sign of a VM or VFS lock leak or deadlock.  I'd advise
> > hooking up a serial console, dropping to DDB over serial line, and posting
> > the results of "ps" and "show lockedvnods".  We might then ask you to use
> > the "show locks" command on various processes.  You'll need to have DDB
> > and WITNESS compiled in.
> 
> I got it.
> 
> http://www.imgsrc.co.jp/~kuriyama/BSD/lock-20031202.log

"ouch"

Could you try compiling DEBUG_LOCKS into your kernel and running "show
lockedvnods" again?  Unfortunately, someone removed the pid from the
output of that command without adding the thread pointer to the DDB ps
output, so you'll probably need to modify the lockmgr_printinfo() function
in vfs_subr.c to also print lkp->lk_lockholder->td_proc->p_pid for
exclusive locks.  It looks like something may not be releasing a vnode
lock before returning to userspace.  I have some patches that assert that
no lockmgr locks are held on return to userspace, but I'll have to dig
them up tomorrow and send them to you.  Basically, they add a per-thread
lockmgr lock count in a thread-local variable, incrementing it for each
lock and decrementing it for each release, and then KASSERT() in userret
that the variable is 0.
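
To make the counting idea concrete, here is a rough userspace sketch of it.
It is not the actual patch: struct thread_sketch, struct lock_sketch, the
td_lockmgr_count field and the sketch_*() functions are all hypothetical
stand-ins, and the real kernel would use KASSERT() in userret rather than
returning a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-thread structure. */
struct thread_sketch {
	int td_lockmgr_count;	/* assumed field: lockmgr locks currently held */
};

/* Hypothetical stand-in for a lockmgr lock; tracks only the exclusive holder. */
struct lock_sketch {
	struct thread_sketch *holder;	/* NULL when the lock is free */
};

/* Acquire: record the holder and bump the per-thread count. */
static void
sketch_lock(struct lock_sketch *lk, struct thread_sketch *td)
{
	assert(lk->holder == NULL);
	lk->holder = td;
	td->td_lockmgr_count++;
}

/* Release: clear the holder and drop the per-thread count. */
static void
sketch_unlock(struct lock_sketch *lk, struct thread_sketch *td)
{
	assert(lk->holder == td);
	lk->holder = NULL;
	td->td_lockmgr_count--;
}

/*
 * Check made on the way back to userspace.  Returns nonzero when the
 * thread holds no locks; a kernel version would assert this instead.
 */
static int
sketch_userret_ok(const struct thread_sketch *td)
{
	return (td->td_lockmgr_count == 0);
}
```

A thread that takes a lock and heads back to userspace without releasing it
would fail the sketch_userret_ok() check, which is the kind of leak the
assertion is meant to catch at the point where it happens.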

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Senior Research Scientist, McAfee Research
Received on Mon Dec 01 2003 - 21:48:27 UTC
