Re: panic: System call lstat returning with 1 locks held

From: Attilio Rao <attilio_at_freebsd.org>
Date: Thu, 31 Jan 2008 11:43:10 +0100
2008/1/30, Scot Hetzel <swhetzel_at_gmail.com>:
> On 1/30/08, Attilio Rao <attilio_at_freebsd.org> wrote:
>  > 2008/1/30, Yar Tikhiy <yar_at_comp.chem.msu.su>:
>  > > On Tue, Jan 29, 2008 at 11:11:13PM +0100, Attilio Rao wrote:
>  > > >
>  > > > I'm committing my WITNESS patch now to perforce so that other people
>  > > > can hopefully stress-test it before it is committed.
>  > >
>  > > Do you think that that patch is applicable in my case?  I.e., shall
>  > > I use it to get more debug info on my panics?
>  > >
>  > > If so, where is the patched file in the depot?
>  >
>  > Sorry but I had to delay the operation so far.
>  > In the end, a suitable patch is located here:
>  > http://www.freebsd.org/~attilio/witness_lockmgr.diff
>  >
>  > I tried it and it already reported 4 LORs just when booting the kernel :)
>  > So I would reasonably expect LOR cascades with this patch.
>  >
>  > If all three of you (Scot, Yar and Doug) could try and test it, I would
>  > appreciate it a lot.
>  >
>
> Reading back to Doug's and Yar's messages regarding the NTFS
>  filesystem, I noticed that I am also mounting NTFS filesystems at boot
>  time.  I disabled the mounting of the NTFS filesystems.  When 'cd
>  /usr/ports ; find . -print' or '/usr/local/etc/cvsup/update.sh' is
>  run, the panic doesn't occur.
>
>  But when I mount the NTFS filesystem, and rerun the above commands,
>  they cause the lstat panic.  Even though these commands are not
>  touching the NTFS filesystems.
>
>  Also mounting/unmounting an NTFS filesystem will cause a panic.
>
>  I applied the above patch to sources that were checked out about 2 hrs
>  ago.  Rebuilt/installed kernel and rebooted.
>
>  If I don't mount an NTFS filesystem then the kernel doesn't panic when
>  the above commands are run.

So it seems NTFS is definitely busted.

>  But when the NTFS filesystem is mounted, the following lock order
>  reversal occurs:
>
>  lock order reversal:
>   1st 0xffffff0023285288 pseudofs (pseudofs) @ kern/vfs_subr.c:2061
>   2nd 0xffffff00232f2ca0 vfslock (vfslock) @ kern/vfs_subr.c:364
>  KDB: stack backtrace:
>  db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>  witness_checkorder() at witness_checkorder+0x606
>  _lockmgr() at _lockmgr+0x4cb
>  vfs_busy() at vfs_busy+0xdf
>  vfs_donmount() at vfs_donmount+0x9aa
>  nmount() at nmount+0xa4
>  syscall() at syscall+0x1ce
>  Xfast_syscall() at Xfast_syscall+0xab
>  --- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65a9d0 ---
>  lock order reversal:
>   1st 0xffffff002347f668 ntfs (ntfs) @ kern/vfs_subr.c:2061
>   2nd 0xffffff00232f2650 vfslock (vfslock) @ kern/vfs_subr.c:364
>  KDB: stack backtrace:
>  db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>  witness_checkorder() at witness_checkorder+0x606
>  _lockmgr() at _lockmgr+0x4cb
>  vfs_busy() at vfs_busy+0xdf
>  vfs_donmount() at vfs_donmount+0x9aa
>  nmount() at nmount+0xa4
>  syscall() at syscall+0x1ce
>  Xfast_syscall() at Xfast_syscall+0xab
>  --- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65ad80 ---
>
>  Instead of getting the lstat panic, I am now getting the following
>  panic when /usr/local/etc/cvsup/update.sh ran:
>
>  Fatal trap 9: general protection fault while in kernel mode
>  cpuid = 0; apic id = 00
>  instruction pointer     = 0x8:0xffffffff80301051
>  stack pointer           = 0x10:0xffffffffd6bb0100
>  frame pointer           = 0x10:0xffffffffd6bb0190
>  code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>  processor eflags        = resume, IOPL = 0
>  current process         = 1243 (cvsup)
>  panic: Assertion !mtx_owned(&w_mtx) failed at ../../../kern/subr_witness.c:959
>  cpuid = 0
>  Uptime: 11m14s
>  Physical memory: 2031 MB
>  Dumping 325 MB: 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6

That assertion failure should not be happening.
Could you please hand-add a check for panicstr in _lockmgr_disown()
(kern/kern_lock.c), so that WITNESS is not called once a panic is in
progress? I cannot access perforce right now to produce a proper diff,
so you can just make this change by hand:

if (lkp->lk_lockholder == td) {
        /* Skip WITNESS once a panic is in progress. */
        if (panicstr == NULL)
                WITNESS_UNLOCK(&lkp->lk_object, LOP_EXCLUSIVE, file, line);
        td->td_locks--;
}
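
To make the intent of the guard concrete, here is a minimal userland sketch, not kernel code. It assumes the goal is to skip the WITNESS bookkeeping once a panic has started, while still dropping the per-thread lock count. All names ending in _sketch, and the witness_unlock_calls counter, are hypothetical stand-ins for the real kernel structures and macros.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-ins; none of these are the real kernel definitions. */
struct thread_sketch {
	int td_locks;				/* number of locks held */
};

struct lock_sketch {
	struct thread_sketch *lk_lockholder;	/* current owner */
};

/* Non-NULL once a panic has started, mirroring the kernel's panicstr. */
static const char *panicstr_sketch = NULL;

/* Counts how often the WITNESS stand-in was invoked. */
static int witness_unlock_calls = 0;

/* Stand-in for WITNESS_UNLOCK(): only records that it was called. */
static void
witness_unlock_sketch(struct lock_sketch *lkp)
{
	(void)lkp;
	witness_unlock_calls++;
}

/*
 * Model of the guarded branch: the lock count is always dropped for
 * the owner, but WITNESS is consulted only when no panic is in progress.
 */
static void
lockmgr_disown_sketch(struct lock_sketch *lkp, struct thread_sketch *td)
{
	assert(lkp != NULL && td != NULL);
	if (lkp->lk_lockholder == td) {
		if (panicstr_sketch == NULL)
			witness_unlock_sketch(lkp);
		td->td_locks--;
	}
}
```

Calling lockmgr_disown_sketch() with panicstr_sketch left NULL bumps the counter and decrements td_locks; after setting panicstr_sketch to any string, only td_locks changes.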


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Thu Jan 31 2008 - 09:43:13 UTC
