Re: reproducible "panic: share->excl"

From: Kostik Belousov <kostikbel_at_gmail.com> Date: Tue, 22 Jul 2008 20:05:40 +0300

On Tue, Jul 22, 2008 at 06:54:04PM +0200, Attilio Rao wrote:
> 2008/7/22, Kostik Belousov <kostikbel_at_gmail.com>:
> > On Mon, Jul 21, 2008 at 05:03:14PM -0400, Andrew Gallatin wrote:
> >  > I can panic today's -current reliably (or hang it with
> >  > WITNESS/INVARIANTS disabled).  When it crashes, I see
> >  > the appended panic messages.
> >  >
> >  > It seems to be 100% reproducible on my box (AMD64 x2,
> >  > 512MB ram, UFS2).  If anybody savvy in this area would
> >  > like to reproduce it, I've left the program at ~gallatin/ahunt.c
> >  > on freefall.  Compile it, and run it as:
> >  > ./a.out -mmbfileinit -madvise=/var/tmp/zot  -random -size=95536
> >  > -touch=4096 -rewrite=2
> >  >
> >  >
> >  > Cheers,
> >  >
> >  > Drew
> >  >
> >  > PS:  Here is a serial console log from the panic:
> >
> > ...
> >
> >
> >  > login: shared lock of (lockmgr) ufs _at_ kern/vfs_subr.c:2044
> >  > while exclusively locked from kern/vfs_vnops.c:593
> >  > panic: share->excl
> >  > cpuid = 1
> >  > KDB: enter: panic
> >  > [thread pid 1702 tid 100149 ]
> >  > Stopped at      kdb_enter+0x3d: movq    $0,0x639958(%rip)
> >  > db> tr
> >  > Tracing pid 1702 tid 100149 td 0xffffff000d08f000
> >  > kdb_enter() at kdb_enter+0x3d
> >  > panic() at panic+0x176
> >  > witness_checkorder() at witness_checkorder+0x137
> >  > __lockmgr_args() at __lockmgr_args+0xc74
> >  > ffs_lock() at ffs_lock+0x8c
> >  > VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> >  > _vn_lock() at _vn_lock+0x47
> >  > vget() at vget+0x7b
> >  > vnode_pager_lock() at vnode_pager_lock+0x146
> >  > vm_fault() at vm_fault+0x1e2
> >  > trap_pfault() at trap_pfault+0x128
> >  > trap() at trap+0x395
> >  > calltrap() at calltrap+0x8
> >  > --- trap 0xc, rip = 0xffffffff8079f2bd, rsp = 0xfffffffe58c2f7b0, rbp =
> >  > 0xfffffffe58c2f830 ---
> >  > copyin() at copyin+0x3d
> >  > ffs_write() at ffs_write+0x2f8
> >  > VOP_WRITE_APV() at VOP_WRITE_APV+0x10b
> >  > vn_write() at vn_write+0x23f
> >  > dofilewrite() at dofilewrite+0x85
> >  > kern_writev() at kern_writev+0x60
> >  > write() at write+0x54
> >  > syscall() at syscall+0x1dd
> >  > Xfast_syscall() at Xfast_syscall+0xab
> >  > --- syscall (4, FreeBSD ELF64, write), rip = 0x8007296ec, rsp =
> >  > 0x7fffffffe158, rbp = 0x7fffffffe210 ---
> >  > db> show locks
> >  > exclusive sleep mutex vnode interlock r = 0 (0xffffff000d0dc0c0) locked
> >  > _at_ vm/vnode_pager.c:1199
> >  > exclusive sx user map r = 0 (0xffffff000d054360) locked _at_ vm/vm_map.c:3115
> >  > exclusive lockmgr bufwait r = 0 (0xfffffffe5047f278) locked _at_
> >  > kern/vfs_bio.c:1783
> >  > exclusive lockmgr ufs r = 0 (0xffffff000d0dc098) locked _at_
> >  > kern/vfs_vnops.c:593
> >  > db>
> >
> >
> > Essentially, you tried to write part of a region mmapped from the
> >  file back to the same file. VOP_WRITE() is called with the vnode
> >  exclusively locked, while the fault handler tries to lock the vnode
> >  in shared mode to page in.
> >
> >  The following change fixed it for me.
> >  Attilio, would it make sense to consider LK_CANRECURSE | LK_SHARED as
> >  a request for the exclusive lock when the current thread already holds the
> >  exclusive lock instead? I think this would be a proper solution.
> 
> I don't like this kind of magic and encoding in lockmgr.
> I think that the better thing to do here is to recurse the exclusive
> lock as you pass to vget().
It could be argued that lockmgr is black magic as a whole. On the other
hand, I had to use VOP_ISLOCKED() and manually construct the lock request
while all the needed information is at hand inside lockmgr. Moreover,
I believe that doing an implicit shared->exclusive request upgrade in
this situation (excl locked by curthread, LK_CANRECURSE present) is
right.

> 
> Also note that without WITNESS the code will return EDEADLK in this
> case, whereas traditionally the lock would have been silently
> downgraded; as you can expect, that is a very dangerous practice.
Fully agree.