Re: reproducible "panic: share->excl"

From: Kostik Belousov <kostikbel@gmail.com>
Date: Tue, 22 Jul 2008 18:48:25 +0300
On Mon, Jul 21, 2008 at 05:03:14PM -0400, Andrew Gallatin wrote:
> I can panic today's -current reliably (or hang it with
> WITNESS/INVARIANTS disabled).  When it crashes, I see
> the appended panic messages.
> 
> It seems to be 100% reproducible on my box (AMD64 x2,
> 512MB ram, UFS2).  If anybody savvy in this area would
> like to reproduce it, I've left the program at ~gallatin/ahunt.c
> on freefall.  Compile it, and run it as:
> ./a.out -mmbfileinit -madvise=/var/tmp/zot  -random -size=95536 
> -touch=4096 -rewrite=2
> 
> 
> Cheers,
> 
> Drew
> 
> PS:  Here is a serial console log from the panic:
...

> login: shared lock of (lockmgr) ufs @ kern/vfs_subr.c:2044 while exclusively locked from kern/vfs_vnops.c:593
> panic: share->excl
> cpuid = 1
> KDB: enter: panic
> [thread pid 1702 tid 100149 ]
> Stopped at      kdb_enter+0x3d: movq    $0,0x639958(%rip)
> db> tr
> Tracing pid 1702 tid 100149 td 0xffffff000d08f000
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x176
> witness_checkorder() at witness_checkorder+0x137
> __lockmgr_args() at __lockmgr_args+0xc74
> ffs_lock() at ffs_lock+0x8c
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> _vn_lock() at _vn_lock+0x47
> vget() at vget+0x7b
> vnode_pager_lock() at vnode_pager_lock+0x146
> vm_fault() at vm_fault+0x1e2
> trap_pfault() at trap_pfault+0x128
> trap() at trap+0x395
> calltrap() at calltrap+0x8
> --- trap 0xc, rip = 0xffffffff8079f2bd, rsp = 0xfffffffe58c2f7b0, rbp = 0xfffffffe58c2f830 ---
> copyin() at copyin+0x3d
> ffs_write() at ffs_write+0x2f8
> VOP_WRITE_APV() at VOP_WRITE_APV+0x10b
> vn_write() at vn_write+0x23f
> dofilewrite() at dofilewrite+0x85
> kern_writev() at kern_writev+0x60
> write() at write+0x54
> syscall() at syscall+0x1dd
> Xfast_syscall() at Xfast_syscall+0xab
> --- syscall (4, FreeBSD ELF64, write), rip = 0x8007296ec, rsp = 0x7fffffffe158, rbp = 0x7fffffffe210 ---
> db> show locks
> exclusive sleep mutex vnode interlock r = 0 (0xffffff000d0dc0c0) locked @ vm/vnode_pager.c:1199
> exclusive sx user map r = 0 (0xffffff000d054360) locked @ vm/vm_map.c:3115
> exclusive lockmgr bufwait r = 0 (0xfffffffe5047f278) locked @ kern/vfs_bio.c:1783
> exclusive lockmgr ufs r = 0 (0xffffff000d0dc098) locked @ kern/vfs_vnops.c:593
> db>

Essentially, you did a write() to the file from a buffer that lies
inside a region mmap()ed from the same file. VOP_WRITE() is called
with the vnode exclusively locked, and when copyin() faults on the
mmap()ed source buffer, the fault handler tries to lock the same
vnode in shared mode to page the data in, hence the share->excl panic.
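
To make the pattern concrete, here is a minimal illustration of the
trigger (this is not Drew's ahunt.c; the path, the sizes, and the
missing error checking are mine):

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int
main(void)
{
	char *p;
	int fd;

	fd = open("/var/tmp/zot", O_RDWR | O_CREAT, 0644);
	ftruncate(fd, 2 * 4096);

	/* Map the first page of the file, read-only and shared. */
	p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

	/*
	 * Write that page back to the same file at another offset.
	 * The source buffer is backed by the file itself, so copyin()
	 * in ffs_write() can fault while the vnode is already held
	 * exclusively, and the fault handler then requests the same
	 * vnode lock in shared mode.
	 */
	lseek(fd, 4096, SEEK_SET);
	write(fd, p, 4096);
	return (0);
}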

The following change fixed it for me.
Attilio, would it make sense to instead treat LK_CANRECURSE | LK_SHARED
as a request for the exclusive lock when the current thread already
holds the exclusive lock? I think this would be the proper solution
(a rough sketch of that idea follows the patch).

diff --git a/sys/vm/vnode_pager.c b/sys/vm/vnode_pager.c
index 4758456..61f4fd9 100644
--- a/sys/vm/vnode_pager.c
+++ b/sys/vm/vnode_pager.c
@@ -1179,6 +1179,7 @@ vnode_pager_lock(vm_object_t first_object)
 {
 	struct vnode *vp;
 	vm_object_t backing_object, object;
+	int locked, lockf;
 
 	VM_OBJECT_LOCK_ASSERT(first_object, MA_OWNED);
 	for (object = first_object; object != NULL; object = backing_object) {
@@ -1196,13 +1197,19 @@ vnode_pager_lock(vm_object_t first_object)
 			return NULL;
 		}
 		vp = object->handle;
+		locked = VOP_ISLOCKED(vp);
 		VI_LOCK(vp);
 		VM_OBJECT_UNLOCK(object);
 		if (first_object != object)
 			VM_OBJECT_UNLOCK(first_object);
 		VFS_ASSERT_GIANT(vp->v_mount);
-		if (vget(vp, LK_CANRECURSE | LK_INTERLOCK |
-		    LK_RETRY | LK_SHARED, curthread)) {
+		if (locked == LK_EXCLUSIVE)
+			lockf = LK_CANRECURSE | LK_INTERLOCK | LK_RETRY |
+			    LK_EXCLUSIVE;
+		else
+			lockf = LK_CANRECURSE | LK_INTERLOCK | LK_RETRY |
+			    LK_SHARED;
+		if (vget(vp, lockf, curthread)) {
 			VM_OBJECT_LOCK(first_object);
 			if (object != first_object)
 				VM_OBJECT_LOCK(object);

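For reference, the alternative suggested above would amount to
something like the following at the top of the lockmgr acquisition
path. This is purely a hypothetical sketch, not real lockmgr code;
lockmgr_xholder() stands in for "curthread already owns this lock
exclusively" and is not the actual interface:

	/*
	 * Hypothetical sketch: if the caller allows recursion and
	 * requests a shared lock while it already holds this lock
	 * exclusively, convert the request into a recursive
	 * exclusive acquisition instead of panicking on share->excl.
	 */
	if ((flags & (LK_SHARED | LK_CANRECURSE)) ==
	    (LK_SHARED | LK_CANRECURSE) &&
	    lockmgr_xholder(lk) == curthread)
		flags = (flags & ~LK_SHARED) | LK_EXCLUSIVE;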