On Sun, 14 Mar 2004, Don Lewis wrote: > I cvsup'ed a few hours ago and did the buildworld and buildkernel drill. > I installed the new kernel, rebooted, and got this panic while mtree was > running in installworld. > > -------------------------------------------------------------- > >>> Making hierarchy > -------------------------------------------------------------- > cd /usr/src; /usr/obj/usr/src/make.i386/make -f Makefile.inc1 hierarchy > cd /usr/src/etc; /usr/obj/usr/src/make.i386/make distrib-dirs > mtree -deU -f /usr/src/etc/mtree/BSD.root.dist -p / > > panic: mutex Giant not owned at /usr/src/sys/kern/vfs_subr.c:899 > at line 719 in file /usr/src/sys/kern/kern_mutex.c > cpuid = 0; > Debugger("panic") > Stopped at Debugger+0x46: xchgl %ebx,in_Debugger.0 > db> tr > Debugger(c07d2da1) at Debugger+0x46 > __panic(c07d217c,2cf,c07d22e8,c07d2423,c07daecf) at __panic+0x13d > _mtx_assert(c0893860,1,c07daecf,383,c68931b0) at _mtx_assert+0xc2 > vinvalbuf(c6893104,1,0,c6872690,0,0) at vinvalbuf+0x25 > vclean(c6893104,8,c6872690,c6893104,c6893104) at vclean+0x97 > vgonel(c6893104,c6872690,c6893104,0,c07daecf) at vgonel+0x4d > vgone(c6893104) at vgone+0x28 > pfs_exit(0,c6871898,c648c68c,0,c07d0695) at pfs_exit+0x3f > exit1(c6872690,0,e7b11d40,c076bca7,c6872690) at exit1+0x2bd > exit1(c6872690,e7b11d14,1,11,296) at exit1 > syscall(2f,2f,2f,4814a740,bfbfea48) at syscall+0x217 > Xint0x80_syscall() at Xint0x80_syscall+0x1d > --- syscall (1, FreeBSD ELF32, sys_exit), eip = 0x480c5163, esp = 0xbfbfe9ac, ebp = 0xbfbfe9c8 --- vgone() has interesting locking problems. It seems to be a fundamentally broken interface that only works if the kernel is not premptible or possibly if everything related to vgone() is locked by Giant. Apparently the lower levels know that Giant locking is needed. Look at this code in ufs_mknod9): % /* % * Remove inode, then reload it through VFS_VGET so it is % * checked to see if it is an alias of an existing entry in % * the inode cache. % */ % vput(*vpp); Here we've completely released the vnode, so references to it before we vget() it again use a garbage pointer, since we may be preempted and the vnode may be recycled. % (*vpp)->v_type = VNON; Here we reference it. % ino = ip->i_number; /* Save this before vgone() invalidates ip. */ Here we reference something hanging off it. % vgone(*vpp); Here we do considerably more with it, without holding any lock on it to begin with. % error = VFS_VGET(ap->a_dvp->v_mount, ino, LK_EXCLUSIVE, vpp); This and the rest is OK. % if (error) { % *vpp = NULL; % return (error); % } % return (0); % this code in I noticed this not due to losing a race, but because I have active code in vput() that recycles certain unwanted vnodes as soon as possible (actually sooner than possible due to the bug). The vnode gets vgone()'ed in vput() and comes back with type VBAD (so perhaps I'm mistaken that it was completely free). Then vgone()'ing it again does bad things (deadlock in deadfs). I work around this using: %%% Index: ufs_vnops.c =================================================================== RCS file: /home/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v retrieving revision 1.238 diff -u -2 -r1.238 ufs_vnops.c --- ufs_vnops.c 11 Mar 2004 18:50:33 -0000 1.238 +++ ufs_vnops.c 12 Mar 2004 14:04:04 -0000 _at__at_ -246,7 +258,9 _at__at_ */ vput(*vpp); - (*vpp)->v_type = VNON; ino = ip->i_number; /* Save this before vgone() invalidates ip. */ - vgone(*vpp); + if ((*vpp)->v_type != VBAD) { + (*vpp)->v_type = VNON; + vgone(*vpp); + } error = VFS_VGET(ap->a_dvp->v_mount, ino, LK_EXCLUSIVE, vpp); if (error) { %%% Perhaps the correct fix is to remove vgone() and always use vgonel(). And Giant locking. BruceReceived on Sun Mar 14 2004 - 05:52:07 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:47 UTC