Something in NFS server calling vrele() not vput()?

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Wed, 2 Apr 2003 15:49:48 -0500 (EST)

Unfortunately, I don't have too much information here.  The scenario is as
follows:

cboss: NFS file/build server
crash2: NFS diskless client

I built world on cboss; I then did installworld in crash2.  I intended to
installworld to a DESTDIR on a local disk on crash2, but I failed to mount
it first, so the installworld had both the source and target in NFS.  When
I realized this had happened, I proceeded to rm -Rf the DESTDIR tree. 
Shortly thereafter, rm -Rf hung on a vnode lock, and other processes
started to stack up going up the directory tree.  I have a little
debugging information below that may be relevant--show lockedvnods shows
two directories where the locks are held by rm (0xc40a25a0) and a later ls
(0xc48bd2d0).  The last three entries are worrying because the refcounts on
each of these vnodes is 0, and the VI_FREE flag is set.  Earlier in the
debugging session, the VI_FREE flag wasn't set, so presumably the vnode
was being freed following a removal (not unlikely with installs, renames,
and removals).  Interestingly, the last three entries in the locked vnode
list were apparently grabbed by the nfs daemon.  Unfortunately, we lost
the pid entry in the lock structure so I can't tell if the thread pointer
is stale and the struct thread has been reused or not.  I suspect given
that the nfsd thread pool is pretty much static that the locks were indeed
grabbed by NFS, so some NFS operation may be calling vrele() instead of
vput() (or the like).   Alternatively, perhaps there's a race somewhere
during ufs_inactive() between it and an NFS operation?
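
To make the vrele()/vput() suspicion concrete, here is a toy user-space
model of the two calls.  It is only a sketch: struct toy_vnode and the
toy_* functions are hypothetical names invented for illustration, not
kernel code, and the model ignores the interlock, VOP_INACTIVE, and the
free list.  It assumes only the documented contract that vput() unlocks
the vnode and drops a reference, while vrele() just drops a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode: one use count and one exclusive lock. */
struct toy_vnode {
    int         usecount;  /* active references */
    bool        locked;    /* is the exclusive lock held? */
    const void *owner;     /* hypothetical cookie for the locking thread */
};

/* Take a reference and the exclusive lock, as a locking vget() would. */
static void toy_vget(struct toy_vnode *vp, const void *td)
{
    assert(!vp->locked);   /* the toy model has no sleeping waiters */
    vp->usecount++;
    vp->locked = true;
    vp->owner = td;
}

/* Models vrele(): drop a reference, leave the lock state alone. */
static void toy_vrele(struct toy_vnode *vp)
{
    assert(vp->usecount > 0);
    vp->usecount--;
}

/* Models vput(): release the lock, then drop the reference. */
static void toy_vput(struct toy_vnode *vp)
{
    assert(vp->locked);
    vp->locked = false;
    vp->owner = NULL;
    toy_vrele(vp);
}

/* The symptom in the list below: locked, yet nobody holds a reference. */
static bool toy_leaked_lock(const struct toy_vnode *vp)
{
    return vp->locked && vp->usecount == 0;
}
```

In this model, toy_vget() followed by toy_vrele() leaves the vnode with
usecount 0 and the lock still held, which is the shape of the last two
entries in the show lockedvnods output; toy_vget() followed by toy_vput()
leaves it unlocked.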

Any other thoughts would be welcome; unfortunately, no core dump is
available. 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Network Associates Laboratories

db> show lockedvnods
Locked vnodes
0xc5a2bc8c: tag ufs, type VDIR, usecount 3, writecount 0, refcount 1,
flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc48bd2d0
        ino 619588, on dev ad0s1g (4, 18)
0xc4fbd6d8: tag ufs, type VDIR, usecount 4, writecount 0, refcount 1,
flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc40a25a0 with 1 pending
        ino 1129851, on dev ad0s1g (4, 18)
0xc4d456d8: tag ufs, type VREG, usecount 1, writecount 0, refcount 0,
flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc3f78c30 with 1 pending
        ino 1129935, on dev ad0s1g (4, 18)
0xc5046248: tag ufs, type VREG, usecount 0, writecount 0, refcount 0,
flags (VI_FREE|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc3f78c30
        ino 1215822, on dev ad0s1g (4, 18)
0xc41fe920: tag ufs, type VREG, usecount 0, writecount 0, refcount 0,
flags (VI_FREE|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc3f78c30
        ino 1216038, on dev ad0s1g (4, 18)
db> cont
 -> 
cboss# 

db> trace 13805
mi_switch(c40a25a0,50,c5c4536c,dba91c40,0) at mi_switch+0x181
msleep(c4d45794,c058a2a0,50,c04fabf4,0) at msleep+0x43c
acquire(dba919d0,1000040,600,689bd73c,c40a25a0) at acquire+0xa0
lockmgr(c4d45794,1010002,c4d456d8,c40a25a0,dba919ec) at lockmgr+0x3f7
vop_stdlock(dba91a14,dba919f8,c0440778,dba91a14,dba91a38) at vop_stdlock+0x2c
vop_defaultop(dba91a14,dba91a38,c036629e,dba91a14,c4603e10) at vop_defaultop+0x18
ufs_vnoperate(dba91a14,c4603e10,dba91a5c,c043cb34,2) at ufs_vnoperate+0x18
vn_lock(c4d456d8,10002,c40a25a0,c034ca3a,c5bd7de2) at vn_lock+0x11e
vget(c4d456d8,2,c40a25a0,1064428,c40a25a0) at vget+0x100
vfs_cache_lookup(dba91b54,dba91b80,c0352122,dba91b54,20002) at vfs_cache_lookup+0x1ed
ufs_vnoperate(dba91b54,20002,c40a25a0,dba91b0c,c40a25a0) at ufs_vnoperate+0x18
lookup(dba91c18,c46f1800,400,dba91c34,c40a25a0) at lookup+0x302
namei(dba91c18,80bd948,60,0,c40a25a0) at namei+0x20b
lstat(c40a25a0,dba91d10,8,c40a25a0,2) at lstat+0x52
syscall(2f,2f,2f,80bda00,80b7040) at syscall+0x2aa
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x804b45f, esp = 0xbfbff53c, ebp = 0xbfbff5c8 ---

(kgdb) inspect ((struct thread *)0xc3f78c30)->td_proc.p_pid
$1 = 404
(kgdb) inspect ((struct thread *)0xc3f78c30)->td_proc.p_comm
$2 = "nfsd\0er", '\0' <repeats 12 times>
Received on Wed Apr 02 2003 - 10:49:28 UTC