Re: Processes stuck in nfsreq

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Thu, 3 Feb 2005 10:07:56 -0800
On Sat, Dec 25, 2004 at 09:35:24AM -0800, Mohan Srinivasan wrote:
> Hi,
> 
> If the box is still in that state, can you force a core ?
> 
> The trace below shows the process blocked on nfs_request()->msleep.
> If the process shows up as waiting on "nfsreq", it would be msleep'ing
> from nfs_reply(), not from nfs_request(). There's only one msleep()
> call in nfs_request() (nfs_socket.c:1125), but that is for "nqnfstry".
> The msleep("nfsreq") only happens from nfs_reply() (unless nfs_reply() 
> has gotten inlined).

OK, I now have working online kgdb.  Here's the trace from one stuck
cvs process:

#0  sched_switch (td=0xc3add2e0, newtd=0xc81c2a10, flags=1) at /usr/src/sys/kern/sched_4bsd.c:964
#1  0xc0517246 in mi_switch (flags=1, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:356
#2  0xc0534624 in sleepq_switch (wchan=0x0) at /usr/src/sys/kern/subr_sleepqueue.c:431
#3  0xc05348f2 in sleepq_wait_sig (wchan=0x0) at /usr/src/sys/kern/subr_sleepqueue.c:556
#4  0xc0516db6 in msleep (ident=0xc57ff500, mtx=0xc07a1040, priority=339, wmesg=0xc0700e35 "nfsreq",
    timo=0) at /usr/src/sys/kern/kern_synch.c:225
#5  0xc05f186a in nfs_reply (rep=0xc57ff500) at /usr/src/sys/nfsclient/nfs_socket.c:600
#6  0xc05f2583 in nfs_request (vp=0xc8b3eac8, mrest=0xc3917200, procnum=4, td=0xc3add2e0,
    cred=0xc512c380, mrp=0xee1a397c, mdp=0xee1a3980, dposp=0xee1a3984)
    at /usr/src/sys/nfsclient/nfs_socket.c:1043
#7  0xc05f7681 in nfs3_access_otw (vp=0xc8b3eac8, wmode=-300271216, td=0xc3add2e0, cred=0xc512c380)
    at /usr/src/sys/nfsclient/nfs_vnops.c:264
#8  0xc05f7f34 in nfs_getattr (ap=0xee1a3a88) at /usr/src/sys/nfsclient/nfs_vnops.c:592
#9  0xc06ce1a0 in VOP_GETATTR_AP (a=0x0) at vnode_if.c:484
#10 0xc05f89dc in nfs_lookup (ap=0x0) at vnode_if.h:275
#11 0xc06cdca0 in VOP_LOOKUP_AP (a=0x0) at vnode_if.c:89
#12 0xc0570489 in lookup (ndp=0xee1a3c28) at vnode_if.h:54
#13 0xc056fdb8 in namei (ndp=0xee1a3c28) at /usr/src/sys/kern/vfs_lookup.c:189
#14 0xc057e102 in stat (td=0xc3add2e0, uap=0xee1a3d14) at /usr/src/sys/kern/vfs_syscalls.c:2088
#15 0xc06b32d0 in syscall (frame=
      {tf_fs = 47, tf_es = 674955311, tf_ds = -1078001617, tf_edi = 3, tf_esi = 3, tf_ebp = -1077943176, tf_isp = -300270220, tf_ebx = 674917772, tf_edx = 135074816, tf_ecx = 2, tf_eax = 188, tf_trapno = 12, tf_err = 2, tf_eip = 674368127, tf_cs = 31, tf_eflags = 514, tf_esp = -1077943860, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:951
#16 0xc069dfcf in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#17 0x0000002f in ?? ()
(kgdb) frame 6
#6  0xc05f2583 in nfs_request (vp=0xc8b3eac8, mrest=0xc3917200, procnum=4, td=0xc3add2e0,
    cred=0xc512c380, mrp=0xee1a397c, mdp=0xee1a3980, dposp=0xee1a3984)
    at /usr/src/sys/nfsclient/nfs_socket.c:1043
1043                    error = nfs_reply(rep);
(kgdb) print rep
$3 = (struct nfsreq *) 0xc57ff500
(kgdb) print *rep
$4 = {r_chain = {tqe_next = 0xc7b23200, tqe_prev = 0xc07a0ff0}, r_mreq = 0xc37d8700, r_mrep = 0x0,
  r_md = 0x0, r_dpos = 0xdeadc0de "", r_nmp = 0xc39d0000, r_vp = 0xc8b3eac8, r_xid = 2829996295,
  r_flags = 1, r_retry = 10, r_rexmit = 0, r_timer = -559038242, r_procnum = 4, r_rtt = -1,
  r_lastmsg = 4245, r_td = 0xc3add2e0}
(kgdb)

The deadc0de is interesting.  Let me know if there's more debugging
info I can get for you.

Kris

Received on Thu Feb 03 2005 - 17:07:58 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:27 UTC