Hi, In the info you have below, the NFS request was not sent (R_SENT is not set). We didn't send out the request because possibly nm_cwnd didn't allow us to send the request. If nfs_request() could not send the request, then typically, the request would be retransmitted from nfs_timer(). If nm_cwnd is full, then there must be a lot of outstanding requests that haven't been replied to yet (on this mount), which is the fundamental issue. Can you force a core dump and send me a pointer to it ? Also, after you force a core, can you also try a quick workaround - someone else also reported NFS client hangs and said that things were fine after they set mpsafenet to 0. It would be good to see if there's a correlation there. mohan --- Kris Kennaway <kris_at_obsecurity.org> wrote: > On Sat, Dec 25, 2004 at 09:35:24AM -0800, Mohan Srinivasan wrote: > > Hi, > > > > If the box is still in that state, can you force a core ? > > > > The trace below shows the process blocked on nfs_request()->msleep. > > If the process shows up as waiting on "nfsreq", it would be msleep'ing > > from nfs_reply(), not from nfs_request(). There's only one msleep() > > call in nfs_request() (nfs_socket.c:1125), but that is for "nqnfstry". > > The msleep("nfsreq") only happens from nfs_reply() (unless nfs_reply() > > has gotten inlined). > > OK, I now have working online kgdb. Here's the trace from one stuck > cvs process: > > #0 sched_switch (td=0xc3add2e0, newtd=0xc81c2a10, flags=1) at > /usr/src/sys/kern/sched_4bsd.c:964 > #1 0xc0517246 in mi_switch (flags=1, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:356 > #2 0xc0534624 in sleepq_switch (wchan=0x0) at /usr/src/sys/kern/subr_sleepqueue.c:431 > #3 0xc05348f2 in sleepq_wait_sig (wchan=0x0) at /usr/src/sys/kern/subr_sleepqueue.c:556 > #4 0xc0516db6 in msleep (ident=0xc57ff500, mtx=0xc07a1040, priority=339, wmesg=0xc0700e35 > "nfsreq", > timo=0) at /usr/src/sys/kern/kern_synch.c:225 > #5 0xc05f186a in nfs_reply (rep=0xc57ff500) at /usr/src/sys/nfsclient/nfs_socket.c:600 > #6 0xc05f2583 in nfs_request (vp=0xc8b3eac8, mrest=0xc3917200, procnum=4, td=0xc3add2e0, > cred=0xc512c380, mrp=0xee1a397c, mdp=0xee1a3980, dposp=0xee1a3984) > at /usr/src/sys/nfsclient/nfs_socket.c:1043 > #7 0xc05f7681 in nfs3_access_otw (vp=0xc8b3eac8, wmode=-300271216, td=0xc3add2e0, > cred=0xc512c380) > at /usr/src/sys/nfsclient/nfs_vnops.c:264 > #8 0xc05f7f34 in nfs_getattr (ap=0xee1a3a88) at /usr/src/sys/nfsclient/nfs_vnops.c:592 > #9 0xc06ce1a0 in VOP_GETATTR_AP (a=0x0) at vnode_if.c:484 > #10 0xc05f89dc in nfs_lookup (ap=0x0) at vnode_if.h:275 > #11 0xc06cdca0 in VOP_LOOKUP_AP (a=0x0) at vnode_if.c:89 > #12 0xc0570489 in lookup (ndp=0xee1a3c28) at vnode_if.h:54 > #13 0xc056fdb8 in namei (ndp=0xee1a3c28) at /usr/src/sys/kern/vfs_lookup.c:189 > #14 0xc057e102 in stat (td=0xc3add2e0, uap=0xee1a3d14) at /usr/src/sys/kern/vfs_syscalls.c:2088 > #15 0xc06b32d0 in syscall (frame= > {tf_fs = 47, tf_es = 674955311, tf_ds = -1078001617, tf_edi = 3, tf_esi = 3, tf_ebp = > -1077943176, tf_isp = -300270220, tf_ebx = 674917772, tf_edx = 135074816, tf_ecx = 2, tf_eax = > 188, tf_trapno = 12, tf_err = 2, tf_eip = 674368127, tf_cs = 31, tf_eflags = 514, tf_esp = > -1077943860, tf_ss = 47}) > at /usr/src/sys/i386/i386/trap.c:951 > #16 0xc069dfcf in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200 > #17 0x0000002f in ?? () > (kgdb) frame 6 > #6 0xc05f2583 in nfs_request (vp=0xc8b3eac8, mrest=0xc3917200, procnum=4, td=0xc3add2e0, > cred=0xc512c380, mrp=0xee1a397c, mdp=0xee1a3980, dposp=0xee1a3984) > at /usr/src/sys/nfsclient/nfs_socket.c:1043 > 1043 error = nfs_reply(rep); > (kgdb) print rep > $3 = (struct nfsreq *) 0xc57ff500 > (kgdb) print *rep > $4 = {r_chain = {tqe_next = 0xc7b23200, tqe_prev = 0xc07a0ff0}, r_mreq = 0xc37d8700, r_mrep = > 0x0, > r_md = 0x0, r_dpos = 0xdeadc0de "", r_nmp = 0xc39d0000, r_vp = 0xc8b3eac8, r_xid = 2829996295, > r_flags = 1, r_retry = 10, r_rexmit = 0, r_timer = -559038242, r_procnum = 4, r_rtt = -1, > r_lastmsg = 4245, r_td = 0xc3add2e0} > (kgdb) > > The deadc0de is interesting. Let me know if there's more debugging > info I can get for you. > > Kris > > ATTACHMENT part 2 application/pgp-signatureReceived on Thu Feb 03 2005 - 17:29:17 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:27 UTC