Re: nfs not reconnecting?

From: Mohan Srinivasan <mohan_srinivasan_at_yahoo.com>
Date: Sat, 30 Apr 2005 19:03:35 -0700 (PDT)
Hi,

This is a bug that crept into -current: nfs_timer() does not
reconnect in the NFS/TCP case. The bug was introduced in the
rewrite of the NFS client's reply handling. Before that change,
one of the processes waiting on the server would loop around
(from nfs_reply()) trying to reconnect. Now, the process
blocks on "nfsreq" (in nfs_reply()) after transmitting the
request and nfs_timer() does the retransmissions, so nothing
re-establishes the TCP connection once it has been torn down.

I'll work on fixing this next week. The fix is a bit involved,
since we can't do the reconnect from nfs_timer() directly.
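
As a rough illustration of that constraint, here is a toy user-space
sketch in Python. It is not the kernel code and not necessarily the
shape of the eventual fix. It assumes the difficulty is that the timer
runs in a context that must not block, so the timer only records that
a reconnect is needed and wakes a waiting process, which then does the
(possibly blocking) reconnect itself. The class and every method name
below are hypothetical.

```python
import threading


class Connection:
    """Toy stand-in for an NFS/TCP connection (hypothetical names only)."""

    def __init__(self):
        self.connected = True
        self.needs_reconnect = False
        self.reconnects = 0
        self.cond = threading.Condition()

    def timer_tick(self):
        # Timer context: assumed unable to block, so it only flags the
        # problem and wakes any process waiting for a reply.
        with self.cond:
            if not self.connected:
                self.needs_reconnect = True
                self.cond.notify_all()

    def wait_for_reply(self, timeout=1.0):
        # Process context: blocking is allowed, so the reconnect is done
        # here. Returns True if a reconnect was performed.
        with self.cond:
            flagged = self.cond.wait_for(lambda: self.needs_reconnect, timeout)
            if not flagged:
                return False
            self.needs_reconnect = False
        self.reconnect()
        return True

    def reconnect(self):
        # Placeholder for the real (blocking) connection setup.
        self.reconnects += 1
        self.connected = True
```

In this sketch a waiter that is woken by the timer reconnects and
would then retransmit; a waiter that times out with the connection
still up does nothing.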

In the meantime, would it be possible to collect a core when
the client gets into this state, just to confirm this?

Until the fix is in, you could use NFS/UDP to work around this
issue.

thanks

mohan

--- Jonathan Noack <noackjr_at_alumni.rice.edu> wrote:
> Since upgrading to CURRENT a few weeks ago I've noticed my machine 
> hanging when attempting to reboot.  The machine hits the watchdog 
> timeout during the shutdown procedure because gkrellmd cannot be killed, 
> prompts me to check out "ps -axl", and then hangs after syncing disks. 
> I finally got around to checking it out and it appears some of the 
> threads in gkrellmd are getting stuck in the nfsreq state.
> 
> The machine is an NFS client on a RELENG_5_4 NFS server.  I generally 
> rebuild both machines at the same time for simplicity.  I then reboot 
> the server.  Once the server is back up, I reboot the client.  When the 
> client was running 5.x I saw "nfs server ... not responding" messages 
> during the reboot of the server but as soon as it was back up I got "nfs 
> server ... is alive again".  While running CURRENT I only see "not
> responding" messages: neither "grep alive /var/log/messages" nor
> "cat /var/log/messages.* | bunzip | grep alive" returns anything
> dated after the day I upgraded.  I do occasionally see the message
> "nfs/tcp clnt: Peer closed connection, tearing down TCP connection".
> 
> Does the NFS client not reconnect in CURRENT at this time?
> 
> -- 
> Jonathan Noack | noackjr_at_alumni.rice.edu | OpenPGP: 0x991D8195
> 
Received on Sun May 01 2005 - 00:03:35 UTC