NFSv3 + 8.1 + rpc.[lockd|statd] issues

From: Eric Crist <ecrist_at_secure-computing.net>
Date: Wed, 13 Oct 2010 07:42:19 -0500

Hey folks,

We have a machine running FreeBSD 8.1-RELEASE-p1 acting as an NFS server, hosting 3 ZFS file systems on an external enclosure. A number of machines, running 4.11, 7.1, and 8.x, act as NFS clients to this server.  Running dmesg on the NFS server shows no errors at all, but three different clients each show different errors.  On panther (the 7.1 client below), it was reported last night that a 48MB file took about 90 minutes to transfer.

I'm working on upgrading the 7.1 system to 8.1 now, so I'm not as concerned about its errors, but the rpcbind errors that show up on both 7.1 and 8.1 are causing core dumps in some of our applications.

Any help is appreciated.

=== Data below ===

ecrist_at_jaguar-1:~-> date
Wed Oct 13 07:33:46 CDT 2010
ecrist_at_jaguar-1:~-> dmesg
ecrist_at_jaguar-1:~-> uname -a
FreeBSD jaguar-1.claimlynx.com 8.1-RC2 FreeBSD 8.1-RC2 #1: Wed Jul 14 11:34:02 CDT 2010     root_at_jaguar-1.claimlynx.com:/usr/obj/usr/src/sys/GENERIC-CARP  amd64
ecrist_at_jaguar-1:~-> uptime
 7:33AM  up 83 days,  8:42, 2 users, load averages: 1.08, 1.25, 0.94
ecrist_at_jaguar-1:~->

Many of the clients, however, are reporting assorted problems.  The 7.1 system reports the following:

ecrist_at_panther:~-> date
Wed Oct 13 07:34:43 CDT 2010
ecrist_at_panther:~-> dmesg
...
Can't start NLM - unable to contact NSM
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/production: not responding
...
ecrist_at_panther:~-> uname -a
FreeBSD panther.claimlynx.com 7.1-RELEASE-p3 FreeBSD 7.1-RELEASE-p3 #2: Sun Mar 22 08:21:50 CDT 2009     root_at_cougar.claimlynx.com:/usr/obj/usr/src/sys/SMP-ASR  i386
ecrist_at_panther:~-> uptime
 7:34AM  up 30 days, 16:13, 4 users, load averages: 0.97, 1.00, 0.91
ecrist_at_panther:~-> 
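
A note on reading the NLM lines above, offered as an inference from the numbers rather than a confirmed diagnosis: 28416 is 0x6F00, which is the standard rpcbind port 111 (0x006F) with its two bytes swapped, so the kernel message may simply be printing the port in network byte order. Likewise, stat = 5 corresponds to RPC_TIMEDOUT in the classic Sun RPC `enum clnt_stat`; that the kernel message uses this same enum is an assumption here. The helper name `swap16` and the lookup table are illustrative only.

```python
import struct

# First few values of the classic Sun RPC `enum clnt_stat`.
# Assumption: the "stat" in the NLM message is a value from this enum.
CLNT_STAT = {
    0: "RPC_SUCCESS",
    1: "RPC_CANTENCODEARGS",
    2: "RPC_CANTDECODERES",
    3: "RPC_CANTSEND",
    4: "RPC_CANTRECV",
    5: "RPC_TIMEDOUT",
}


def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit unsigned integer."""
    return struct.unpack("<H", struct.pack(">H", value))[0]


logged_port = 28416  # value taken from the dmesg output
logged_stat = 5      # value taken from the dmesg output

print(hex(logged_port))        # 0x6f00
print(swap16(logged_port))     # 111
print(CLNT_STAT[logged_stat])  # RPC_TIMEDOUT
```

If that reading is right, the message says that a call to rpcbind on port 111 of the remote host timed out, which points at reachability of rpcbind rather than at an odd port number.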

Our 4.11 system:

ecrist_at_puma:~-> date
Wed Oct 13 07:38:09 CDT 2010
ecrist_at_puma:~-> dmesg
got bad cookie vp 0xe93fd240 bp 0xcfa2d2ec
got bad cookie vp 0xe859e740 bp 0xcfa96644
...
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/production: not responding
...
ecrist_at_puma:~-> uname -a
FreeBSD puma.claimlynx.com 4.11-RELEASE-p2 FreeBSD 4.11-RELEASE-p2 #1: Wed Apr 13 18:25:25 CDT 2005     drue_at_puma.claimlynx.com:/usr/obj/usr/src/sys/PUMA  i386
ecrist_at_puma:~-> uptime
 7:38AM  up 30 days, 15:27, 1 user, load averages: 0.02, 0.02, 0.00
ecrist_at_puma:~-> 
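
To compare how often each export flaps, the "not responding" / "is alive again" lines can be tallied per export. The following is a minimal sketch, assuming the dmesg text has been saved and split into lines; the function name `tally` and the sample list are hypothetical, and the sample uses only a few of the lines quoted above.

```python
import re
from collections import Counter

# Matches kernel messages of the form:
#   nfs server <host>:<path>: not responding
#   nfs server <host>:<path>: is alive again
PATTERN = re.compile(r"^nfs server (\S+): (not responding|is alive again)$")


def tally(lines):
    """Count (export, event) pairs found in an iterable of dmesg lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = [
    "nfs server jaguar.stor:/array/production: not responding",
    "nfs server jaguar.stor:/array/production: is alive again",
    "nfs server jaguar.stor:/array/archive: not responding",
    "nfs server jaguar.stor:/array/archive: is alive again",
    "nfs server jaguar.stor:/array/archive: not responding",
    "nfs server jaguar.stor:/array/archive: is alive again",
]

for (export, event), count in sorted(tally(sample).items()):
    print(f"{export:35s} {event:16s} {count}")
```

Since dmesg lines carry no timestamps here, this only gives counts, not durations of the outages.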

And, finally, an 8.1 system:

ecrist_at_puma-2:~-> date
Wed Oct 13 07:39:27 CDT 2010
ecrist_at_puma-2:~-> dmesg
...
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
ecrist_at_puma-2:~-> uname -a
FreeBSD puma-2.claimlynx.com 8.1-RELEASE FreeBSD 8.1-RELEASE #2: Mon Aug  2 12:50:40 CDT 2010     root_at_jaguar-1.claimlynx.com:/usr/obj/usr/src/sys/GENERIC-CARP  amd64
ecrist_at_puma-2:~-> uptime
 7:39AM  up 70 days, 18:25, 3 users, load averages: 0.00, 0.00, 0.00
ecrist_at_puma-2:~->
Received on Wed Oct 13 2010 - 11:07:23 UTC
