Re: problems with nfsd (due to RPCSEC_GSS changes?)

From: Navdeep Parhar <nparhar_at_gmail.com>
Date: Wed, 12 Nov 2008 11:36:49 -0800
On Wed, Nov 12, 2008 at 1:20 AM, Doug Rabson <dfr_at_rabson.org> wrote:
>
> On 12 Nov 2008, at 01:30, Navdeep Parhar wrote:
>
>> I had a FreeBSD NFS server running a month+ old current (from Oct 2 or
>> so).  I upgraded to a current current (Nov 11) and nfsd stopped working.
>> I was able to mount the exported filesystem but anything else would
>> yield an "Input/output error." nfsstat -s showed "Server Ret-Failed"
>> going up every time I tried a 'cd', 'ls', etc. from the client.  (I tried
>> both FreeBSD and Solaris clients).
>>
>> Ultimately, I had to add NFS_LEGACYRPC in order to get a working nfsd.
>> Looks like there may be a problem with the new code that was added as
>> part of RPCSEC_GSS support.  Note that I did not enable KGSSAPI in my
>> kernel as I have no need for it.
>>
>> Are there any known issues with the new code?  Feel free to ask if you
>> need any more information about my setup.
>
> I don't know of anything specific. If I could see a packet trace including
> both the mount request and at least one failed access attempt, it would help
> to understand what is happening here.
>

I saw a handful of commits from you last night so I updated + rebuilt the
server's kernel to include them.  These traces are with today's code (Nov 12,
11 AM Pacific) on the server and yesterday's code on the client.

The server is 192.168.1.2 and the client is 192.168.1.1; the trace was
captured with tcpdump -s 256 -vvn on the server.

# mount /usr/obj  (and then wait a couple of seconds.  The mount succeeds)

11:23:37.947451 IP (tos 0x0, ttl 64, id 20474, offset 0, flags [none],
proto UDP (17), length 84) 192.168.1.1.841 > 192.168.1.2.111: [udp sum
ok] UDP, length 56
11:23:37.947548 IP (tos 0x0, ttl 64, id 1644, offset 0, flags [none],
proto UDP (17), length 56, bad cksum 0 (->f0f5)!) 192.168.1.2.111 >
192.168.1.1.841: [bad udp cksum 268!] UDP, length 28
11:23:37.947691 IP (tos 0x0, ttl 64, id 20475, offset 0, flags [none],
proto UDP (17), length 68) 192.168.1.1.1225996811 > 192.168.1.2.2049:
40 null
11:23:37.947723 IP (tos 0x0, ttl 64, id 1645, offset 0, flags [none],
proto UDP (17), length 52, bad cksum 0 (->f0f8)!) 192.168.1.2.2049 >
192.168.1.1.1225996811: reply ok 24 null
11:23:37.947807 IP (tos 0x0, ttl 64, id 20476, offset 0, flags [none],
proto UDP (17), length 84) 192.168.1.1.941 > 192.168.1.2.111: [udp sum
ok] UDP, length 56
11:23:37.947878 IP (tos 0x0, ttl 64, id 1646, offset 0, flags [none],
proto UDP (17), length 56, bad cksum 0 (->f0f3)!) 192.168.1.2.111 >
192.168.1.1.941: [bad udp cksum 5c6d!] UDP, length 28
11:23:37.947989 IP (tos 0x0, ttl 64, id 20477, offset 0, flags [none],
proto UDP (17), length 112) 192.168.1.1.783 > 192.168.1.2.971: [udp
sum ok] UDP, length 84
11:23:37.948111 IP (tos 0x0, ttl 64, id 1647, offset 0, flags [none],
proto UDP (17), length 96, bad cksum 0 (->f0ca)!) 192.168.1.2.971 >
192.168.1.1.783: [bad udp cksum b101!] UDP, length 68
11:23:37.948373 IP (tos 0x0, ttl 64, id 20478, offset 0, flags [DF],
proto TCP (6), length 60) 192.168.1.1.959257417 > 192.168.1.2.2049: 0
proc-1241513984
11:23:37.948391 IP (tos 0x0, ttl 64, id 1648, offset 0, flags [DF],
proto TCP (6), length 60, bad cksum 0 (->b0f8)!) 192.168.1.2.2049 >
192.168.1.1.959257417: reply Unknown rpc response code=3312979456 0
11:23:37.948421 IP (tos 0x0, ttl 64, id 20479, offset 0, flags [DF],
proto TCP (6), length 52) 192.168.1.1.902 > 192.168.1.2.2049: ., cksum
0xc3aa (correct), 1057567033:1057567033(0) ack 4289015039 win 8192
<nop,nop,timestamp 62010382 1051390181>
11:23:37.948462 IP (tos 0x0, ttl 64, id 20480, offset 0, flags [DF],
proto TCP (6), length 152) 192.168.1.1.905522926 > 192.168.1.2.2049:
100 fsinfo fh 1165,20536/2520064
11:23:37.948472 IP (tos 0x0, ttl 64, id 1649, offset 0, flags [DF],
proto TCP (6), length 52, bad cksum 0 (->b0ff)!) 192.168.1.2.2049 >
192.168.1.1.902: ., cksum 0x837a (incorrect (-> 0x718c), 1:1(0) ack
100 win 29114 <nop,nop,timestamp 1051390181 62010382>
11:23:37.948499 IP (tos 0x0, ttl 64, id 1650, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0da)!) 192.168.1.2.2049 >
192.168.1.1.905522926: reply ok 36 fsinfo ERROR: Stale NFS file handle
POST:
11:23:37.948554 IP (tos 0x0, ttl 64, id 20481, offset 0, flags [DF],
proto TCP (6), length 152) 192.168.1.1.905522927 > 192.168.1.2.2049:
100 fsinfo fh 1165,20536/2520064
11:23:37.948583 IP (tos 0x0, ttl 64, id 1651, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0d9)!) 192.168.1.2.2049 >
192.168.1.1.905522927: reply ok 36 fsinfo ERROR: Stale NFS file handle
POST:
11:23:37.948633 IP (tos 0x0, ttl 64, id 20482, offset 0, flags [DF],
proto TCP (6), length 152) 192.168.1.1.905522928 > 192.168.1.2.2049:
100 fsstat fh 1165,20536/2520064
11:23:37.948654 IP (tos 0x0, ttl 64, id 1652, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0d8)!) 192.168.1.2.2049 >
192.168.1.1.905522928: reply ok 36 fsstat ERROR: Input/output error
POST:
11:23:37.948709 IP (tos 0x0, ttl 64, id 20483, offset 0, flags [DF],
proto TCP (6), length 152) 192.168.1.1.905522929 > 192.168.1.2.2049:
100 fsinfo fh 1165,20536/2520064
11:23:37.948729 IP (tos 0x0, ttl 64, id 1653, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0d7)!) 192.168.1.2.2049 >
192.168.1.1.905522929: reply ok 36 fsinfo ERROR: Stale NFS file handle
POST:
11:23:38.050971 IP (tos 0x0, ttl 64, id 20485, offset 0, flags [DF],
proto TCP (6), length 52) 192.168.1.1.902 > 192.168.1.2.2049: ., cksum
0xa059 (correct), 400:400(0) ack 145 win 16588 <nop,nop,timestamp
62010483 1051390181>
11:23:46.672078 IP (tos 0x0, ttl 64, id 20520, offset 0, flags [DF],
proto TCP (6), length 148) 192.168.1.1.905522930 > 192.168.1.2.2049:
96 fsinfo fh 1165,20536/2520064
11:23:46.672109 IP (tos 0x0, ttl 64, id 1704, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0a4)!) 192.168.1.2.2049 >
192.168.1.1.905522930: reply ok 36 fsinfo ERROR: Stale NFS file handle
POST:
11:23:46.672163 IP (tos 0x0, ttl 64, id 20522, offset 0, flags [DF],
proto TCP (6), length 148) 192.168.1.1.905522932 > 192.168.1.2.2049:
96 fsinfo fh 1165,20536/2520064
11:23:46.672186 IP (tos 0x0, ttl 64, id 1705, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0a3)!) 192.168.1.2.2049 >
192.168.1.1.905522932: reply ok 36 fsinfo ERROR: Stale NFS file handle
POST:
11:23:46.672225 IP (tos 0x0, ttl 64, id 20523, offset 0, flags [DF],
proto TCP (6), length 148) 192.168.1.1.905522933 > 192.168.1.2.2049:
96 fsstat fh 1165,20536/2520064
11:23:46.672246 IP (tos 0x0, ttl 64, id 1706, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->b0a2)!) 192.168.1.2.2049 >
192.168.1.1.905522933: reply ok 36 fsstat ERROR: Input/output error
POST:
11:23:46.774735 IP (tos 0x0, ttl 64, id 20525, offset 0, flags [DF],
proto TCP (6), length 52) 192.168.1.1.902 > 192.168.1.2.2049: ., cksum
0x5c04 (correct), 688:688(0) ack 253 win 16588 <nop,nop,timestamp
62019032 1051398729>

# ls /usr/obj (results in "ls: /usr/obj: Input/output error")

11:27:24.938974 IP (tos 0x0, ttl 64, id 21469, offset 0, flags [DF],
proto TCP (6), length 152) 192.168.1.1.905522986 > 192.168.1.2.2049:
100 fsinfo fh 1165,20536/2520064
11:27:24.939014 IP (tos 0x0, ttl 64, id 2639, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->acfd)!) 192.168.1.2.2049 >
192.168.1.1.905522986: reply ok 36 fsinfo ERROR: Stale NFS file handle
POST:
11:27:24.939082 IP (tos 0x0, ttl 64, id 21470, offset 0, flags [DF],
proto TCP (6), length 156) 192.168.1.1.905522987 > 192.168.1.2.2049:
104 access fh 1165,20536/2520064 003f
11:27:24.939112 IP (tos 0x0, ttl 64, id 2640, offset 0, flags [DF],
proto TCP (6), length 88, bad cksum 0 (->acfc)!) 192.168.1.2.2049 >
192.168.1.1.905522987: reply ok 36 access ERROR: Input/output error
attr:
11:27:24.939159 IP (tos 0x0, ttl 64, id 21471, offset 0, flags [DF],
proto TCP (6), length 152) 192.168.1.1.905522988 > 192.168.1.2.2049:
100 getattr fh 1165,20536/2520064
11:27:24.939181 IP (tos 0x0, ttl 64, id 2641, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum 0 (->acff)!) 192.168.1.2.2049 >
192.168.1.1.905522988: reply ok 32 getattr ERROR: Input/output error
11:27:25.041841 IP (tos 0x0, ttl 64, id 21474, offset 0, flags [DF],
proto TCP (6), length 52) 192.168.1.1.902 > 192.168.1.2.2049: ., cksum
0xbf65 (correct), 1057571769:1057571769(0) ack 4289016799 win 16588
<nop,nop,timestamp 62232916 1051612593>
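
In case it helps when reading traces like the ones above, here is a rough
sketch of a script that tallies the NFS error replies by procedure.  The
function name tally_nfs_errors and the regular expression are my own
assumptions, based only on how the reply lines look in this tcpdump text
output; they are not part of tcpdump or of any NFS tool.

```python
import re
from collections import Counter

# Assumed shape of the tail of a tcpdump NFS reply line, e.g.
# "reply ok 36 fsinfo ERROR: Stale NFS file handle"
REPLY_ERROR = re.compile(r"reply ok \d+ (\w+) ERROR: (.+)$")

def tally_nfs_errors(trace_text):
    """Count (procedure, error) pairs found in tcpdump text output."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = REPLY_ERROR.search(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

# Three reply lines copied from the 'ls /usr/obj' trace.
sample = """\
192.168.1.1.905522986: reply ok 36 fsinfo ERROR: Stale NFS file handle
192.168.1.1.905522987: reply ok 36 access ERROR: Input/output error
192.168.1.1.905522988: reply ok 32 getattr ERROR: Input/output error
"""

for (proc, err), n in sorted(tally_nfs_errors(sample).items()):
    print(f"{proc}: {err} x{n}")
```

Successful replies such as "reply ok 24 null" carry no ERROR field and are
not counted, so the tally only reflects failed calls.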

Let me know if you need any more info.  For the time being I'm going
back to NFS_LEGACYRPC.

Regards,
Navdeep
Received on Wed Nov 12 2008 - 18:36:53 UTC
