re0 checksum offloading issue (Re: nfs server issues)

From: Jon Noack <noackjr_at_alumni.rice.edu>
Date: Fri, 02 Apr 2004 20:20:07 -0600

On 4/2/2004 6:44 PM, Sean McNeil wrote:
> On Fri, 2004-04-02 at 16:07, Dan Nelson wrote:
>>In the last episode (Apr 02), Sean McNeil said:
>>>On Fri, 2004-04-02 at 13:57, Dan Nelson wrote:
>>>>In the last episode (Apr 02), Sean McNeil said:
>>>>>OK, here is a tcpdump.  It is confusing.  It looks like after the
>>>>>first fragment is received it is looking up some bizarre IP
>>>>>address....
>>>>>
>>>>>13:02:57.566952 free.mcneil.com.1360032988 > server.mcneil.com.nfs: 136 readdir fh 1002,54097/7890231 4096 bytes @ 0x000000000 (DF)
>>>>>13:02:57.567266 server.mcneil.com.nfs > free.mcneil.com.1360032988: reply ok 1472 readdir (frag 1645:1480@0+)
>>>>>13:02:57.567268 0.0.0.1 > 0.0.10.7: (frag 1645:4@1480)
>>>>
>>>>Weird.  Is this at the server or the client?
>>>
>>>This is a client-side dump.  Both server and client have MTU of 1500.
>>>
>>>Server side says:
>>>
>>>15:37:44.292564 IP free.mcneil.com.851449566 > server.mcneil.com.nfs: 136 readdir fh 1002,54097/7890231 4096 bytes @ 0x0
>>>15:37:44.292705 IP server.mcneil.com.nfs > free.mcneil.com.851449566: reply ok 1472 readdir
>>>15:37:44.292711 IP server.mcneil.com > free.mcneil.com: udp
>>>
>>>Is there something in a packet that tells rpc/nfs to reassemble with
>>>something other than the source/destination info?
>>
>>Neither RPC nor NFS is involved with fragmentation.  That's all done at
>>the IP level, below UDP.  I wonder if it's a NIC problem.  Can you try a
>>different card (maybe even a different brand of card if possible)?
>>Another interesting test would be to get a hub and a 3rd machine, then
>>do dumps with the hub on the server's port, and then the client's port. 
>>If you get garbled frags in both places, I'd lean toward a NIC problem
>>on the server.  If your card supports checksum offloading, try
>>disabling it (ifconfig xx0 -rxcsum -txcsum).
> 
> Bingo!  It looks like a problem with checksum offloading:
> 
> 	ifconfig re0 -rxcsum -txcsum
> 
> and now it no longer hangs.  Good call!  The NIC in question is:
> 
> re0: <RealTek 8110S Single-chip Gigabit Ethernet> port 0xa400-0xa4ff mem
> 0xdf004000-0xdf0040ff irq 12 at device 11.0 on pci1
> 
> The extremely odd thing is that HTTP, LDAP, Samba, and many other
> services, both those going to the box itself and those sent out via
> NAT, all work fine.  NFS is the only protocol I've seen that has an issue.
> 
> I am happy now :)
> 
> Cheers,
> Sean
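
For anyone reading the dumps above: the fragment sizes themselves look
consistent with a 1500-byte MTU, so only the addresses on the second
fragment are off.  A minimal Python sketch of the arithmetic, assuming a
20-byte IP header with no options and assuming the reply carried 1476
bytes of RPC data (the 1472 shown in the first fragment plus the 4-byte
tail).  The function name is made up for illustration.

```python
IP_HEADER = 20   # bytes; assumes no IP options
UDP_HEADER = 8   # bytes

def fragment_sizes(udp_payload_len, mtu=1500):
    """Return (size, offset, more_fragments) for each IP fragment of one UDP datagram."""
    # The UDP header is part of the IP payload and travels in the first fragment only.
    total = UDP_HEADER + udp_payload_len
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_frag = (mtu - IP_HEADER) // 8 * 8
    frags = []
    offset = 0
    while offset < total:
        size = min(per_frag, total - offset)
        frags.append((size, offset, offset + size < total))
        offset += size
    return frags

# Assumed payload size: 1472 bytes in the first fragment plus a 4-byte tail.
print(fragment_sizes(1476))   # [(1480, 0, True), (4, 1480, False)]
```

The two tuples correspond to "frag 1645:1480@0+" and "frag 1645:4@1480"
in the client-side dump.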

Bill,
I'm copying you on this in case you (as the original author) can see 
anything wrong with the driver checksum offloading.
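
For reference, the checksum that the kernel goes back to computing in
software once -rxcsum/-txcsum are set is the standard Internet checksum
(RFC 1071).  The following is a minimal Python sketch of that algorithm
only; it is not taken from the re(4) driver, and the function name and
the sample IPv4 header are illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # sum big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

# Sample IPv4 header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(internet_checksum(header)))   # 0xb861

# A receiver checks by summing the header with the checksum filled in.
filled = header[:10] + bytes.fromhex("b861") + header[12:]
print(internet_checksum(filled))        # 0
```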

Start of thread:
http://lists.freebsd.org/pipermail/freebsd-current/2004-April/024966.html

Jon
Received on Fri Apr 02 2004 - 16:20:18 UTC
