Re: nfs server issues

From: Michael Weiser <michael_at_weiser.dinsnail.net>
Date: Sun, 18 Apr 2004 15:11:31 +0200
On Fri, Apr 02, 2004 at 04:44:02PM -0800, Sean McNeil wrote:
> Bingo!  It looks like a problem with checksum offloading:

> 	ifconfig re0 -rxcsum -txcsum

> and now it no longer hangs.  Good call!  The NIC in question is:
Yesterday I realised that I have the same problem here with a
-CURRENT server and a linux-2.6.5 client. Reading works fine but on
writes of big files the nfs server will lock up gradually after about
10MB have been written. First the transfer blocks but ssh and other services
will continue to work. Later on the machine gets severely locked up.
It's still pingable but the filesystem seems to be stuck somewhere. It
starts with an ls hanging when run on the server in the directory
written to by the client, later the ssh session itself gets locked.

> re0: <RealTek 8110S Single-chip Gigabit Ethernet> port 0xa400-0xa4ff mem
> 0xdf004000-0xdf0040ff irq 12 at device 11.0 on pci1
The client has a VIA Rhine II onboard NIC and the server a 3Com 3c905B,
configured as follows:

xl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=9<RXCSUM,VLAN_MTU>
        inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::246:3ed:fe38:ea5c%xl0 prefixlen 64 scopeid 0x1
        inet6 fec0::1:1 prefixlen 112
        ether 00:50:04:38:ea:5c
        media: Ethernet autoselect (100baseTX)
        status: active

The first thing I tried was switching off receive checksum offloading, but
that didn't change anything. What does the VLAN_MTU option actually do?

Then I tried switching back and forth between nfsv3 and nfsv2 as well as
tcp and udp transport. No effect either.

Then I booted FreeSBIE-1.0 on the client and mounted the same filesystem
off the server. With that it actually worked fine and gave realistic
throughput with udp and tcp. But when I set the same rsize and wsize
values (8192/8192) as on Linux the server got stuck again after 10MB.

After that I tried lowering the rsize/wsize on Linux as well. With
1024/1024 and 2048/2048 there's no lockup but throughput is at ~60KB/s
and the server seems to sync every single write. With 4096/4096 and
8192/8192 I get the lockups again.
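
The rsize/wsize sweep described above could be scripted on the Linux
client roughly as follows. This is only a sketch under assumptions: the
export path /export, the mount point /mnt/nfstest and the 20 MB test
size are placeholders, not values from the setup above, and the helper
names run and nfs_write_test are made up for illustration. By default
the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: print (or run) one NFS write test per rsize/wsize value.
# SERVER is the address from the ifconfig output above;
# EXPORT and MNT are placeholders (assumptions).
SERVER=192.168.1.1
EXPORT=/export
MNT=/mnt/nfstest
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount with the given size, write a test file, unmount again.
nfs_write_test() {
    size=$1
    run mount -t nfs -o "rsize=$size,wsize=$size" "$SERVER:$EXPORT" "$MNT"
    # Write 20 MB, i.e. past the ~10 MB point where the server stalls.
    run dd if=/dev/zero of="$MNT/testfile" bs=1024k count=20
    run umount "$MNT"
}

for size in 1024 2048 4096 8192; do
    nfs_write_test "$size"
done
```

Since the larger sizes are the ones that lock up the server, a real run
(DRY_RUN=0) will most likely hang at the 4096 step and the server would
need attention before any later step could proceed.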

Can anyone give me a hint which option to tune to get this working
reliably? I'm fairly new to FreeBSD and lack the necessary insight to
debug this on kernel/gdb level but I'd be happy to give it a try if
someone gave me a starting point.

Thanks in advance for any insights into this one.
-- 
Micha
Received on Sun Apr 18 2004 - 06:02:30 UTC