Hello, alright folks, I'm in serious need of help/advice. I'm running FreeBSD 5.2.1 (-current) with a kernel/buildworld done yesterday (7/16/2004) on a dual Xeon 3.06GHz with Hyper-Threading enabled. The machine also has 2GB of RAM and a SCSI RAID array on an Intel storage RAID controller (iir0). The machine functions as a NIS client for accounts, with home directories NFS-mounted from a Solaris 9 machine. Its primary function is as a mail server, and what it is NFS-sharing out is the spool folder (/var/mail, in this case). I know all about the dangers of sharing out a mail spool; I don't need, or want, a lecture about proper operating procedures in this case. It's for legacy purposes and will be going away in due time. Anyway, it's with this mount that I am experiencing these NFS problems.

Now, to the nitty gritty. I am seeing periodic spikes from one of the nfsd children from about 10% of the CPU (via top) to 100% of the CPU. During one of these spikes, even if the spike only reaches 40-50% of the CPU, the machine becomes debilitatingly slow and stops responding to all other commands. Even issuing an 'ls' is difficult, let alone doing anything productive. While watching top, the nfsd state will alternate between biowr, biord, and *Giant (yes, it is even requesting the Giant lock).

I have recompiled every single piece of software that operates on /var/mail to use only fcntl locking (procmail/postfix/uw-imap -- there's a patch by Red Hat to do that for the last one) so that it is NFS-friendly.

Here's what I've tried, to see if it made any difference. First, all mounts of /var/mail from other servers were using UDP; they have all been switched to TCP with an rsize and wsize of 1024. I've tried 4096 and 8192, both of which make no difference. All clients are specifically forced to use NFSv3. I have also tried varying between a soft and a hard mount, also with no difference in these spikes.
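For anyone unfamiliar with the distinction, here's a minimal sketch of the fcntl-style POSIX locking I rebuilt those tools to use (in Python purely for illustration -- the actual daemons are C programs, and the function name here is hypothetical). Unlike flock(), which is purely local to one machine, fcntl/lockf record locks are forwarded to the NFS server via rpc.lockd, which is why they're the only kind that can protect a shared spool:

```python
import fcntl
import os

def append_with_posix_lock(path, data):
    """Append to a spool file under an exclusive POSIX (fcntl) lock.

    Illustrative only: mail delivery agents do roughly this so that
    writers on different NFS clients serialize against each other.
    """
    # Need write access on the fd to take an exclusive lockf() lock.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # blocks until the lock is granted
        os.write(fd, data)
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)   # release before closing
        os.close(fd)
```

On an NFS mount this only serializes correctly if every writer uses the same locking style, which is why I patched all three programs rather than just one.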
I also tried switching back to the 4BSD scheduler, to see if that might have been the issue, but it would appear that didn't make any difference either, though the max load average I was seeing stayed a bit lower with ULE as opposed to the 4BSD scheduler.

So, I'm really at the end of my rope right now; I have no idea what to do or what could be causing this. Any advice would be great, thanks.

--Mike

Received on Fri Jul 16 2004 - 15:57:45 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:02 UTC