Re: core-dumping over NFS

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Mon, 12 Jan 2004 15:39:55 -0500 (EST)
On Mon, 12 Jan 2004, Mikhail Teterin wrote:

> I've observed the following bad behaviour of -current mostly related to
> dumping core of a buggy program over the NFS. 

Sounds unfortunate.  A quick starting question: Does the behavior change
at all depending on whether the core file already exists?  Another question:
is this a FreeBSD binary, or is it emulated?  Could you send the output of
running 'file' on the binary?

> 	. 5.2-CURRENT (Dec 14) client, Solaris-8 server:
> 		created core file is empty (zero sized).

Not sure how much RPC fun you want to have, but if you could do a tcpdump
of the RPC exchange here, it would be very helpful.  Run ethereal on the
result, and look for creation/lookup of the file.  It would be interesting
to see if one of the RPCs is failing.  In particular, I notice that the
coredump code calls VOP_SETATTR() to truncate the file without checking
the return value.

> 	. 5.2-CURRENT (Dec 14) server, RedHat-9 client:
> 		core is created properly, but sometimes the server goes
> 		into a frenzy with the sys-component (bufdaemon) taking
> 		up the entire 100% of the CPU-time (P4 at 2GHz); it only
> 		writes @4Mb/s (~14% of the disk's bandwidth) and the
> 		only cure is to restart the /etc/rc.d/nfsd; trying to,
> 		for example, switch from X11 to a textual console, when
> 		this is happening reliably hangs the machine.

Er.  Ouch.  Can you confirm if there's an on-going series of RPCs from the
client driving the I/O, or if it's just things going nuts on the server? 
Also, what block size is the RedHat client using by default?  Could you
set up a serial console -- if so, do you get any interesting messages?

> 	. 5.2-CURRENT (Dec 14) server, 5.2-RC2 (Jan 10) client:
> 		dumps happen normally with rw-mounts, but mounting the
> 		FS read-only (so as to prevent core-dumps) leads to a
> 		panic on the client...
> 
> The mounts are regular and default (v3?), except for the ``intr'' flag. 
> No rpc.lock or anything... 

Could you provide the panic message and stack trace? 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Senior Research Scientist, McAfee Research
Received on Mon Jan 12 2004 - 11:42:59 UTC
