Re: core-dumping over NFS

From: <mi+mx_at_aldan.algebra.com>
Date: Tue, 13 Jan 2004 16:33:55 -0500
On Mon, 12 Jan 2004, Mikhail Teterin wrote:

=> I've observed the following bad behaviour of -current mostly related to
=> dumping core of a buggy program over the NFS. 

=Sounds unfortunate.  A quick starting question: Does the behavior change
=at all if the core file does or doesn't already exist?

It always exists already. The program crashes at the very end of its life,
so I did not bother fixing it for a while.

=Another question: this is a FreeBSD binary, or is it emulated? Could
=you send the output of running 'file' on the binary?

It is a FreeBSD binary, but it is produced using Intel's compiler
(the lang/icc port).

=> 	. 5.2-CURRENT (Dec 14) client, Solaris-8 server:
=> 		created core file is empty (zero sized).

=> 	. 5.2-CURRENT (Dec 14) server, RedHat-9 client:
=> 		core is created properly, but sometimes the server goes
=> 		into a frenzy with the sys-component (bufdaemon) taking
=> 		up the entire 100% of the CPU-time (P4 at 2GHz); it only
=> 		writes @4Mb/s (~14% of the disk's bandwidth) and the
=> 		only cure is to restart the /etc/rc.d/nfsd; trying to,
=> 		for example, switch from X11 to a textual console, when
=> 		this is happening reliably hangs the machine.

=Er. Ouch. Can you confirm if there's an on-going series of RPCs from
=the client driving the I/O, or if it's just things going nuts on the
=server?

Can't tell right now... But it does not always happen -- usually (say, 90%
of the time), the program will just dump core and die. 

=Also, what block size is the RedHat client using by default?
=Could you set up a serial console -- if so, do you get any interesting
=messages?

The machine will survive this "storm" if I restart nfsd, and there is
nothing interesting in /var/log/messages.

=> 	. 5.2-CURRENT (Dec 14) server, 5.2-RC2 (Jan 10) client:
=> 		dumps happen normally with rw-mounts, but mounting the
=> 		FS read-only (so as to prevent core-dumps) leads to a
=> 		panic on the client...
=> 
=> The mounts are regular and default (v3?), except for the ``intr'' flag. 
=> No rpc.lock or anything... 

=Could you provide the panic message and stack trace? 

Not right now. The machine is in use by another person...
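
For anyone who wants to try reproducing the core-dump part without the
original binary, the following is a minimal sketch, not the program
discussed above. It assumes Python 3 on a Unix-like client. The helper
names crash_in, core_candidates and core_state are made up for
illustration, and the file-name patterns it looks for (core, *.core,
core.*) are an assumption, since the actual name depends on the system's
core-file settings (for example kern.corefile on FreeBSD).

```python
import os
import resource


def crash_in(workdir):
    """Fork a child that aborts inside workdir; return its wait status."""
    pid = os.fork()
    if pid == 0:
        try:
            # Raise the soft core-size limit to the hard limit.
            _, hard = resource.getrlimit(resource.RLIMIT_CORE)
            resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
            os.chdir(workdir)
            os.abort()  # dies with SIGABRT; a core is written if limits allow
        finally:
            os._exit(127)  # reached only if something above raised
    _, status = os.waitpid(pid, 0)
    return status


def core_candidates(workdir):
    """List files in workdir whose names look like core files (assumed patterns)."""
    names = sorted(os.listdir(workdir))
    return [n for n in names
            if n == "core" or n.endswith(".core") or n.startswith("core.")]


def core_state(path):
    """Classify a core file as 'missing', 'empty' (zero-sized) or 'ok'."""
    try:
        size = os.stat(path).st_size
    except FileNotFoundError:
        return "missing"
    return "empty" if size == 0 else "ok"
```

Calling crash_in(d) for a directory d on the NFS mount, and then
core_state(os.path.join(d, name)) for each name from core_candidates(d),
would distinguish the zero-sized core seen with the Solaris-8 server from
a normal dump. Whether a Python child triggers the same server-side
behaviour as the icc-built binary is untested.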

	-mi
Received on Tue Jan 13 2004 - 12:34:06 UTC