amd/nfs deadlock on -current w/example code (was Re: hard deadlock(?) on -current; some debugging info, need help)

From: Ted Faber <faber_at_isi.edu>
Date: Fri, 27 May 2005 14:49:08 -0700

With a bunch of help from Peter Jeremy, I've been able to narrow my
deadlock problem down to a repeatable case.

If you're just tuning in, -current running amd and nfs deadlocks so hard
that neither panic nor call doadump from the debugger produces a crash
dump.  I think that the included code and configs will let you
reproduce this on your own machine.

The scenario is this:
	amd-mounted NFS directory
	reasonable read/write load from different processes
	eventually the machine deadlocks

Amd configuration files and test code are attached.  On my machines,
compiling loadit and running loadit 35 /nfs/jade/faber (which just
happens to be an nfs directory I use) locks up the system repeatably.

loadit just forks n processes (arg1) to create files called lock${n} in
the given directory (arg2).  There's no locking done, but I named the
files when I thought locking might be at issue.  Each process accesses
only its own lock file and makes some small writes/reads to it 10000
times.

It seems to matter that amd is managing multiple maps, though I only
touch the /nfs one.  I couldn't get the system to lock up with only the
nfs map loaded.

In /etc/rc.conf I have:

amd_enable="YES"
amd_flags="-F /usr/local/etc/amd.conf"

I'm happy to supply more data, file a PR, test patches, or anything
else.  In the meantime I'm working around the problem by statically
mounting the filesystems I use most often.

Thanks again to Peter for holding my hand until I could get a repeatable
case.

-- 
Ted Faber
http://www.isi.edu/~faber           PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG

Received on Fri May 27 2005 - 19:49:09 UTC
