hard deadlock(?) on -current; some debugging info, need help

From: Ted Faber <faber_at_isi.edu>
Date: Wed, 25 May 2005 17:18:06 -0700
Hi.

I'm experiencing intermittent but frequent deadlocks running under
-current.  The kernel I'm currently running, which exhibits the problem,
is from Monday morning: 

pun:~$ uname -a
FreeBSD pun.isi.edu 6.0-CURRENT FreeBSD 6.0-CURRENT #6: Mon May 23 08:07:01 PDT 2005     root_at_pun.isi.edu:/usr/obj/usr/src/sys/PUN  i386

The system slowly grinds to a halt, and the lockup seems to involve the
disk system.  I have not found a sequence that triggers the lockups
(other than trying to write mail to the list to report them), and I know
how difficult that makes things.  It is common to have 2-5 a day.  Even
when I can get to the debugger during a lockup, I cannot generate a crash
dump: the kernel reports starting the dump but moves no bytes.  WITNESS
and INVARIANTS report no information.

I have to physically unplug the machine to reboot.

I've attached a dmesg from a -v boot and the kernel config (the dmesg is
not from the lockup run).  Last Friday, when the system locked up, I had
a digital camera with me and took pictures of the ps output in the hope
that someone could look at them.  These images are at 

http://www.isi.edu/~faber/tmp/deadlock/DSCN04{75,76,77,78,79,80,81,82}.JPG

I'm delighted to do my part to get this fixed, but I really don't even
know where to start.  I'm happy to gather whatever information I can
from the debugger and post.  I'm happy to try patches.  I'm happy to
test my hardware to see if it's the problem (if you'll suggest a way).
Let me know what I can do to help fix this. 

Any help at all would be great.

-- 
Ted Faber
http://www.isi.edu/~faber           PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG 

Received on Wed May 25 2005 - 22:18:08 UTC
