RE: Crash dump problem - sleeping thread owns a non-sleepable lock during crash dump write

From: Matthew Fleming <matthew.fleming_at_isilon.com>
Date: Fri, 14 May 2010 08:42:44 -0700
>   As an aside, this is a quad-core in one package CPU (an X3363). On both
> this box and a similar one with an X5470, console messages continue to
> print out after "the system has been halted - press any key to reboot" -
> in particular, the shutdown makes a bunch of the "behind the scenes"
> management stuff like the virtual keyboard and monitor appear. Plugging or
> unplugging USB devices will go through the whole deal of detecting and
> making their service available.

Oops, you're right that other CPUs are running.

The stop_cpus() call is only made if kdb is entered.  doadump() is called out of boot(), which comes later.  At Isilon we've been running with a patch that does stop_cpus() pretty close to the front of panic(9).

As a design decision it seems reasonable to call stop_cpus() early in panic(9), simply because most causes of a panic mean something unexpected has happened, and the sooner the other CPUs aren't running, the more likely it is that they don't do more damage, leaving the system in a more useful state for dump or {g,d}db analysis.  This should be done before the dump or entering kdb.
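
To make the ordering argument concrete, here is a rough user-space sketch.  It is not kernel code and not the Isilon patch: every name prefixed with model_ is hypothetical, the CPU count is an assumption, and model_stop_other_cpus() is only a stand-in for stop_cpus().  It models a panic that never enters kdb and reports how many other CPUs are still running by the time the dump would start.

```c
#include <stdbool.h>

#define MODEL_NCPU 4	/* assumed CPU count, e.g. one quad-core package */

/* Hypothetical user-space model; none of these names are kernel APIs. */
bool model_cpu_running[MODEL_NCPU];

/* Mark every modelled CPU as running, as at the instant of the panic. */
void
model_reset(void)
{
	for (int i = 0; i < MODEL_NCPU; i++)
		model_cpu_running[i] = true;
}

/* Stand-in for stop_cpus(): halt every CPU except the panicking one. */
void
model_stop_other_cpus(int self)
{
	for (int i = 0; i < MODEL_NCPU; i++)
		if (i != self)
			model_cpu_running[i] = false;
}

/* Count the CPUs other than 'self' that are still running. */
int
model_others_running(int self)
{
	int n = 0;

	for (int i = 0; i < MODEL_NCPU; i++)
		if (i != self && model_cpu_running[i])
			n++;
	return (n);
}

/*
 * Model of a panic on CPU 'self' that does not enter the debugger.
 * Returns how many other CPUs are still running when the dump starts.
 */
int
model_panic(int self, bool stop_early)
{
	model_reset();
	if (stop_early)
		model_stop_other_cpus(self);	/* proposed: stop first */
	/* The rest of panic and the shutdown path would run here. */
	return (model_others_running(self));	/* state seen by the dump */
}
```

In this model, model_panic(0, false) returns 3 and model_panic(0, true) returns 0: without the early stop, the three other CPUs are still free to run (and take locks) while the dump is being written.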

I'm cc'ing -current_at_ since I would like a small discussion of moving the stop_cpus() to earlier in panic.  If this change is agreeable I can roll up a patch and test it on CURRENT.  I'm not sure yet how much of the other panic-related changes we have made at Isilon would be required.

Thanks,
matthew
Received on Fri May 14 2010 - 13:42:45 UTC
