On Mon, 11 Dec 2006, Tai-hwa Liang wrote:

>> WITNESS is available in RELENG_6, and should be used in combination with
>> INVARIANTS, DDB, KDB, and BREAK_TO_DEBUGGER to debug deadlocks.
>
> Would a kernel with WITNESS/[KD]DB/BREAK_TO_DEBUGGER enabled but w/o
> INVARIANTS compiled be adequate to dump useful information through remote
> serial console?

It depends a lot on the deadlock. The warnings you've attached below
provide a lot of information, however.

>> It sounds like you need to follow the instructions for kernel debugging.
>> Depending on your tolerance of performance loss, downtime, etc, a good
>> starting point is to configure the kernel with INVARIANTS and WITNESS.
>> WITNESS is particularly important, if you can tolerate the performance hit,
>> as it warns of potential deadlocks, not just actual deadlocks. Also,
>> compile
>
> With WITNESS enabled(debug.mpsafenet=0), I got another three pf related
> warnings in the last 8 hours:

Are you using uid/gid credential rules with pf?

>> the kernel with KDB, DDB, and BREAK_TO_DEBUGGER, and use a serial or
>> firewire console. If the hang occurs, see if you can get into the
>> debugger, in which case the logged output from DDB for the following
>> commands would be very useful:
>>
>> show pcpu
>> show allpcpu
>> trace
>> alltrace
>> ps
>> show locks
>> show alllocks
>> show lockedvnods
>> show uma
>> show malloc
>>
>> Please open a PR that describes your configuration, includes your kernel
>> config (since it seems quite customized), any loader.conf settings, a
>> detailed description of the problem, and the output. I'd be quite
>> interested
>
> Okay, I'll file a PR once I can collect more information with the serial
> console(probably weekend). For now our system administrator is pretty
> nervous about my suggestion on turning debug.mpsafenet back to 1. ;)

Thanks.

>> in knowing, once the machine is in a hung state, whether the numlock light
>> goes on and off when you hit the numlock key on the keyboard.
>
> The numlock light doesn't respond to the key when the machine is hanging;
> hence Ctrl-Alt-Esc wouldn't break to debugger.

Serial break is significantly more reliable for getting into the debugger
on the system as it stands, as syscons requires the Giant lock while the
serial interrupt handler doesn't. As a result, serial break can often get
you into the debugger even when Giant has been leaked. The numlock light
not going on and off is a reasonable test of whether Giant has been leaked
and/or interrupts have been left disabled on all CPUs, as it means that the
syscons interrupt handler is unable to run, hence my inquiring.

Robert N M Watson
Computer Laboratory
University of Cambridge

Received on Mon Dec 11 2006 - 13:32:11 UTC
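
For reference, the debugging options named in this thread could be collected
in a kernel configuration file roughly as follows. This is only a sketch and
is not taken from either poster's actual config. INVARIANT_SUPPORT is not
mentioned in the thread; it is included here on the assumption that
INVARIANTS depends on it, as described in the stock kernel configuration
notes.

```
# Debugger framework and the interactive DDB backend
options 	KDB
options 	DDB
# Allow a break on the console (e.g. serial break) to enter the debugger
options 	BREAK_TO_DEBUGGER

# Run-time consistency checks (assumption: INVARIANTS needs INVARIANT_SUPPORT)
options 	INVARIANTS
options 	INVARIANT_SUPPORT

# Lock order verification; warns of potential deadlocks, at a performance cost
options 	WITNESS
```

With such a kernel booted on a serial console, the DDB commands listed in the
message above can be entered at the debugger prompt and their output captured
from the serial line for the PR.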