Re: CURRENT freezes on Latitude D520

From: Tai-hwa Liang <avatar_at_mmlab.cse.yzu.edu.tw>
Date: Tue, 12 Dec 2006 11:07:32 +0800 (CST)
On Mon, 11 Dec 2006, Robert Watson wrote:
> On Mon, 11 Dec 2006, Tai-hwa Liang wrote:
>
>>> WITNESS is available in RELENG_6, and should be used in combination with 
>>> INVARIANTS, DDB, KDB, and BREAK_TO_DEBUGGER to debug deadlocks.
>>
>>  Would a kernel with WITNESS/[KD]DB/BREAK_TO_DEBUGGER enabled but without 
>> INVARIANTS compiled in be adequate to dump useful information through a 
>> remote serial console?
>
> It depends a lot on the deadlock.  The warnings you've attached below provide 
> a lot of information, however.
>
>>> It sounds like you need to follow the instructions for kernel debugging. 
>>> Depending on your tolerance of performance loss, downtime, etc, a good 
>>> starting point is to configure the kernel with INVARIANTS and WITNESS. 
>>> WITNESS is particularly important, if you can tolerate the performance 
>>> hit, as it warns of potential deadlocks, not just actual deadlocks.  Also, 
>>> compile
>> 
>> With WITNESS enabled (debug.mpsafenet=0), I got another three pf-related 
>> warnings in the last 8 hours:
>
> Are you using uid/gid credential rules with pf?

   I presume that you're talking about the "user" and "group" keywords in
a rule.  If that's the case, then no, we do not use uid/gid credentials in
our pf rules.

>>> the kernel with KDB, DDB, and BREAK_TO_DEBUGGER, and use a serial or 
>>> firewire console.  If the hang occurs, see if you can get into the 
>>> debugger, in which case the logged output from DDB for the following 
>>> commands would be very useful:
>>> 
>>> show pcpu
>>> show allpcpu
>>> trace
>>> alltrace
>>> ps
>>> show locks
>>> show alllocks
>>> show lockedvnods
>>> show uma
>>> show malloc
>>> 
>>> Please open a PR that describes your configuration, includes your kernel 
>>> config (since it seems quite customized), any loader.conf settings, a 
>>> detailed description of the problem, and the output.  I'd be quite 
>>> interested
>>
>>  Okay, I'll file a PR once I can collect more information with the serial 
>> console (probably this weekend).  For now our system administrator is 
>> pretty nervous about my suggestion of turning debug.mpsafenet back to 1. ;)
>
> Thanks.
>
>>> in knowing, once the machine is in a hung state, whether the numlock light 
>>> goes on and off when you hit the numlock key on the keyboard.
>> 
>> The numlock light doesn't respond to the key when the machine is hung; 
>> hence Ctrl-Alt-Esc won't break into the debugger.
>
> Serial break is significantly more reliable for getting into the debugger on 
> the system as it stands, as syscons requires the Giant lock while the serial 
> interrupt handler doesn't.  As a result, serial break can often get you into 
> the debugger even when Giant has been leaked.  The numlock light not going on 
> and off is a reasonable test of whether Giant has been leaked and/or 
> interrupts have been left disabled on all CPUs, as it means that the syscons 
> interrupt handler is unable to run, hence my inquiring.

   Thanks for the elaboration.  BTW, it appears to me that in order 
to use a serial console, I have to keep the following in /boot/loader.conf:

 	console="comconsole,vidconsole"

   I'd be interested to know whether there is any "console mux"-like knob
available such that console output can go to comconsole and vidconsole
at the same time.  That is, with the aforementioned console order, boot 
messages (after the 'highlighted' text messages generated by kernel printf()) 
only go to the serial console, not the video one.
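
   For what it's worth, loader(8) documents a boot_multicons variable that
looks like the kind of knob asked about above.  An untested sketch of
/boot/loader.conf, assuming the variable behaves on this kernel as the
manual page describes (it asks the kernel to enable multiple consoles
early in boot):

 	# Untested assumption: enable multiple-console support in the kernel.
 	boot_multicons="YES"
 	# Same console order as above.
 	console="comconsole,vidconsole"

   If that works, conscontrol(8) should be able to list and adjust the
active consoles on the running system.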

-- 
Thanks,
Tai-hwa Liang
Received on Tue Dec 12 2006 - 02:07:34 UTC
