Re: RELENG_7 and HEAD: bge causes system hang

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Mon, 26 Nov 2007 20:05:05 +0000 (GMT)

On Mon, 26 Nov 2007, Cristian KLEIN wrote:

>> I don't know the details of this particular situation, but I can speak to 
>> at least one known issue in DDB: right now, getting into DDB from a serial 
>> console is a very quick and straightforward path, requiring only the 
>> delivery of the serial interrupt and execution of its fast handler. The 
>> regular video console keypresses take a much more circuitous route, as 
>> syscons isn't MPSAFE, and so involve the scheduling of an ithread and 
>> acquisition of Giant.  As such, I've found breaking into the debugger much 
>> easier from a serial console for several years.  As Giant has been pushed 
>> off larger and larger parts of the kernel, the syscons break path has 
>> gotten a lot more reliable.
>
> That is very unfortunate. Newer laptops don't come with a serial port 
> anymore. As far as I know, using USB-to-serial converters won't work.

Many notebooks do, however, have firewire.  I've not read the firewire code or 
used firewire for debugging, so I can't comment on how effective breaks are, 
but I can say that one of the neatest things about firewire is that you can 
inspect the kernel memory of a host remotely even when it's frozen solid, 
which is pretty cool.  So if you have a notebook that is also without 
firewire, you may indeed be out of luck, but with firewire, you have a nice 
new option.

>> There will always be certain cases where a console break (serial or video) 
>> will not work, and those include cases where interrupts are disabled on all 
>> CPUs (such as if spinlocks are held on all CPUs, perhaps due to one being 
>> leaked and then a cascading deadlock).  In that situation, there's nothing 
>> like a nice NMI button or IPMI NMI to get into the debugger :-).
>
> IIRC, spinlocks are not an issue anymore. The kernel will throw a message 
> like "spinlock held too long in file, line", and the issue can easily be 
> spotted.

Only on an SMP box -- the test is in the spin loop waiting for a spinlock, so 
only when a second CPU has to hang around for a long time waiting for the lock 
will that fire.  If you have a single-CPU box, it's just a hard wedge with 
interrupts disabled.
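
As a rough illustration of why the check only helps on SMP, here is a minimal 
userland sketch of a spin loop that counts its own iterations. It is not the 
FreeBSD kernel code; the names spin_lock_checked and spin_unlock and the 
caller-supplied limit are hypothetical. The point it shows is that the "held 
too long" test lives in the waiter's loop, so some CPU has to be spinning in 
that loop for it to fire.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: try to take a spinlock, giving up after 'limit'
 * failed attempts.  Returns true if the lock was acquired, false if the
 * waiter decided the lock had been "held too long".
 */
static bool
spin_lock_checked(atomic_flag *lock, unsigned long limit)
{
	unsigned long spins = 0;

	/* test_and_set returns the previous value: true means still held. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
		/* The detection runs only here, on the waiting CPU. */
		if (++spins >= limit)
			return false;
	}
	return true;
}

/* Release the lock so that a later waiter can acquire it. */
static void
spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}
```

If the only CPU in the machine is the one that leaked the lock, nothing ever 
executes the loop above, so no message is printed.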

> Is there any way to forcibly enter the DDB on a serialless laptop, so future 
> problems like this will be spotted faster? Perhaps, should MPSAFEing syscons 
> get more attention?

I think getting an MPSAFE syscons would be desirable, but it's a non-trivial 
piece of work, especially if you take into account that it's tangled up in the 
tty code.  If you have firewire, that may be a useful option.  However, I 
would agree with an assertion that notebooks are becoming less useful as a 
development platform because of the omission of a real serial port.  One of 
the nice things about true serial ports is that you can run them in purely 
polled operation quite easily, and so use them from within a debugger while 
interrupts are disabled.  Unfortunately, USB controllers are very complex 
beasts, and do not lend themselves to low-level operation of this sort.
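
To make "purely polled operation" concrete, here is a small sketch of polled 
character output against a simulated UART. The struct fake_uart and the helper 
names are hypothetical stand-ins for real port I/O, and the three-poll busy 
period is an arbitrary assumption; only the bit value for "transmit holding 
register empty" in the line status register (0x20) follows the 16550 
convention. Nothing in it depends on an interrupt being delivered.

```c
#include <stddef.h>
#include <stdint.h>

#define LSR_THR_EMPTY 0x20	/* 16550 line status: transmitter can accept a byte */

/* Hypothetical simulated device; a real driver would read I/O ports. */
struct fake_uart {
	unsigned busy_polls;	/* status reads left until the transmitter is idle */
	char     out[64];	/* bytes "sent on the wire" */
	size_t   outlen;
};

/* Simulated read of the line status register. */
static uint8_t
uart_read_lsr(struct fake_uart *u)
{
	if (u->busy_polls > 0) {
		u->busy_polls--;
		return 0;
	}
	return LSR_THR_EMPTY;
}

/* Simulated write of the transmit holding register. */
static void
uart_write_thr(struct fake_uart *u, char c)
{
	if (u->outlen < sizeof(u->out))
		u->out[u->outlen++] = c;
	u->busy_polls = 3;	/* assumed: transmitter is busy for a few polls */
}

/* Busy-wait until the transmitter is idle, then send one character. */
static void
polled_putc(struct fake_uart *u, char c)
{
	while ((uart_read_lsr(u) & LSR_THR_EMPTY) == 0)
		;	/* spin; no interrupt is needed to make progress */
	uart_write_thr(u, c);
}

/* Send a NUL-terminated string one character at a time. */
static void
polled_puts(struct fake_uart *u, const char *s)
{
	while (*s != '\0')
		polled_putc(u, *s++);
}
```

A debugger prompt could then be emitted with polled_puts(&u, "db> ") even with 
interrupts off, since the loop only ever reads a status register and writes a 
data register.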

Robert N M Watson
Computer Laboratory
University of Cambridge
Received on Mon Nov 26 2007 - 19:05:12 UTC
