Re: Stop scheduler on panic

From: Attilio Rao <attilio_at_freebsd.org>
Date: Fri, 2 Dec 2011 21:20:47 +0100

2011/12/2 John Baldwin <jhb_at_freebsd.org>:
> On 12/2/11 12:18 PM, Attilio Rao wrote:
>>
>> 2011/12/2 John Baldwin<jhb_at_freebsd.org>:
>>>
>>> On 12/2/11 5:05 AM, Andriy Gapon wrote:
>>>>
>>>>
>>>> on 02/12/2011 06:36 John Baldwin said the following:
>>>>>
>>>>>
>>>>> Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true
>>>>> when kdb was active).  But I think these two changes should cover
>>>>> critical_exit() ok.
>>>>>
>>>>
>>>> I attempted to start a discussion about this a few times already :-)
>>>> Should we treat kdb context the same as SCHEDULER_STOPPED context (in
>>>> the current definition)?  That is, skip all locks in the same fashion?
>>>> There are pros and cons.
>>>
>>>
>>>
>>> kdb should not block on locks, no.  Most debugger commands should not go
>>> near locks anyway unless they are intended to carefully modify the
>>> existing system in a safe manner (such as the 'kill' command, which
>>> should only be using try locks and fail if it cannot safely post the
>>> signal).
>>
>>
>> The biggest problem with treating KDB the same as panic is that doing
>> a proper 'continue' becomes impossible.
>> One of the features of the 'skip-locking' path is that it doesn't take
>> into account fast locking paths, where a lock acquisition can sometimes
>> succeed and sometimes fail without you knowing about it. Also, the
>> restarted CPUs can find corrupted data (as it can be arbitrarily
>> updated); I'm sure that is far too panic-prone.
>
>
> Yes, my thought is that kdb commands, etc. should be using dedicated
> routines that do not use locks whenever possible.  The problem of a user
> calling an arbitrary routine is not solvable (so I don't think we should try
> to solve that, you use 'call' at your own risk), but built-in commands
> should explicitly either 1) not use locking, or 2) only use try locks and
> fail out cleanly (including dropping any try locks acquired) if a try fails.
>  Now, that's an ideal view, I don't know how close we are to that in
> practice or if it is a realistically attainable goal.

So you are not in favor of giving KDB its own context?
There are some drawbacks (for example, bugs involving the scheduler or
the switching mechanism, though for those we could provide a facility
like KDB_LITE for debugging scheduler problems), but in general that
would avoid replicating code just to avoid the locking.

If you don't want to give KDB its own context, we should work on a KPI
(or similar) that defines which commands can serve as KDB commands,
tries to keep things under control, etc.
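
To make the try-lock discipline discussed above concrete, here is a rough
sketch of how a built-in command could follow it: take every lock with a
try operation, and on the first failure drop whatever was already acquired
and fail out cleanly. This is only an illustration under stated
assumptions. It uses userland pthread mutexes as a stand-in for kernel
locks, and dbg_trylock_all(), dbg_unlock_all() and dbg_cmd_kill() are
hypothetical names, not existing KDB/DDB interfaces.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: try to take all n locks in order without blocking.
 * If any try fails, release the ones already taken and report failure.
 */
static bool
dbg_trylock_all(pthread_mutex_t **locks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pthread_mutex_trylock(locks[i]) != 0) {
			/* Back out: drop every lock acquired so far. */
			while (i-- > 0)
				pthread_mutex_unlock(locks[i]);
			return (false);
		}
	}
	return (true);
}

/* Hypothetical helper: release all n locks in reverse order. */
static void
dbg_unlock_all(pthread_mutex_t **locks, size_t n)
{
	while (n-- > 0)
		pthread_mutex_unlock(locks[n]);
}

/*
 * Hypothetical 'kill'-style command: mark signal signo as pending only if
 * both locks can be taken without blocking.  Returns 0 on success, or -1
 * if a lock was busy, in which case nothing is modified and no lock is
 * left held.
 */
static int
dbg_cmd_kill(pthread_mutex_t *proc_lock, pthread_mutex_t *sig_lock,
    int *pending, int signo)
{
	pthread_mutex_t *locks[] = { proc_lock, sig_lock };

	if (!dbg_trylock_all(locks, 2))
		return (-1);
	*pending |= 1 << signo;
	dbg_unlock_all(locks, 2);
	return (0);
}
```

The point of the sketch is only the shape of the contract: a command either
completes with all its locks held, or it changes nothing and leaves no lock
behind, so a later 'continue' does not inherit a half-acquired state.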

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Fri Dec 02 2011 - 19:20:49 UTC
