Re: Stop scheduler on panic

From: Attilio Rao <attilio_at_freebsd.org>
Date: Thu, 17 Nov 2011 21:54:19 +0100
2011/11/17 Andriy Gapon <avg_at_freebsd.org>:
> on 17/11/2011 21:09 John Baldwin said the following:
>> On Thursday, November 17, 2011 11:58:03 am Andriy Gapon wrote:
>>> on 17/11/2011 18:37 John Baldwin said the following:
>>>> On Thursday, November 17, 2011 4:47:42 am Andriy Gapon wrote:
>>>>> on 17/11/2011 10:34 Andriy Gapon said the following:
>>>>>> on 17/11/2011 10:15 Kostik Belousov said the following:
>>>>>>> I have the following change for eons on my test boxes. Without it,
>>>>>>> I simply cannot get _any_ dump.
>>>>>>>
>>>>>>> diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c
>>>>>>> index 10b89c7..a38e42f 100644
>>>>>>> --- a/sys/cam/cam_xpt.c
>>>>>>> +++ b/sys/cam/cam_xpt.c
>>>>>>> _at__at_ -4230,7 +4230,7 _at__at_ xpt_done(union ccb *done_ccb)
>>>>>>>                          TAILQ_INSERT_TAIL(&cam_simq, sim, links);
>>>>>>>                          mtx_unlock(&cam_simq_lock);
>>>>>>>                          sim->flags |= CAM_SIM_ON_DONEQ;
>>>>>>> -                        if (first)
>>>>>>> +                        if (first && panicstr == NULL)
>>>>>>>                                  swi_sched(cambio_ih, 0);
>>>>>>>                  }
>>>>>>>          }
>>>>>>
>>>>>> I think that this (or similar) change should go into the patch and the tree.
>>>>>>
>>>>>
>>>>> And, BTW, I still would like to do something like the following (perhaps with
>>>>> td_oncpu = NOCPU and td_flags &= ~TDF_NEEDRESCHED also moved to the common code):
>>>>>
>>>>> Index: sys/kern/sched_ule.c
>>>>> ===================================================================
>>>>> --- sys/kern/sched_ule.c   (revision 227608)
>>>>> +++ sys/kern/sched_ule.c   (working copy)
>>>>> _at__at_ -1790,7 +1790,6 _at__at_ sched_switch(struct thread *td, struct thread *new
>>>>>    td->td_oncpu = NOCPU;
>>>>>    if (!(flags & SW_PREEMPT))
>>>>>            td->td_flags &= ~TDF_NEEDRESCHED;
>>>>> -  td->td_owepreempt = 0;
>>>>>    tdq->tdq_switchcnt++;
>>>>>    /*
>>>>>     * The lock pointer in an idle thread should never change.  Reset it
>>>>> Index: sys/kern/kern_synch.c
>>>>> ===================================================================
>>>>> --- sys/kern/kern_synch.c  (revision 227608)
>>>>> +++ sys/kern/kern_synch.c  (working copy)
>>>>> _at__at_ -406,6 +406,8 _at__at_ mi_switch(int flags, struct thread *newtd)
>>>>>        ("mi_switch: switch must be voluntary or involuntary"));
>>>>>    KASSERT(newtd != curthread, ("mi_switch: preempting back to ourself"));
>>>>>
>>>>> +  td->td_owepreempt = 0;
>>>>> +
>>>>>    /*
>>>>>     * Don't perform context switches from the debugger.
>>>>>     */
>>>>> Index: sys/kern/sched_4bsd.c
>>>>> ===================================================================
>>>>> --- sys/kern/sched_4bsd.c  (revision 227608)
>>>>> +++ sys/kern/sched_4bsd.c  (working copy)
>>>>> _at__at_ -940,7 +940,6 _at__at_ sched_switch(struct thread *td, struct thread *new
>>>>>    td->td_lastcpu = td->td_oncpu;
>>>>>    if (!(flags & SW_PREEMPT))
>>>>>            td->td_flags &= ~TDF_NEEDRESCHED;
>>>>> -  td->td_owepreempt = 0;
>>>>>    td->td_oncpu = NOCPU;
>>>>>
>>>>>    /*
>>>>>
>>>>> Does anybody see any potential problems with such a change?
>>>>
>>>> Hmm, does this mean the preemption will be lost if you break into the
>>>> debugger and continue in the non-panic case?
>>>
>>> Not sure which exact scenario you have in mind.
>>> Please note that the above diff just moves resetting of td_owepreempt to an
>>> earlier place.  As far as I can see there are no checks of td_owepreempt value
>>> between the new place and the old places.
>>
>> I'm worried that you are clearing td_owepreempt even in cases where a context
>> switch is not performed.  So say you enter DDB with td_owepreempt set and that
>> DDB bails on a context switch.  With your change it will now clear td_owepreempt
>> and "lose" the preemption.
>>
>
> And without the change we get the recursion and double-fault because of
> kdb_switch -> thread_unlock ->  spinlock_exit ->  critical_exit ->
> mi_switch in this case ?
>
> BTW, it is my opinion that we really should not let the debugger code call
> mi_switch for any reason.

Yes, I agree with this, this is why the sched_bind() in boot() is
broken (immagine calling things like doadump from KDB. KDB right now
can be thought as a first cut of this patch because it does disable
the CPUs when entering the context, thus, the bug here is that if you
stop all CPUs including CPU0 and later on you want bind on it you are
death).

We need to discuss this and the patch more extensively, I'm taking my
time and hopefully will do a full review during the weekend.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Thu Nov 17 2011 - 19:54:32 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:20 UTC