Re: Deadlocks with recent SMP current

From: Julian Elischer <julian_at_elischer.org>
Date: Sat, 14 Aug 2004 22:44:58 -0700
Jon Noack wrote:
> On 08/13/04 15:13, Scott Long wrote:
> 
>> Doug White wrote:
>>
>>> On Fri, 13 Aug 2004, Martin Blapp wrote:
>>>
>>>> Since yesterday I'm getting complete deadlocks. This time they are
>>>> unrelated to load: the servers are not loaded at all, they just freeze
>>>> after a while. No break into DDB is possible at all.
>>>
>>>
>>> Welcome to the club; I've been having them on my -current builder 
>>> since Aug 4. I'm going to set up a duplicate box and start 
>>> binary-searching for the offending commit(s).
>>>
>>> Preemption is the default, disabled.
>>>
>>> My box is a dual-600MHz P3 with 1GB RAM and running kde. A make -j3
>>> buildworld will lock it up 75% of the time. It'll survive a 
>>> nonparallel build, and it'll survive a kernel build.
>>>
>>> Haven't tried WITNESS+INVARIANTS yet since it really dogs the
>>> machine. :)
>>
>>
>> Can you try the patch below? It's really only a band-aid, but might 
>> make things usable for now. Also, are more lockups being seen under 
>> ULE or under 4BSD? There was a recent change to ULE (rev 1.120 of 
>> sched_ule.c) that seems to have aggravated the scheduler problems on 
>> my test systems.
>>
>> Scott
>>
>> Index: kern_switch.c
>> ===================================================================
>> RCS file: /usr/ncvs/src/sys/kern/kern_switch.c,v
>> retrieving revision 1.78
>> diff -u -r1.78 kern_switch.c
>> --- kern_switch.c       10 Aug 2004 00:26:25 -0000      1.78
>> +++ kern_switch.c       13 Aug 2004 20:11:27 -0000
>> @@ -345,6 +345,8 @@
>>                 return;
>>         }
>>
>> +       critical_enter();
>> +
>>         tda = kg->kg_last_assigned;
>>         if ((ke = td->td_kse) == NULL) {
>>                 if (kg->kg_idle_kses) {
>> @@ -441,6 +443,7 @@
>>                 CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
>>                         td, td->td_ksegrp, td->td_proc->p_pid);
>>         }
>> +       critical_exit();
>>  }
>>
>>  /*
> 
> 
> Here's a data point:
> My dual Pentium3 system has been up for 20+ hours with this patch. 
> Previously, it wouldn't survive for more than an hour or so (regardless 
> of load).
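
For readers following along, here is a rough user-space sketch of why wrapping the function body (apparently setrunqueue(), judging by the trace string) in critical_enter()/critical_exit() can mask the problem: a preemption requested inside a critical section is deferred until the outermost exit. This is a toy model only, not kernel code. The toy_* names and the fields of struct toy_thread are made up for illustration, and the real kernel bookkeeping may differ.

```c
#include <assert.h>

/* Toy stand-in for a thread; NOT the kernel's struct thread. */
struct toy_thread {
	int critnest;     /* nesting depth of critical sections */
	int owe_preempt;  /* a preemption was requested while nested */
	int switches;     /* count of simulated context switches */
};

/* Entering a critical section only bumps the nesting count. */
static void
toy_critical_enter(struct toy_thread *td)
{
	td->critnest++;
}

/* A preemption request is deferred while inside a critical section. */
static void
toy_preempt_request(struct toy_thread *td)
{
	if (td->critnest > 0) {
		td->owe_preempt = 1;  /* remember it for later */
		return;
	}
	td->switches++;               /* safe to switch right away */
}

/* Leaving the outermost section performs any deferred preemption. */
static void
toy_critical_exit(struct toy_thread *td)
{
	assert(td->critnest > 0);
	td->critnest--;
	if (td->critnest == 0 && td->owe_preempt) {
		td->owe_preempt = 0;
		td->switches++;
	}
}
```

In this model the run-queue manipulation between enter and exit can never be interrupted by a switch, which would be consistent with the patch being a band-aid: it hides the window rather than explaining why a switch there is harmful.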


Try the following change instead, in maybe_preempt() in kern_switch.c:

         ctd = curthread;
+        if ((ctd->td_kse == NULL) || (ctd->td_kse->ke_thread != ctd))
+               return (0);
         pri = td->td_priority;
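
As a self-contained illustration of what that guard tests, here is a minimal sketch with toy structures. It assumes the intent is to refuse preemption whenever the current thread has no KSE or its KSE is not currently pointing back at it. The toy_* types and the simplified priority comparison are assumptions for illustration; the real maybe_preempt() checks more conditions than this.

```c
#include <stddef.h>

struct toy_thread;

/* Toy stand-ins; NOT the kernel's struct kse / struct thread. */
struct toy_kse {
	struct toy_thread *ke_thread;  /* thread this KSE is bound to */
};

struct toy_thread {
	struct toy_kse *td_kse;        /* KSE assigned to this thread */
	int td_priority;               /* lower value = higher priority */
};

/*
 * Return 1 if the current thread (ctd) should be preempted by td,
 * 0 otherwise.
 */
static int
toy_maybe_preempt(const struct toy_thread *ctd, const struct toy_thread *td)
{
	/* The added guard: bail out unless ctd and its KSE agree. */
	if (ctd->td_kse == NULL || ctd->td_kse->ke_thread != ctd)
		return (0);
	/* Simplified remainder: preempt only for a better priority. */
	return (td->td_priority < ctd->td_priority);
}
```

With this guard, a thread whose td_kse is NULL, or whose KSE has already been handed to a different thread, is never chosen for preemption, regardless of priorities.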


> 
> Jon
Received on Sun Aug 15 2004 - 03:45:07 UTC