Guys, I think I found a (the?) major cause of the corruption of the ksegrp/thread run queue for threaded processes when preemption is turned on.

When a thread is scheduled in setrunqueue(), the first thing that is done is that it is put in the correct place in the ksegrp's run queue. Then, if it is in the top N spots (where N is the defined concurrency and is usually <= NCPU), it is passed down to the system scheduler using sched_add(). sched_add() can call maybe_preempt(), which can decide to switch out the current thread and switch to the new one immediately.

The trouble with that is that we have already put the new one on the ksegrp's run queue! When that thread is next put on the run queue using setrunqueue() it is already there, and we end up with an infinitely looping run queue. Any code that follows that list will never end, and the system will freeze.

Here is a patch that solves it, but I'm not happy about it. John, you wrote the preemption code. Do you have any ideas about how to do this more cleanly?

One possibility is to make sched_add() return a value that indicates whether the thread was handled immediately. That would allow setrunqueue() to put it into the ksegrp's run queue only if it was not already handled. Other suggestions welcome.
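To make the failure mode and the "return a value" idea concrete, here is a rough userland sketch. It is only a toy model: every toy_* name is hypothetical, the queue is a bare singly linked list rather than the kernel's run queue, and the priority comparison merely stands in for whatever maybe_preempt() really decides. It is not the kernel code and not the patch that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real struct thread / struct ksegrp. */
struct toy_thread {
	struct toy_thread *next;	/* link in the toy group run queue */
	int prio;			/* lower value = more urgent */
};

struct toy_runq {
	struct toy_thread *head;
	struct toy_thread *tail;
};

/* Append without checking membership, as a raw queue insert would. */
static void
toy_runq_append(struct toy_runq *rq, struct toy_thread *td)
{
	assert(td != NULL);
	td->next = NULL;
	if (rq->tail != NULL)
		rq->tail->next = td;	/* if tail == td this links td to itself */
	else
		rq->head = td;
	rq->tail = td;
}

/* Walk the queue; return -1 if the walk has not ended after 'limit' steps. */
static int
toy_runq_length(const struct toy_runq *rq, int limit)
{
	int n = 0;

	for (const struct toy_thread *t = rq->head; t != NULL; t = t->next)
		if (++n > limit)
			return (-1);
	return (n);
}

/*
 * Assumed sched_add() variant that reports whether the thread was
 * switched to immediately. The comparison is a placeholder for
 * maybe_preempt().
 */
static bool
toy_sched_add(const struct toy_thread *td, int curprio)
{
	return (td->prio < curprio);
}

/* Problem ordering: queue first, then let the scheduler preempt. */
static void
toy_setrunqueue_buggy(struct toy_runq *rq, struct toy_thread *td, int curprio)
{
	toy_runq_append(rq, td);
	(void)toy_sched_add(td, curprio);	/* td may run while still queued */
}

/* Suggested ordering: queue only if the thread was not handled already. */
static void
toy_setrunqueue_fixed(struct toy_runq *rq, struct toy_thread *td, int curprio)
{
	if (toy_sched_add(td, curprio))
		return;
	toy_runq_append(rq, td);
}
```

In this model, calling toy_setrunqueue_buggy() twice for the same preempting thread leaves it linked to itself, so toy_runq_length() hits its step limit; the toy_setrunqueue_fixed() ordering leaves the queue empty in the same scenario.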
==== //depot/projects/nsched/sys/kern/kern_switch.c#21 - /home/julian/p4/nsched/sys/kern/kern_switch.c ====
@@ -396,5 +396,9 @@
 		return;
 	}
 
+	if (((flags & (SRQ_YIELDING|SRQ_OURSELF|SRQ_NOPREEMPT)) == 0) &&
+	    maybe_preempt(td))
+		return;
+
 	tda = kg->kg_last_assigned;
 	if ((kg->kg_avail_opennings <= 0) &&
@@ -453,7 +457,7 @@
 			kg->kg_last_assigned = td2;
 		}
 		kg->kg_avail_opennings--;
-		sched_add(td2, flags);
+		sched_add(td2, flags|SRQ_NOPREEMPT);
 	} else {
 		CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
 			td, td->td_ksegrp, td->td_proc->p_pid);
==== //depot/projects/nsched/sys/kern/sched_4bsd.c#48 - /home/julian/p4/nsched/sys/kern/sched_4bsd.c ====
@@ -1018,7 +1018,8 @@
 #endif
 	{
-		if (maybe_preempt(td))
+		if (((flags & SRQ_NOPREEMPT) == 0) &&
+		    maybe_preempt(td))
 			return;
 	}
 	}
==== //depot/projects/nsched/sys/kern/sched_ule.c#30 - /home/julian/p4/nsched/sys/kern/sched_ule.c ====
@@ -1662,13 +1662,13 @@
 	/* let jeff work out how to map the flags better */
 	/* I'm open to suggestions */
-	if (flags & SRQ_YIELDING)
+	if (flags & (SRQ_YIELDING|SRQ_NOPREEMPT)) {
 		/*
 		 * Preempting during switching can be bad JUJU
 		 * especially for KSE processes
 		 */
 		sched_add_internal(td, 0);
-	else
+	} else
 		sched_add_internal(td, 1);
 }
==== //depot/projects/nsched/sys/sys/proc.h#29 - /home/julian/p4/nsched/sys/sys/proc.h ====
@@ -658,6 +658,7 @@
 #define	SRQ_YIELDING	0x0001	/* we are yielding (from mi_switch) */
 #define	SRQ_OURSELF	0x0002	/* it is ourself (from mi_switch) */
 #define	SRQ_INTR	0x0004	/* it is probably urgent */
+#define	SRQ_NOPREEMPT	0x0008	/* Just don't ok? */
 
 /* How values for thread_single(). */
 #define	SINGLE_NO_EXIT	0

Received on Sun Sep 12 2004 - 04:39:45 UTC