[Patch] panics/hangs with preemption and threads.

From: Julian Elischer <julian_at_elischer.org>
Date: Sat, 11 Sep 2004 23:39:37 -0700
Guys, I think I found a (the?) major cause of the corruption of the
ksegrp/thread run queue for threaded processes when preemption is turned on.

When a thread is scheduled in setrunqueue(), the first thing that is done
is that it is put in the correct place in the ksegrp's run queue.
Then, if it is in the top N spots (where N is the defined concurrency
and is usually <= NCPU), it is passed down to the system scheduler
using sched_add().
sched_add() can call maybe_preempt(), which can decide to switch out the
current thread and switch to the new one immediately.
The trouble with that is that we have already put the new one on the ksegrp's
run queue! When that thread is next put on the run queue using setrunqueue(),
it is already there, and we end up with an infinitely looping run queue.
Any code that follows that list will never end, and the system will freeze.
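
To make the failure mode concrete, here is a toy sketch (not the kernel
code) of what happens when an element that is already the tail of a linked
queue is appended a second time. All of the toy_* names are made up for
illustration, and the real ksegrp queue is a priority-ordered TAILQ, so this
only shows the simplest case of the problem.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a thread and a ksegrp run queue. */
struct toy_thread {
	struct toy_thread *next;
	int id;
};

struct toy_runq {
	struct toy_thread *head;
	struct toy_thread *tail;
};

/* Append td without checking whether it is already on the queue. */
void
toy_runq_append(struct toy_runq *rq, struct toy_thread *td)
{
	td->next = NULL;
	if (rq->tail != NULL)
		rq->tail->next = td;	/* if tail == td, td now points at itself */
	else
		rq->head = td;
	rq->tail = td;
}

/*
 * Follow at most 'limit' links. Returns 1 if the end of the list was
 * reached, 0 if we gave up (the list most likely loops).
 */
int
toy_runq_terminates(const struct toy_runq *rq, int limit)
{
	const struct toy_thread *td = rq->head;

	while (td != NULL && limit-- > 0)
		td = td->next;
	return (td == NULL);
}
```

Appending the same toy_thread twice in a row leaves its next pointer
pointing at itself, so a walker without the 'limit' guard would spin forever.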

Here is a patch that solves it, but I'm not happy about it.
John, you wrote the preemption code.
Do you have any ideas about how to do this more cleanly?

One possibility is to make sched_add() return a value that indicates whether the
thread was handled immediately. That would allow setrunqueue() to put it on the
ksegrp's run queue only if it was not already handled.
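
A rough sketch of that alternative, as a toy model only: the sk_* names, the
structures, and the priority test standing in for maybe_preempt() are all
assumptions for illustration and do not match the real kernel interfaces.

```c
#include <stddef.h>

/* Hypothetical, simplified thread and ksegrp. */
struct sk_thread {
	int prio;	/* lower value = more urgent */
	int queued;	/* 1 while on the group's run queue */
};

struct sk_group {
	int nqueued;	/* number of threads on the group's run queue */
};

/*
 * Stand-in for sched_add(): returns 1 if td would preempt the current
 * thread and run at once, 0 if the caller still has to queue it.
 */
int
sk_sched_add(const struct sk_thread *td, const struct sk_thread *cur,
    int nopreempt)
{
	if (!nopreempt && td->prio < cur->prio)
		return (1);	/* the real code would switch threads here */
	return (0);
}

/*
 * Stand-in for setrunqueue(): only touch the group's run queue when the
 * scheduler did not already handle the thread. Returns 1 if queued.
 */
int
sk_setrunqueue(struct sk_group *kg, struct sk_thread *td,
    const struct sk_thread *cur)
{
	if (sk_sched_add(td, cur, 0))
		return (0);	/* handled immediately, nothing to queue */
	td->queued = 1;
	kg->nqueued++;
	return (1);
}
```

The point of the ordering is that a thread that gets switched to right away
is never left behind on the group's queue, so a later setrunqueue() cannot
insert it a second time.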

Other suggestions welcome.




==== //depot/projects/nsched/sys/kern/kern_switch.c#21 - /home/julian/p4/nsched/sys/kern/kern_switch.c ====
@@ -396,5 +396,9 @@
 		return;
 	}
 
+	if (((flags & (SRQ_YIELDING|SRQ_OURSELF|SRQ_NOPREEMPT)) == 0) &&
+	    maybe_preempt(td))
+		return;
+
 	tda = kg->kg_last_assigned;
 	if ((kg->kg_avail_opennings <= 0) &&
@@ -453,7 +457,7 @@
 			kg->kg_last_assigned = td2;
 		}
 		kg->kg_avail_opennings--;
-		sched_add(td2, flags);
+		sched_add(td2, flags|SRQ_NOPREEMPT);
 	} else {
 		CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
 			td, td->td_ksegrp, td->td_proc->p_pid);
==== //depot/projects/nsched/sys/kern/sched_4bsd.c#48 - /home/julian/p4/nsched/sys/kern/sched_4bsd.c ====
@@ -1018,7 +1018,8 @@
 #endif
 
 		{
-			if (maybe_preempt(td))
+			if (((flags & SRQ_NOPREEMPT) == 0) &&
+			    maybe_preempt(td))
 				return;
 		}
 	}
==== //depot/projects/nsched/sys/kern/sched_ule.c#30 - /home/julian/p4/nsched/sys/kern/sched_ule.c ====
@@ -1662,13 +1662,13 @@
 
 	/* let jeff work out how to map the flags better */
 	/* I'm open to suggestions */
-	if (flags & SRQ_YIELDING)
+	if (flags & (SRQ_YIELDING|SRQ_NOPREEMPT)) {
 		/*
 		 * Preempting during switching can be bad JUJU
 		 * especially for KSE processes
 		 */
 		sched_add_internal(td, 0);
-	else
+	} else 
 		sched_add_internal(td, 1);
 }
 
==== //depot/projects/nsched/sys/sys/proc.h#29 - /home/julian/p4/nsched/sys/sys/proc.h ====
@@ -658,6 +658,7 @@
 #define SRQ_YIELDING	0x0001		/* we are yielding (from mi_switch) */
 #define SRQ_OURSELF	0x0002		/* it is ourself (from mi_switch) */
 #define SRQ_INTR	0x0004		/* it is probably urgent */
+#define SRQ_NOPREEMPT	0x0008		/* Just don't ok? */
 
 /* How values for thread_single(). */
 #define	SINGLE_NO_EXIT	0
Received on Sun Sep 12 2004 - 04:39:45 UTC