Re: Native preemption is the culprit [was Re: today's CURRENT lockups]

From: Taku YAMAMOTO <taku@tackymt.homeip.net>
Date: Thu, 8 Jul 2004 22:21:43 +0900
Greetings,


A quick glance showed me that there are some interesting code paths in
sched_ule.c that can be problematic in the SMP case.

  1. sched_choose() => kseq_idled() => sched_add()
  2. sched_choose() => kseq_assign() => sched_add()
  3. sched_runnable() => kseq_assign() => sched_add()

Here is a patch that re-enables preemption except in the above three
cases.

--- sched_ule.c.orig	Tue Jul  6 14:57:29 2004
+++ sched_ule.c	Thu Jul  8 06:37:30 2004
@@ -286,6 +286,7 @@
 static void sched_balance_groups(void);
 static void sched_balance_group(struct kseq_group *ksg);
 static void sched_balance_pair(struct kseq *high, struct kseq *low);
+static void sched_add_internal(struct thread *td, int preemptive);
 static void kseq_move(struct kseq *from, int cpu);
 static int kseq_idled(struct kseq *kseq);
 static void kseq_notify(struct kse *ke, int cpu);
@@ -616,7 +617,7 @@
 			kseq_runq_rem(steal, ke);
 			kseq_load_rem(steal, ke);
 			ke->ke_cpu = PCPU_GET(cpuid);
-			sched_add(ke->ke_thread);
+			sched_add_internal(ke->ke_thread, 0);
 			return (0);
 		}
 	}
@@ -644,7 +645,7 @@
 	for (; ke != NULL; ke = nke) {
 		nke = ke->ke_assign;
 		ke->ke_flags &= ~KEF_ASSIGNED;
-		sched_add(ke->ke_thread);
+		sched_add_internal(ke->ke_thread, 0);
 	}
 }
 
@@ -1542,6 +1543,14 @@
 void
 sched_add(struct thread *td)
 {
+#ifdef SMP
+	sched_add_internal(td, 1);
+}
+
+static void
+sched_add_internal(struct thread *td, int preemptive)
+{
+#endif /* SMP */
 	struct kseq *kseq;
 	struct ksegrp *kg;
 	struct kse *ke;
@@ -1623,17 +1632,21 @@
         if (td->td_priority < curthread->td_priority)
                 curthread->td_flags |= TDF_NEEDRESCHED;
 
-#if 0
 #ifdef SMP
 	/*
 	 * Only try to preempt if the thread is unpinned or pinned to the
 	 * current CPU.
+	 * XXX - avoid preemption if called from sched_ule.c internally.
+	 * There are a few code paths that may be problematic:
+	 *     sched_choose() => kseq_idled() => sched_add()
+	 *     sched_choose() => kseq_assign() => sched_add()
+	 *     sched_runnable() => kseq_assign() => sched_add()
 	 */
-	if (KSE_CAN_MIGRATE(ke, class) || ke->ke_cpu == PCPU_GET(cpuid))
+	if (preemptive &&
+	    (KSE_CAN_MIGRATE(ke, class) || ke->ke_cpu == PCPU_GET(cpuid)))
 #endif
 	if (maybe_preempt(td))
 		return;
-#endif
 	ke->ke_ksegrp->kg_runq_kses++;
 	ke->ke_state = KES_ONRUNQ;
 

This patch was tested on a P4@2.8GHz HTT-enabled machine.

It has been running for several hours without a hang, although I have to
admit that this machine is mostly idle, far from being stressed.


On Tue, 6 Jul 2004 00:14:21 -0400 (EDT)
Robert Watson <rwatson@freebsd.org> wrote:
> 
> (This time to more people)
> 
> The patch below appears to (brute force) eliminate the crash/hang I'm
> experiencing with SCHED_ULE in the post-preemption universe.  However, I
> was experiencing it only in the SMP case, not UP, so it could be I'm just
> not triggering it timing-wise.  This would be a temporary fix until jhb is
> online again post-USENIX to take a look, assuming this works around the
> problem for people other than me.
> 
> Note that this is probably damaging to interrupt processing latency.
> 
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Principal Research Scientist, McAfee Research

(snip)

> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


-- 
-|-__    YAMAMOTO, Taku
 | __ <	    <taku@tackymt.homeip.net>

Post Scriptum to the people who know me as taku@cent.saitama-u.ac.jp:
	My email address changed in April, when I left the university.
Received on Thu Jul 08 2004 - 11:21:48 UTC