On Wed, Sep 1, 2010 at 6:49 AM, John Baldwin <jhb@freebsd.org> wrote:
> On Tuesday, August 31, 2010 2:53:12 pm mdf@freebsd.org wrote:
>> On Tue, Aug 31, 2010 at 10:16 AM, <mdf@freebsd.org> wrote:
>> > I recorded the stack any time ts->ts_cpu was set, and when a thread
>> > was migrated by sched_switch() I printed out the recorded info.
>> > Here's what I found:
>> >
>> > XXX bug 67957: moving 0xffffff003ff9b800 from 3 to 1
>> > [1]: pin 0 state 4 move 3 -> 1 done by 0xffffff000cc44000:
>> > #0 0xffffffff802b36b4 at bug67957+0x84
>> > #1 0xffffffff802b5dd4 at sched_affinity+0xd4
>> > #2 0xffffffff8024a707 at cpuset_setthread+0x137
>> > #3 0xffffffff8024aeae at cpuset_setaffinity+0x21e
>> > #4 0xffffffff804a82df at freebsd32_cpuset_setaffinity+0x4f
>> > #5 0xffffffff80295f49 at isi_syscall+0x99
>> > #6 0xffffffff804a630e at ia32_syscall+0x1ce
>> > #7 0xffffffff8046dc60 at Xint0x80_syscall+0x60
>> > [0]: pin 0 state 2 move 0 -> 3 done by 0xffffff000cc44000:
>> > #0 0xffffffff802b36b4 at bug67957+0x84
>> > #1 0xffffffff802b4ad8 at sched_add+0xe8
>> > #2 0xffffffff8029b96a at create_thread+0x34a
>> > #3 0xffffffff8029badc at kern_thr_new+0x8c
>> > #4 0xffffffff804a8912 at freebsd32_thr_new+0x122
>> > #5 0xffffffff80295f49 at isi_syscall+0x99
>> > #6 0xffffffff804a630e at ia32_syscall+0x1ce
>> > #7 0xffffffff8046dc60 at Xint0x80_syscall+0x60
>> >
>> > So one thread in the process called cpuset_setaffinity(2), and
>> > another thread in the process was forcibly migrated by the IPI while
>> > returning from a syscall, while it had td_pinned set.
>> >
>> > Given this path, it seems reasonable to me to skip the migration if
>> > we notice THREAD_CAN_MIGRATE is false.
>> >
>> > Opinions? My debug code is below. I'll try to write a short testcase
>> > that exhibits this bug.
>>
>> Just a few more thoughts on this. The check in sched_affinity() for
>> THREAD_CAN_MIGRATE is racy. Since WITNESS uses sched_pin(), it's not
>> simple to take the thread lock around an increment of td_pinned. So
>> I'm looking for suggestions on the best way to fix this issue. My
>> thoughts:
>>
>> 1) Add a check in sched_switch() for THREAD_CAN_MIGRATE.
>> 2) Have WITNESS not use sched_pin(), and take the thread lock when
>>    modifying td_pinned.
>> 3) Have the IPI_PREEMPT handler notice that the thread is pinned (and
>>    not trying to bind) and postpone the mi_switch(), just as it
>>    postpones when a thread is in a critical section.
>>
>> Except for the potential complexity of implementation, I think (2) is
>> the best solution.
>
> I don't like 2) only because you'd really need to fix all the other
> places that use sched_pin() to grab the thread lock. That would be a
> bit expensive. I think the problem is code that checks
> THREAD_CAN_MIGRATE() for running threads that aren't curthread.
>
> The sched_affinity() bug is probably my fault, FWIW. I think something
> along the lines of 1) is the best approach. Maybe something like the
> patch below. It moves the CPU assignment out of sched_affinity() and
> into sched_switch() instead, so that it takes effect on the next
> context switch for this thread. That is the point at which the CPU
> binding actually takes effect anyway, since sched_affinity() doesn't
> force an immediate context switch. This patch is relative to 7. The
> patch for 9 is nearly identical (just a fixup for ipi_cpu() vs
> ipi_selected()).
>
> Index: sched_ule.c
> ===================================================================
> --- sched_ule.c	(revision 212065)
> +++ sched_ule.c	(working copy)
> @@ -1885,6 +1885,8 @@
>  		srqflag = (flags & SW_PREEMPT) ?
>  		    SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
>  		    SRQ_OURSELF|SRQ_YIELDING;
> +		if (THREAD_CAN_MIGRATE(td) && !THREAD_CAN_SCHED(td, ts->ts_cpu))
> +			ts->ts_cpu = sched_pickcpu(td, 0);
>  		if (ts->ts_cpu == cpuid)
>  			tdq_add(tdq, td, srqflag);
>  		else
> @@ -2536,7 +2538,6 @@
>  {
>  #ifdef SMP
>  	struct td_sched *ts;
> -	int cpu;
>
>  	THREAD_LOCK_ASSERT(td, MA_OWNED);
>  	ts = td->td_sched;
> @@ -2550,17 +2551,13 @@
>  	if (!TD_IS_RUNNING(td))
>  		return;
>  	td->td_flags |= TDF_NEEDRESCHED;
> -	if (!THREAD_CAN_MIGRATE(td))
> -		return;
>  	/*
> -	 * Assign the new cpu and force a switch before returning to
> -	 * userspace. If the target thread is not running locally send
> -	 * an ipi to force the issue.
> +	 * Force a switch before returning to userspace. If the
> +	 * target thread is not running locally send an ipi to force
> +	 * the issue.
>  	 */
> -	cpu = ts->ts_cpu;
> -	ts->ts_cpu = sched_pickcpu(td, 0);
> -	if (cpu != PCPU_GET(cpuid))
> -		ipi_selected(1 << cpu, IPI_PREEMPT);
> +	if (td != curthread)
> +		ipi_selected(1 << ts->ts_cpu, IPI_PREEMPT);
>  #endif
>  }
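
[For reference, the two predicates the patch relies on are macros in
sys/kern/sched_ule.c. The sketch below paraphrases them from FreeBSD
sources of that era; exact definitions may differ between branches:

    /*
     * A thread may migrate only while it holds no sched_pin()
     * references; sched_pin() increments curthread->td_pinned.
     */
    #define THREAD_CAN_MIGRATE(td)	((td)->td_pinned == 0)

    /*
     * A CPU is eligible for this thread only if it is present in
     * the thread's cpuset mask, as set by cpuset_setaffinity(2).
     */
    #define THREAD_CAN_SCHED(td, cpu)	\
	CPU_ISSET((cpu), &(td)->td_cpuset->cs_mask)

With these, the new sched_switch() test reads: if the thread is not
pinned but its recorded ts_cpu has fallen out of its affinity mask,
pick a fresh CPU before requeueing it.]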
I will test this patch out; thanks for the help! Two questions:

1) How does a thread get moved between CPUs when it's not running? I
see that we change the runqueue for non-running threads that are on a
runqueue. Does the code always check THREAD_CAN_SCHED when adding a
sleeping thread to a runqueue on wakeup?

2) I assume sched_switch() runs a lot more often than sched_pin() or
sched_affinity(), so it would make more sense to take any performance
penalty of increased work in either of those places rather than in the
scheduler itself. I suppose the two extra memory references for
THREAD_CAN_MIGRATE and THREAD_CAN_SCHED won't be that expensive.

Thanks,
matthew
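
[A minimal sketch of the short testcase mdf mentions above, hypothetical
and untested: one thread rotates the process affinity mask across CPUs
via cpuset_setaffinity(2) while worker threads churn through syscalls,
widening the window in which a transiently pinned thread (e.g. inside
one of WITNESS's sched_pin() sections) catches the migration IPI on its
way back to userspace. The worker and CPU count of 4 is an assumption:

    /*
     * Hypothetical stress test for the sched_affinity() migration
     * race.  Untested sketch; assumes a machine with >= 4 CPUs.
     */
    #include <sys/param.h>
    #include <sys/cpuset.h>

    #include <pthread.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define NWORKERS 4

    static void *
    churn(void *arg __unused)
    {

    	for (;;)
    		(void)getpid();		/* any cheap syscall will do */
    	return (NULL);
    }

    int
    main(void)
    {
    	pthread_t workers[NWORKERS];
    	cpuset_t mask;
    	int cpu, i;

    	for (i = 0; i < NWORKERS; i++)
    		if (pthread_create(&workers[i], NULL, churn, NULL) != 0)
    			exit(1);

    	/* Forever rebind the whole process to a single, rotating CPU. */
    	for (cpu = 0;; cpu = (cpu + 1) % NWORKERS) {
    		CPU_ZERO(&mask);
    		CPU_SET(cpu, &mask);
    		if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID,
    		    -1, sizeof(mask), &mask) != 0)
    			exit(1);
    	}
    	/* NOTREACHED */
    }

Build with something like "cc -o race race.c -lpthread" (the file name
is illustrative). Running it as a 32-bit binary on amd64 would exercise
the freebsd32 path seen in the stack traces above.]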
Received on Wed Sep 01 2010 - 14:54:15 UTC