Re: smp_rendezvous runs with interrupts and preemption enabled on unicore systems

From: <mdf_at_FreeBSD.org>
Date: Fri, 28 Oct 2011 10:07:13 -0700
On Fri, Oct 28, 2011 at 8:37 AM, Ryan Stone <rysto32_at_gmail.com> wrote:
> I'm seeing issues on a unicore systems running a derivative of FreeBSD
> 8.2-RELEASE if something calls mem_range_attr_set.  It turns out that
> the root cause is a bug in smp_rendezvous_cpus.  The first part of
> smp_rendezvous_cpus attempts to short-circuit the non-SMP case(note
> that smp_started is never set to 1 on a unicore system):
>
>        if (!smp_started) {
>                if (setup_func != NULL)
>                        setup_func(arg);
>                if (action_func != NULL)
>                        action_func(arg);
>                if (teardown_func != NULL)
>                        teardown_func(arg);
>                return;
>        }
>
> The problem is that this runs with interrupts enabled, outside of a
> critical section.  My system runs with device_polling enabled with hz
> set to 2500, so its quite easy to wedge the system by having a thread
> run mem_range_attr_set.  That has to do a smp_rendezvous, and if a
> timer interrupt happens to go off half-way through the action_func and
> preempt this thread, the system ends up deadlocked(although once it's
> wedged, typing at the serial console stands a good chance of unwedging
> the system.  Go figure).
>
> I know that smp_rendezvous was reworked substantially on HEAD, but by
> inspection it looks like the bug is still present, as the
> short-circuit behaviour is still there.
>
> I am not entirely sure of the best way to fix this.  Is it as simple
> as doing a spinlock_enter before setup_func and a spinlock_exit after
> teardown_func?  It seems to boot fine, but I'm not at all confident
> that I understand the nuances of smp_rendezvous to be sure that there
> aren't side effects that I don't know about.

Looks like Max didn't have time to upstream this fix:

 struct mtx smp_ipi_mtx;
+MTX_SYSINIT(smp_ipi_mtx, &smp_ipi_mtx, "smp rendezvous", MTX_SPIN);

...

 static void
 mp_start(void *dummy)
 {

-	mtx_init(&smp_ipi_mtx, "smp rendezvous", NULL, MTX_SPIN);

...

 	if (!smp_started) {
+		mtx_lock_spin(&smp_ipi_mtx);
 		if (setup_func != NULL)
 			setup_func(arg);
 		if (action_func != NULL)
 			action_func(arg);
 		if (teardown_func != NULL)
 			teardown_func(arg);
+		mtx_unlock_spin(&smp_ipi_mtx);
 		return;
 	}

Cheers,
matthew
Received on Fri Oct 28 2011 - 15:33:29 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:19 UTC