Re: [patch] i386 pmap sysmaps_pcpu[] atomic access

From: Svatopluk Kraus <onwahe_at_gmail.com>
Date: Mon, 18 Feb 2013 23:18:16 +0100

On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov
<kostikbel_at_gmail.com> wrote:
> On Mon, Feb 18, 2013 at 09:27:40PM +0100, Svatopluk Kraus wrote:
>> On Mon, Feb 18, 2013 at 6:09 PM, Konstantin Belousov
>> <kostikbel_at_gmail.com> wrote:
>> > On Mon, Feb 18, 2013 at 06:06:42PM +0100, Svatopluk Kraus wrote:
>> >> On Mon, Feb 18, 2013 at 4:08 PM, Konstantin Belousov
>> >> <kostikbel_at_gmail.com> wrote:
>> >> > On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote:
>> >> >> Hi,
>> >> >>
>> >> >>    the access to sysmaps_pcpu[] should be atomic with respect to
>> >> >> thread migration. Otherwise, a sysmaps for one CPU can be stolen by
>> >> >> another CPU and the purpose of per CPU sysmaps is broken. A patch is
>> >> >> enclosed.
>> >> > And what is the problem caused by the 'otherwise'?
>> >> > I do not see any.
>> >>
>> >> The 'otherwise' issue is the following:
>> >>
>> >> 1. A thread is running on CPU0.
>> >>
>> >>         sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)];
>> >>
>> >> 2. The sysmaps variable contains a pointer to 'CPU0' sysmaps.
>> >> 3. Now, the thread migrates to CPU1.
>> >> 4. However, the sysmaps variable still contains a pointer to 'CPU0' sysmaps.
>> >>
>> >>       mtx_lock(&sysmaps->lock);
>> >>
>> >> 5. The thread running on CPU1 has locked the 'CPU0' sysmaps mutex, so it
>> >> can needlessly block another thread running on CPU0. Maybe it's
>> >> not a problem. However, it definitely goes against the reason why the
>> >> sysmaps (one for each CPU) exist.
>> > So what ?
>>
>> It depends. You don't understand it or you think it's ok? Tell me.
>>
> Both. I do not understand your concern, and I think that the code is fine.

Well, I'm taking part in porting FreeBSD to ARM11mpcore. The UP case was
simple. The SMP case is more complex and rather new to me. Recently, I
was solving a problem with the PCPU stuff. For example, PCPU_GET is
implemented by one instruction on the i386 arch, so the need for atomicity
with respect to interrupts can be overlooked. On load-store archs, an
implementation which works in the SMP case is not so simple, and what
works in the UP case (a single PCPU) does not work in the SMP case. Believe me,
tracking down mysterious and sporadic 'mutex not owned' assertions and other
failures caused by a curthread mess takes a while ...
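
To illustrate what I mean by atomicity with respect to interrupts, here is
a small userspace model, not code from the ARM port. It assumes a per-CPU
field that is updated by a load, a modify and a store, and an interrupt
handler that touches the same field. All names (pcpu_counter,
model_interrupt, inc_load_store) are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU field; in this model there is just one CPU. */
static int pcpu_counter;

/* Model of an interrupt handler that also updates the per-CPU field. */
static void model_interrupt(void)
{
    pcpu_counter++;
}

/*
 * Load-store style increment: load, modify, store as separate steps.
 * interrupt_mid    - an interrupt arrives between the load and the store
 * interrupts_masked - the sequence runs with interrupts masked, so the
 *                     interrupt is delivered only after the store
 */
static void inc_load_store(bool interrupt_mid, bool interrupts_masked)
{
    bool pending = interrupt_mid;
    int tmp = pcpu_counter;             /* load */

    if (pending && !interrupts_masked) {
        model_interrupt();              /* handler runs in the window */
        pending = false;
    }
    tmp = tmp + 1;                      /* modify */
    pcpu_counter = tmp;                 /* store: overwrites handler's update */
    if (pending)
        model_interrupt();              /* masked interrupt delivered late */
}

/* Usage: two increments are expected, but the unmasked case loses one. */
void demo_lost_update(void)
{
    pcpu_counter = 0;
    inc_load_store(true, false);
    assert(pcpu_counter == 1);          /* one update was lost */

    pcpu_counter = 0;
    inc_load_store(true, true);
    assert(pcpu_counter == 2);          /* both updates survive */
}
```

With a single read-modify-write instruction, as on i386, the window between
the load and the store does not exist, which is why the issue is easy to
overlook there.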

After this, I took a look at how the PCPU stuff is used in the whole kernel
and found the i386 pmap issue mentioned here. So, my concerns are the
following:

1. to verify my newly gained ideas about how per-CPU data should be used,
2. to decide how to change my implementation of pmap on ARM11mpcore,
as it's based on i386 pmap,
3. to make FreeBSD code better.

Meanwhile, it looks like using data dedicated to one CPU on
another one is considered OK. However, I can't agree. At least, without comments,
it is misleading for anyone new to FreeBSD and makes the code unclear.

> Both threads in your description make useful progress, and computation
> proceeds correctly.

I thought there was only one thread in my example: one thread running
on CPU1, but holding the sysmaps dedicated to CPU0 instead of the
sysmaps dedicated to CPU1. So, any thread running on CPU0 must wait
because the thread running on CPU1 is a thief. Furthermore, the idea
that a thread on CPU1 should hold data for CPU1 is then not valid. So,
either a comment saying that this is OK is missing in the code, or the use
of PCPU_GET(cpuid) is unneeded and some kind of free sysmaps list could
be used instead and would serve better.
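
To make the scenario concrete, here is a small userspace model of it. It is
only a sketch and not the enclosed patch. The kernel has sched_pin() and
sched_unpin() for pinning; the names below (thread_model, model_pin,
model_unpin, model_migrate, lookup_then_lock) are hypothetical stand-ins
that only mimic the idea, and the migration is forced at the worst possible
point on purpose.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 2

/* Model of a per-CPU sysmaps slot; owner_cpu records which CPU it is for. */
struct sysmaps {
    int  owner_cpu;
    bool locked;
};

static struct sysmaps sysmaps_pcpu[NCPU] = { { 0, false }, { 1, false } };

/* Model of a thread: the CPU it runs on and its pin nesting count. */
struct thread_model {
    int cpu;
    int pin_count;
};

static void model_pin(struct thread_model *td)
{
    td->pin_count++;
}

static void model_unpin(struct thread_model *td)
{
    assert(td->pin_count > 0);
    td->pin_count--;
}

/* The scheduler model refuses to migrate a pinned thread. */
static bool model_migrate(struct thread_model *td, int new_cpu)
{
    if (td->pin_count > 0)
        return false;
    td->cpu = new_cpu;
    return true;
}

/*
 * Look up the per-CPU slot, then lock it, with a migration attempt in
 * between. Returns true if the locked slot belongs to the CPU the thread
 * is running on at the time the lock is taken.
 */
static bool lookup_then_lock(struct thread_model *td, bool pin)
{
    struct sysmaps *sysmaps;
    bool local;

    if (pin)
        model_pin(td);
    sysmaps = &sysmaps_pcpu[td->cpu];           /* PCPU_GET(cpuid) step */
    model_migrate(td, (td->cpu + 1) % NCPU);    /* worst-case preemption */
    sysmaps->locked = true;                     /* mtx_lock() step */
    local = (sysmaps->owner_cpu == td->cpu);
    sysmaps->locked = false;                    /* mtx_unlock() step */
    if (pin)
        model_unpin(td);
    return local;
}
```

In this model the unpinned thread ends up on CPU1 holding the CPU0 slot,
while the pinned thread stays on CPU0 and always gets its own slot.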

>>
>> >>
>> >>
>> >> > Really, taking the mutex while bound to a CPU could be deadlock-prone
>> >> > under some situations.
>> >> >
>> >> > This was discussed at least one more time. Maybe a comment saying that
>> >> > there is no issue should be added.
>> >>
>> >> I missed the discussion. Can you point me to it, please? A deadlock is
>> >> not a problem here. However, I may be wrong, as I can't imagine now how
>> >> simple pinning could lead to a deadlock at all.
>> > Because some other load on the bound CPU might prevent the thread from
>> > being scheduled.
>>
>> I'm afraid I still have no idea. On a single CPU, binding has no
>> meaning. Thus, if any deadlock exists, then it exists without binding too.
>> Hmm, are you talking about a deadlock caused by heavy CPU load? Is it
>> a deadlock at all? Anyhow, a mutex is a lock with priority propagation,
>> isn't it?
>>
>
> When executing on a single CPU, the kernel sometimes makes different decisions.
> Yes, the deadlock can be more precisely described as a livelock.
>
> It might not matter for exactly this case, but it is still useful
> to remember.