on 17/05/2011 14:56 John Baldwin said the following:
> On 5/17/11 4:03 AM, Andriy Gapon wrote:
>> Couldn't [Shouldn't] the whole:
>>
>>>>> /* Ensure we have up-to-date values. */
>>>>> atomic_add_acq_int(&smp_rv_waiters[0], 1);
>>>>> while (smp_rv_waiters[0] < smp_rv_ncpus)
>>>>>         cpu_spinwait();
>>
>> be just replaced with:
>>
>> rmb();
>>
>> Or a proper MI function that does just a read memory barrier, if rmb() is
>> not that.
>
> No, you could replace it with:
>
> atomic_add_acq_int(&smp_rv_waiters[0], 1);

What about

(void)atomic_load_acq(&smp_rv_waiters[0]);

In my opinion that should ensure that the hardware posts the latest value of
smp_rv_waiters[0] from the master CPU to memory, and that a slave CPU then
fetches it from there.  And also, because of the memory barriers inserted by
store_rel on the master CPU and load_acq on the slave CPU, the latest values
of all the other smp_rv_* fields should become visible to the slave CPU.

> The key being that atomic_add_acq_int() will block (either in hardware or
> software) until it can safely perform the atomic operation.  That means
> waiting until the write to set smp_rv_waiters[0] to 0 by the rendezvous
> initiator is visible to the current CPU.
>
> On some platforms a write by one CPU may not post instantly to other CPUs
> (e.g. it may sit in a store buffer).  That is fine so long as an attempt to
> update that value atomically (using cas or a conditional-store, etc.) fails.
> For those platforms, the atomic(9) API is required to spin until it succeeds.
>
> This is why the mtx code spins if it can't set MTX_CONTESTED for example.

Thank you for the great explanation!

Taking sparc64 as an example, I think that atomic_load_acq uses a degenerate
cas call, which should take care of hardware synchronization.

-- 
Andriy Gapon
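
P.S.  To make the pairing concrete, here is a small, untested userland sketch
that uses C11 atomics to model what I understand the store_rel/load_acq pair
guarantees.  The names (smp_rv_arg, smp_rv_waiters0) are only stand-ins for
illustration, not the real smp_rendezvous fields or the kernel atomic(9) API:

#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

static int smp_rv_arg;			/* plain field set up by the master */
static atomic_int smp_rv_waiters0;	/* stand-in for smp_rv_waiters[0]   */

static int
master(void *arg)
{
	(void)arg;
	smp_rv_arg = 42;			/* ordinary store            */
	atomic_store_explicit(&smp_rv_waiters0, 1,
	    memory_order_release);		/* models store_rel          */
	return (0);
}

static int
slave(void *arg)
{
	(void)arg;
	/* Models load_acq: spin until the release store is observed. */
	while (atomic_load_explicit(&smp_rv_waiters0,
	    memory_order_acquire) < 1)
		;				/* cpu_spinwait()            */
	/* The acquire/release pairing guarantees this prints 42, not 0. */
	printf("slave sees smp_rv_arg = %d\n", smp_rv_arg);
	return (0);
}

int
main(void)
{
	thrd_t m, s;

	thrd_create(&s, slave, NULL);
	thrd_create(&m, master, NULL);
	thrd_join(m, NULL);
	thrd_join(s, NULL);
	return (0);
}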