Re: smp_rendezvous_action: Are atomics correctly used ?

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Fri, 10 Mar 2017 16:46:26 +0200
On Fri, Mar 10, 2017 at 03:30:21PM +0100, Alexandre Martins wrote:
> Le vendredi 10 mars 2017, 15:57:16 Konstantin Belousov a ?crit :
> > On Fri, Mar 10, 2017 at 02:24:52PM +0100, Alexandre Martins wrote:
> > > Le jeudi 9 mars 2017, 16:25:17 Konstantin Belousov a ?crit :
> > > > On Thu, Mar 09, 2017 at 02:52:09PM +0100, Alexandre Martins wrote:
> > > > > Le jeudi 9 mars 2017, 15:07:54 Konstantin Belousov a ?crit :
> > > > > > On Thu, Mar 09, 2017 at 10:59:27AM +0100, Alexandre Martins wrote:
> > > > > > > I have the save question for the cpu_ipi_pending here:
> > > > > > > 
> > > > > > > https://svnweb.freebsd.org/base/head/sys/x86/x86/mp_x86.c?view=ann
> > > > > > > otat
> > > > > > > e#l1
> > > > > > > 080>
> > > > > > > 
> > > > > > > Le jeudi 9 mars 2017, 10:43:14 Alexandre Martins a ?crit :
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > I'm curently reading the code of the function
> > > > > > > > smp_rendezvous_action,
> > > > > > > > in
> > > > > > > > kern/subr_smp.c file. In that function, i see that the variable
> > > > > > > > smp_rv_waiters is read in some while() loop in a non-atomic way.
> > > > > > > > 
> > > > > > > > https://svnweb.freebsd.org/base/head/sys/kern/subr_smp.c?view=an
> > > > > > > > nota
> > > > > > > > te#l
> > > > > > > > 412
> > > > > > > > https://svnweb.freebsd.org/base/head/sys/kern/subr_smp.c?view=an
> > > > > > > > nota
> > > > > > > > te#l
> > > > > > > > 458
> > > > > > > > https://svnweb.freebsd.org/base/head/sys/kern/subr_smp.c?view=an
> > > > > > > > nota
> > > > > > > > te#l
> > > > > > > > 472
> > > > > > > > 
> > > > > > > > I suspect one of my freeze to be due by that.
> > > > > > 
> > > > > > You should provide either evidence or, at least, some reasoning
> > > > > > supporting
> > > > > > your claims.
> > > > > 
> > > > > I curently have a software watchdog that triger and does a coredump.
> > > > > In
> > > > > the
> > > > > coredumps, I always see a CPU trying to write-lock a "rm lock". Every
> > > > > time,
> > > > > that CPU is spinning into the smp_rendezvous_action, in the first
> > > > > while
> > > > > loop) while the others are into the idle threads.
> > > > > 
> > > > > The fact is that freeze is not clear and I start to search "exotic"
> > > > > causes
> > > > > to explain it.
> > > > 
> > > > This sounds as the 'usual' deadlock, where some other thread owns rmlock
> > > > in
> > > > read mode.  I recommend you to follow the
> > > > https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/ke
> > > > rnel debug-deadlocks.html
> > > 
> > > Just a last question, for my personnal knowledge.
> > > 
> > > In ARM >= 6, for atomic acces, the code should (?) use LDREX and STREX
> > > for, I quote : "Use LDREX and STREX to implement interprocess
> > > communication in multiple-processor and shared-memory systems." (see here
> > > :
> > > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489e/Cihbg
> > > hef.html
> > In my previous response to you, I explicitely defined what 'atomic'
> > means when adjected to the term 'load'. The *EX instructions are used on
> > ll/sc architectures to implement read/modify/write atomic operations,
> > which are different from load (read) operations.
> 
> Ok ! Because we just want to read the value, there is no need to use the *EX 
> version. *EX is intended to be use when a modification will be done thereafter.
> 
> > 
> > > But, in that while loop, it's a standard "LDR" that is used. Is it correct
> > > too, and why ?
> > 
> > Which 'that while loop' ?
> > 	while (atomic_load_acq_int(&smp_rv_waiters[3]) < ncpus)
> > 		cpu_spinwait();
> > This one ?
> 
> No, I point the one at line 412, 458 and 472:
> 
> 412: while (smp_rv_waiters[0] < smp_rv_ncpus)
>          cpu_spinwait();
> 
> 458: while (smp_rv_waiters[1] < smp_rv_ncpus)
>          cpu_spinwait();
> 
> 472: while (smp_rv_waiters[2] < smp_rv_ncpus)
>          cpu_spinwait();
> 
> > 
> > Because the semantic of the normal load + DMB barrier provides the expected
> > semantic of atomic_load_acq(), as explained in atomic(9) and utilized by
> > the author of the code.
> 
> So, the writer must use LDREX/STREX to modify the value and use dmb to make 
> visible to other CPU the write.
No, this is false statement on all/many counts.  ll/sc is only needed for
atomic modification, not for a write.  If you need to assign a given value
to the variable, STR instruction does just that.  LDREX/STREX provide a
way to ensure that a modification done atomically.  E.g., if your intent
is to add 1 to the word in memory, you need to ensure that the memory
is not modified, when writing out the modified read value.

Next, DMB does not 'make visible' the modification. DMB separates
externally visible effects of executed instructions before and after it.
>From the whole guarantees provided by this separation, atomic_load_acq()
only needs the effect of not allowing later memory accesses to occur
earlier than the DMB instruction was executed (the acquire semantic).
ARMv8 provides loads and stores with the reduced barriers to implement
_acq/_rel without excess overhead of full barrier.

DMB does not make any store instruction more effective than it already is.

> 
> The readers can read simply the value without the barrier because cache 
> coherancy protcol will update the value automaticaly.
Same is true for stores.  This is why plain loads and stores are atomic.

> 
> I think I finally got it !
> Thank you so much !
> 
> Best regards,
> 
> -- 
> Alexandre Martins
> STORMSHIELD
> 
Received on Fri Mar 10 2017 - 13:46:33 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:10 UTC