Re: smp_rendezvous_action: Are atomics correctly used ?

From: Alexandre Martins <alexandre.martins_at_stormshield.eu>
Date: Thu, 09 Mar 2017 15:54:01 +0100
On Thursday, 9 March 2017, 16:25:17, Konstantin Belousov wrote:
> On Thu, Mar 09, 2017 at 02:52:09PM +0100, Alexandre Martins wrote:
> > On Thursday, 9 March 2017, 15:07:54, Konstantin Belousov wrote:
> > > On Thu, Mar 09, 2017 at 10:59:27AM +0100, Alexandre Martins wrote:
> > > > I have the same question about cpu_ipi_pending here:
> > > > 
> > > > https://svnweb.freebsd.org/base/head/sys/x86/x86/mp_x86.c?view=annotate#l1080
> > > > 
> > > > On Thursday, 9 March 2017, 10:43:14, Alexandre Martins wrote:
> > > > > Hello,
> > > > > 
> > > > > I'm currently reading the code of the function smp_rendezvous_action,
> > > > > in the kern/subr_smp.c file. In that function, I see that the variable
> > > > > smp_rv_waiters is read in several while() loops in a non-atomic way.
> > > > > 
> > > > > https://svnweb.freebsd.org/base/head/sys/kern/subr_smp.c?view=annotate#l412
> > > > > https://svnweb.freebsd.org/base/head/sys/kern/subr_smp.c?view=annotate#l458
> > > > > https://svnweb.freebsd.org/base/head/sys/kern/subr_smp.c?view=annotate#l472
> > > > > 
> > > > > I suspect that one of my freezes is caused by that.
> > > 
> > > You should provide either evidence or, at least, some reasoning
> > > supporting your claims.
> > 
> > I currently have a software watchdog that triggers and produces a
> > coredump. In the coredumps, I always see a CPU trying to write-lock an
> > "rm lock". Every time, that CPU is spinning in smp_rendezvous_action (in
> > the first while loop) while the others are in the idle threads.
> > 
> > The cause of that freeze is not clear, so I have started to look for
> > "exotic" causes to explain it.
> 
> This sounds like the 'usual' deadlock, where some other thread owns the
> rmlock in read mode.  I recommend that you follow
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

As usual, with these options enabled, it never happens in our test
environment. But at customers' sites, in production, ... :-D

The only thing I have is the coredump. In it, the rm_lock seems free of
readers/writers. There is nothing in pcpu->pc_rm_queue (on any CPU) and
nothing in rm->rm_activeReaders.
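
To make the original question concrete, here is a minimal user-space
sketch in C11 of the pattern being asked about: each participant bumps a
shared counter, then polls it until everyone has arrived. This is not the
kernel code. The struct and the rv_* names are hypothetical, and the
sketch uses explicit atomic loads purely to show what an "atomic read" of
such a counter looks like; it says nothing about whether the plain reads
in smp_rendezvous_action are actually wrong.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared rendezvous state (not the kernel's). */
struct rendezvous {
	atomic_int waiters;	/* number of participants that have arrived */
	int ncpus;		/* number of participants expected */
};

static void
rv_init(struct rendezvous *rv, int ncpus)
{
	atomic_init(&rv->waiters, 0);
	rv->ncpus = ncpus;
}

/* Announce arrival; release ordering publishes this participant's earlier writes. */
static void
rv_arrive(struct rendezvous *rv)
{
	atomic_fetch_add_explicit(&rv->waiters, 1, memory_order_release);
}

/* One poll of the counter, done as an explicit atomic acquire load. */
static bool
rv_all_arrived(struct rendezvous *rv)
{
	return (atomic_load_explicit(&rv->waiters, memory_order_acquire) >=
	    rv->ncpus);
}

/* Spin until every expected participant has arrived. */
static void
rv_wait(struct rendezvous *rv)
{
	while (!rv_all_arrived(rv))
		;	/* a kernel would place a spin-wait hint here */
}
```

In this sketch a participant would call rv_arrive() and then rv_wait(). If
one participant never arrives, the others spin forever in rv_wait(), which
is the kind of symptom seen in the coredump, whatever the real cause is.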

Thank you. It's really nice of you to try to help me!
-- 
Alexandre Martins
STORMSHIELD


Received on Thu Mar 09 2017 - 13:52:29 UTC
