Re: proposed smp_rendezvous change

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Sun, 15 May 2011 10:10:13 +0300

on 14/05/2011 18:25 John Baldwin said the following:
> On 5/13/11 9:43 AM, Andriy Gapon wrote:
>>
>> This is a change in vein of what I've been doing in the xcpu branch and it's
>> supposed to fix the issue by the recent commit that (probably unintentionally)
>> stress-tests smp_rendezvous in TSC code.
>>
>> Non-essential changes:
>> - ditch initial, and in my opinion useless, pre-setup rendezvous in
>> smp_rendezvous_action()
> 
> As long as IPIs ensure all data is up to date (I think this is certainly true on
> x86) that is fine.  Presumably sending an IPI has an implicit store barrier on
> all other platforms as well?

Well, one certainly can use IPIs as a memory barrier, but my point was that we
have other ways to get a memory barrier, and using an IPI for that was not
necessary (and a little bit harmful to performance) in this case.

>> Essential changes (the fix):
>> - re-use freed smp_rv_waiters[2] to indicate that a slave/target is really fully
>> done with rendezvous (i.e. it's not going to access any members of smp_rv_*
>> pseudo-structure)
>> - spin on smp_rv_waiters[2] upon _entry_ to smp_rendezvous_cpus() to not re-use
>> the smp_rv_* pseudo-structure too early
> 
> Hmmm, so this is not actually sufficient.  NetApp ran into a very similar race
> with virtual CPUs in BHyVe.  In their case because virtual CPUs are threads that
> can be preempted, they have a chance at a longer race.

Just a quick question: have you noticed that, because of the change above, the
smp_rv_waiters[2] I spoke of is not the same smp_rv_waiters[2] as in the
original code?  Because I "removed" smp_rv_waiters[0], my smp_rv_waiters[2] is
effectively a new smp_rv_waiters[3].

And well, I think I described exactly the same scenario as you did in my email
on the svn mailing list.  So of course I had it in mind:
http://www.mail-archive.com/svn-src-all_at_freebsd.org/msg38637.html

My mistake: for clarity, I should not have mixed different changes into the same
patch.  I should have provided two patches: one that adds smp_rv_waiters[3]
and its handling, and one that "removes" smp_rv_waiters[0].
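
To make the idea concrete, here is a rough userland sketch of the "fully done"
counter.  It is not the patch itself and makes several assumptions: all rv_*
names are hypothetical stand-ins for the smp_rv_* pseudo-structure, detached
pthreads stand in for IPI targets, and there is only ever one initiator at a
time.  The point is only that the initiator spins on entry until every
participant of the previous round has made its last access to the shared state.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

#define RV_NCPU 4                   /* hypothetical number of participants */

/* Hypothetical stand-in for the shared smp_rv_* pseudo-structure. */
static void (*rv_action)(void *);
static void *rv_arg;
static atomic_int rv_ncpus;         /* participants in the current round */
static atomic_int rv_finished;      /* participants that ran the action */
static atomic_int rv_done;          /* participants that left the round */

/* Example action: count how many participants ran it. */
static void
rv_count(void *arg)
{
	atomic_fetch_add((atomic_int *)arg, 1);
}

/* Executed by every participant, including the initiator. */
static void
rv_participate(void)
{
	void (*action)(void *) = rv_action;
	void *arg = rv_arg;
	int ncpus = atomic_load(&rv_ncpus);

	action(arg);
	atomic_fetch_add(&rv_finished, 1);
	while (atomic_load(&rv_finished) < ncpus)
		sched_yield();
	/* Last access to the shared state: signal "really fully done". */
	atomic_fetch_add(&rv_done, 1);
}

static void *
rv_thread(void *unused)
{
	(void)unused;
	rv_participate();
	return (NULL);
}

static void
rv_rendezvous(void (*action)(void *), void *arg)
{
	pthread_t td;
	int i;

	/* On entry: do not re-use the state until the previous round is over. */
	while (atomic_load(&rv_done) < atomic_load(&rv_ncpus))
		sched_yield();

	rv_action = action;
	rv_arg = arg;
	atomic_store(&rv_finished, 0);
	atomic_store(&rv_done, 0);
	atomic_store(&rv_ncpus, RV_NCPU);

	/* Thread creation stands in for sending the IPI. */
	for (i = 1; i < RV_NCPU; i++) {
		if (pthread_create(&td, NULL, rv_thread, NULL) != 0)
			abort();
		pthread_detach(td);
	}
	rv_participate();
}
```

Note that rv_rendezvous() returns as soon as the initiator itself is done, so
slow targets may still be inside rv_participate() at that point.  The spin on
rv_done at the start of the next call is what keeps the next round from
overwriting state that those targets are still reading.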

I would like to see my proposed patch actually tested, if possible, before it's
dismissed :-)

-- 
Andriy Gapon
Received on Sun May 15 2011 - 05:10:19 UTC
