on 14/05/2011 18:25 John Baldwin said the following:
> On 5/13/11 9:43 AM, Andriy Gapon wrote:
>>
>> This is a change in vein of what I've been doing in the xcpu branch and it's
>> supposed to fix the issue by the recent commit that (probably unintentionally)
>> stress-tests smp_rendezvous in TSC code.
>>
>> Non-essential changes:
>> - ditch initial, and in my opinion useless, pre-setup rendezvous in
>>   smp_rendezvous_action()
>
> As long as IPIs ensure all data is up to date (I think this is certainly true on
> x86) that is fine. Presumably sending an IPI has an implicit store barrier on
> all other platforms as well?

Well, one certainly can use IPIs as a memory barrier, but my point was that we
have other ways to get a memory barrier, and using an IPI for that was not
necessary (and a little bit harmful to performance) in this case.

>> Essential changes (the fix):
>> - re-use freed smp_rv_waiters[2] to indicate that a slave/target is really fully
>>   done with rendezvous (i.e. it's not going to access any members of smp_rv_*
>>   pseudo-structure)
>> - spin on smp_rv_waiters[2] upon _entry_ to smp_rendezvous_cpus() to not re-use
>>   the smp_rv_* pseudo-structure too early
>
> Hmmm, so this is not actually sufficient. NetApp ran into a very similar race
> with virtual CPUs in BHyVe. In their case because virtual CPUs are threads that
> can be preempted, they have a chance at a longer race.

Just a quick question - have you noticed that, because of the change above, the
smp_rv_waiters[2] of which I spoke is not the same smp_rv_waiters[2] as in the
original code? Because I "removed" smp_rv_waiters[0], my smp_rv_waiters[2] is
actually what would be a new smp_rv_waiters[3] in the original numbering.

And, well, I think I described exactly the same scenario as you did in my email
on the svn mailing list, so of course I had it in mind:
http://www.mail-archive.com/svn-src-all_at_freebsd.org/msg38637.html

My mistake: for clarity, I should not have mixed different changes into the same
patch. I should have provided two patches: one that adds smp_rv_waiters[3] and
its handling, and one that "removes" smp_rv_waiters[0].

I would like to see my proposed patch actually tested, if possible, before it's
dismissed :-)

-- 
Andriy Gapon

Received on Sun May 15 2011 - 05:10:19 UTC
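
For readers following the thread, the "spin on entry" idea from the message above can be modelled in userspace roughly as follows. This is only a sketch and not the FreeBSD code or the proposed patch: threads stand in for CPUs, bumping rv_generation stands in for sending the IPI, the initiator does not run the callbacks itself, and every name (rv_done, rv_exited, run_rounds() and so on) is made up for illustration. The point it tries to show is that a target still reads the shared state after it has signalled completion, so the initiator waits for the "really done" counter before it overwrites that state for the next round.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPUS 4

/* Shared "pseudo-structure"; every name here is hypothetical. */
static void (*rv_action)(void *);
static void (*rv_teardown)(void *);
static void *rv_arg;
static atomic_int rv_generation;	/* bumped to model sending the IPI */
static atomic_int rv_done;		/* targets that finished the action */
static atomic_int rv_exited;		/* targets that will not touch rv_* again */
static atomic_int rv_stop;		/* tells the target threads to exit */

static void *
target(void *unused)
{
	void (*teardown)(void *);
	int seen = 0;

	(void)unused;
	for (;;) {
		/* Wait for the next "IPI" or for shutdown. */
		while (atomic_load_explicit(&rv_generation,
		    memory_order_acquire) == seen) {
			if (atomic_load(&rv_stop))
				return (NULL);
			sched_yield();
		}
		seen++;
		rv_action(rv_arg);
		/* The initiator may return once all targets get here... */
		atomic_fetch_add_explicit(&rv_done, 1, memory_order_release);
		/* ...but this target still reads the shared state. */
		teardown = rv_teardown;
		if (teardown != NULL)
			teardown(rv_arg);
		/* Only now is this target really done with rv_*. */
		atomic_fetch_add_explicit(&rv_exited, 1, memory_order_release);
	}
}

static void
rendezvous(void (*action)(void *), void (*teardown)(void *), void *arg)
{
	/* On entry, wait until every target has left the previous round. */
	while (atomic_load_explicit(&rv_exited, memory_order_acquire) < NCPUS)
		sched_yield();
	rv_action = action;
	rv_teardown = teardown;
	rv_arg = arg;
	atomic_store(&rv_done, 0);
	atomic_store(&rv_exited, 0);
	/* Publish the new state; this models sending the IPI. */
	atomic_fetch_add_explicit(&rv_generation, 1, memory_order_release);
	while (atomic_load_explicit(&rv_done, memory_order_acquire) < NCPUS)
		sched_yield();
}

static void
bump(void *arg)
{
	atomic_fetch_add((atomic_int *)arg, 1);
}

/* Run back-to-back rendezvous rounds; return the number of bump() calls. */
int
run_rounds(int rounds)
{
	pthread_t tid[NCPUS];
	atomic_int hits;
	int error, i;

	atomic_init(&hits, 0);
	atomic_store(&rv_generation, 0);
	atomic_store(&rv_stop, 0);
	atomic_store(&rv_exited, NCPUS);	/* no previous round to wait for */
	for (i = 0; i < NCPUS; i++) {
		error = pthread_create(&tid[i], NULL, target, NULL);
		assert(error == 0);
		(void)error;
	}
	for (i = 0; i < rounds; i++)
		rendezvous(bump, bump, &hits);
	atomic_store(&rv_stop, 1);
	for (i = 0; i < NCPUS; i++)
		pthread_join(tid[i], NULL);
	return (atomic_load(&hits));
}
```

In this model run_rounds(n) returns 2 * n * NCPUS, because every target runs both callbacks exactly once per round. If the wait on rv_exited at the top of rendezvous() were dropped, a slow target could read rv_teardown or rv_arg after the next round had already overwritten them, which is the kind of too-early re-use the message describes.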