Re: 4.7 vs 5.2.1 SMP/UP bridging performance

From: Bruce Evans <bde_at_zeta.org.au>
Date: Fri, 7 May 2004 05:24:43 +1000 (EST)
On Thu, 6 May 2004, John Baldwin wrote:

> On Thursday 06 May 2004 06:18 am, Bruce Evans wrote:
> > On Wed, 5 May 2004, Gerrit Nagelhout wrote:
> > > Andrew Gallatin wrote:
> > > > If it's really safe to remove the xchg* from non-SMP atomic_store_rel*,
> > > > then I think you should do it.  Of course, that still leaves mutexes
> > > > as very expensive on SMP (253 cycles on the 2.53GHz from above).
> >
> > See my other reply [1 memory barrier but not 2 seems to be needed for
> > each lock/unlock pair in the !SMP case, and the xchgl accidentally (?)
> > provides it; perhaps [lms]fence would give a faster memory barrier].
> > More ideas on this:
> > ...
> > - jhb once tried changing mtx_lock_spin(mtx)/mtx_unlock_spin(mtx) to
> >   critical_enter()/critical_exit().  This didn't work because it broke
> >   mtx_assert().  It might also not work because it removes the memory
> >   barrier.  critical_enter() only has the very weak memory barrier in
> >   disable_intr() on i386's.
>
> That was only for the UP case, in which case you don't need the membars.  A
> single CPU always consistently sees what it has written.  The only case when

Er, see my other reply about why one might be needed.  Actually, none is
needed in my example.  Interrupts aren't a problem because both hardware
interrupts and the iret instruction are serializing.
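
To make the lock/unlock barrier question concrete, here is a rough sketch
in portable C11 atomics rather than the kernel's own primitives.  The
names (toy_lock, toy_trylock, toy_unlock_xchg, toy_unlock_store) are made
up for illustration and are not FreeBSD's mutex code.  It only shows the
two ways of releasing: an exchange, which is a full barrier on x86, and a
release store, which compilers typically turn into a plain mov there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy lock for illustration; not FreeBSD's struct mtx. */
struct toy_lock {
    atomic_int held;    /* 0 = free, 1 = held */
};

/* Acquire: atomic exchange with acquire ordering. */
static bool toy_trylock(struct toy_lock *l)
{
    assert(l != NULL);
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Release, variant A: exchange (a locked xchg on x86, so a full barrier). */
static void toy_unlock_xchg(struct toy_lock *l)
{
    assert(l != NULL);
    (void)atomic_exchange_explicit(&l->held, 0, memory_order_seq_cst);
}

/* Release, variant B: release store (typically a plain mov on x86). */
static void toy_unlock_store(struct toy_lock *l)
{
    assert(l != NULL);
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

Both variants leave the lock free; they differ only in how strong a
barrier the release carries, which is where the cycle counts quoted
earlier in the thread come from.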

> it doesn't is for memory that can be written to by device DMA, and that
> doesn't apply to kernel data structures, esp. not to ones for scheduling,
> etc.

There are also cases from using the SSE2 movnt* instructions.  These act a
bit like DMA and require sfence.
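
A minimal userland sketch of the movnt* case, assuming a compiler that
provides the SSE2 intrinsics in <emmintrin.h>.  The helper name
fill_and_publish and the "ready" flag are hypothetical.  Without SSE2 the
code falls back to ordinary stores, so it illustrates the ordering problem
only on x86 builds with SSE2 enabled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32; pulls in _mm_sfence as well */
#endif

/* Hypothetical helper: fill dst[0..n) with value, then set *ready to 1. */
static void fill_and_publish(int *dst, size_t n, int value, atomic_int *ready)
{
    assert(dst != NULL && ready != NULL);
#if defined(__SSE2__)
    for (size_t i = 0; i < n; i++)
        _mm_stream_si32(&dst[i], value);    /* movnti: weakly ordered store */
    _mm_sfence();   /* order the streaming stores before the flag store */
#else
    for (size_t i = 0; i < n; i++)
        dst[i] = value;                     /* fallback: ordinary stores */
#endif
    atomic_store_explicit(ready, 1, memory_order_release);
}
```

The sfence is the point: the non-temporal stores are not ordered with
respect to later ordinary stores, so without it another CPU could see the
flag set before the data arrives.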

Bruce
Received on Thu May 06 2004 - 10:24:51 UTC
