Re: Optimized copy&move (was: Re: [PATCH] Maintaining turnstile aligned to 128 bytes in i386 CPUs)

From: Attilio Rao <attilio_at_freebsd.org>
Date: Wed, 17 Jan 2007 22:15:07 +0100
2007/1/17, Ivan Voras <ivoras_at_fer.hr>:
> Bruce Evans wrote:
>
> > And MMX/XMM registers are not needed to get movnt on machines with SSE2,
> > since movnti is part of SSE2.  This reduces the advantages of using MMX/XMM
> > registers on P4's and A64's in 32-bit mode to the non-nt parts of the
> > above (fully cached case), which I think are less important than the nt
> > parts.
>
> Hmm, I'm looking at i386/i386/support.s and there are several versions
> of bcopy and bmove functions, including some that optimize by using FPU
> registers (large_i586_bcopy_loop), and a version that uses movnti
> (sse2_pagezero), but I can't find the bit of magic which glues them to
> bzero() call.
>
> Also, as far as I can tell by the comments, the FPU version works by
> manually saving context... why is this possible (i.e. won't something
> preempt it?)

They are just broken.
My implementation, which follows DragonFlyBSD patterns, just uses a bts
(which is atomic) to set a "lock" and uses scheduler pinning to avoid
thread migration. This is enough to solve the concurrency problems.
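
The pattern described above can be sketched roughly as follows. This is
a user-space analogue written for illustration, not the code from the
patch: pin_thread(), unpin_thread(), simd_busy and guarded_copy() are
hypothetical names, the C11 atomic_flag test-and-set stands in for the
atomic bts, and plain memcpy() stands in for the register save, wide
copy loop and register restore.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-ins for scheduler pinning. In a kernel these would
 * keep the thread on its current CPU; here they only track nesting.
 */
static int pin_depth;

static void pin_thread(void)
{
    pin_depth++;
}

static void unpin_thread(void)
{
    assert(pin_depth > 0);
    pin_depth--;
}

/*
 * Assumed "lock" bit guarding the FPU/SIMD state. A kernel would keep
 * one per CPU; a single flag is enough for this sketch.
 */
static atomic_flag simd_busy = ATOMIC_FLAG_INIT;

/*
 * Copy len bytes from src to dst. Returns true when the guarded fast
 * path was taken, false when the flag was already set and the plain
 * fallback was used instead.
 */
static bool guarded_copy(void *dst, const void *src, size_t len)
{
    pin_thread();                       /* no migration from here on */

    /* Atomic test-and-set, playing the role of the bts "lock". */
    if (atomic_flag_test_and_set_explicit(&simd_busy,
                                          memory_order_acquire)) {
        unpin_thread();
        memcpy(dst, src, len);          /* fallback: ordinary copy */
        return false;
    }

    /* A real fast path would save the registers, copy, then restore. */
    memcpy(dst, src, len);

    atomic_flag_clear_explicit(&simd_busy, memory_order_release);
    unpin_thread();
    return true;
}
```

A caller would simply use guarded_copy(dst, src, len) in place of a
plain copy. The point of the ordering is that the flag is tested only
after pinning, so the thread that set the flag is the same one, on the
same CPU, that later clears it.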

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Wed Jan 17 2007 - 20:15:10 UTC
