On Thu, 12 Apr 2007, Dag-Erling Smørgrav wrote:

> Craig Boston <craig_at_xfoil.gank.org> writes:
>> On Thu, Apr 12, 2007 at 11:06:03AM -0500, Rick C. Petty wrote:
>>> Is there any way we could make the choice at boot time, by checking for
>>> presence of the CX8 feature?  Either as something like:
>>>
>>> extern int feature_cx8; /* or MIB variable */
>>> #define CMPXCHG8(a) (feature_cx8 ? { _asm "..." } : emulate_cmpxch8(a))
>> For something this low level my opinion is it's better to stay with
>> compile time options.  After all, in the above example, cmpxchg8 is a
>> single machine instruction.  How much overhead does it add to retrieve a
>> variable from memory and check it, then jump to the correct place?
>> Enough that it outweighs the benefit of using that instruction in the
>> first place?

Not for cmpxchg8b, at least.  It is a remarkably slow instruction.  On
AthlonXPs it has an execution latency of 39 cycles.  cmpxchg only has an
execution latency of 6 cycles (both without a lock prefix).  I don't know
how to avoid using cmpxchg8b short of using a mutex lock/unlock pair and
slightly different semantics, or a generation count and very different
semantics, but without lock prefixes the mutex pair would be much faster
than the cmpxchg8b.

> I don't think it matters.

I agree.

> Contrary to popular belief, atomic
> operations are *expensive*.

Doesn't everyone who uses atomic operations know that they are
expensive? :)

> In the best case, on a UP machine, they
> stall the pipeline.  In the worst case, on an SMP machine, they stall
> the entire memory bus.

In the UP case, the pipeline stall is tiny or null.  Independent
instructions can still proceed, but CPUs (that have pipelines) usually
can't keep pipelines moving anyway, and atomic instructions just reduce
the chance that they can a little.

Bruce

Received on Fri Apr 13 2007 - 05:35:01 UTC
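
For reference, the boot-time dispatch and the mutex fallback discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the tree: feature_cx8 is the name from the quoted proposal, while cas64(), cas64_native() and cas64_emulated() are hypothetical names. The native path uses the GCC/Clang builtin __sync_bool_compare_and_swap() instead of hand-written asm, and a pthread mutex stands in for whatever lock a kernel would really use.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to be set once at boot from the CPUID CX8 bit (hypothetical). */
static bool feature_cx8 = true;

/* One global lock for the fallback path; a real design might hash by address. */
static pthread_mutex_t cas64_lock = PTHREAD_MUTEX_INITIALIZER;

/* Native path: the compiler emits a locked compare-and-exchange
 * (cmpxchg8b on 32-bit x86 targets that support it). */
static bool
cas64_native(uint64_t *p, uint64_t expect, uint64_t newval)
{
	return __sync_bool_compare_and_swap(p, expect, newval);
}

/* Fallback path: a mutex lock/unlock pair around a plain compare and store.
 * This is atomic only with respect to other callers of this same function. */
static bool
cas64_emulated(uint64_t *p, uint64_t expect, uint64_t newval)
{
	bool ok;

	pthread_mutex_lock(&cas64_lock);
	ok = (*p == expect);
	if (ok)
		*p = newval;
	pthread_mutex_unlock(&cas64_lock);
	return ok;
}

/* Runtime dispatch: one load and one branch per call. */
static bool
cas64(uint64_t *p, uint64_t expect, uint64_t newval)
{
	return feature_cx8 ? cas64_native(p, expect, newval)
	                   : cas64_emulated(p, expect, newval);
}
```

The load and branch in cas64() are the overhead being weighed against the instruction's latency in the exchange above. Because the flag is assumed to be fixed at boot, every caller takes the same path; mixing the two paths on the same variable would not be safe.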