John Baldwin wrote:
> On Thursday 05 August 2004 01:04 am, Tim Robbins wrote:
>
>>Is there any particular reason why atomic_load_acq_*() and
>>atomic_store_rel_*() are implemented with CMPXCHG and XCHG instead of
>>MOV on i386/amd64 UP?
>
>
> Actually, using mov instead of lock xchg for store_rel reduced performance in
> some benchmarks Scott ran on an SMP machine, I'm guessing due to the higher
> latency of locks becoming available to other CPUs.  I'm still waiting for
> benchmark results on UP to see if the change should be made under #ifndef SMP
> or some such.
>
>
>>Also, could we use MFENCE/LFENCE/SFENCE in combination with MOV on
>>SMP systems instead of LOCK CMPXCHG / (implied LOCK) XCHG?
>
>
> MFENCE and LFENCE only exist on the P4.  SFENCE only exists on P3+, so to do
> so you'd lose the ability to run on PII's and earlier.  Also, if you use more
> than SFENCE you lose PIII's.  Note that amd64 could probably be changed
> though since they might all have fences, in which case that might be
> something to benchmark on both UP and SMP to see what kind of difference it
> makes.
>

We always have the ability to define PENTIUM2_CPU, PENTIUM3_CPU, and
PENTIUM4_CPU cpu types in the kernel and then ifdef the code appropriately
(and ship with the lowest common denominator like we do for
I386/I486/I586/I686.)

Scott

Received on Thu Aug 05 2004 - 20:17:42 UTC
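
For readers following the thread, the compile-time selection discussed above might look roughly like the sketch below. It is only an illustration, not the FreeBSD implementation: sketch_store_rel_32() and sketch_roundtrip() are made-up names, PENTIUM4_CPU is the option name floated in the message rather than an existing kernel option, and the ordering of the #if branches is an assumption. It assumes GCC-style inline assembly on i386/amd64 and falls back to a C11 fence elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SKETCH_X86_ASM 1
#else
#include <stdatomic.h>
#endif

/*
 * Hypothetical store-with-release-semantics helper. Which variant is
 * compiled in depends on SMP and on PENTIUM4_CPU (the latter is only a
 * proposed option name, assumed here for illustration).
 */
static inline void
sketch_store_rel_32(volatile uint32_t *p, uint32_t v)
{
#if !defined(SKETCH_X86_ASM)
	/* Non-x86 fallback: C11 release fence followed by the store. */
	atomic_thread_fence(memory_order_release);
	*p = v;
#elif !defined(SMP)
	/* UP: plain MOV; the "memory" clobber is a compiler barrier only. */
	__asm__ __volatile__("movl %1, %0" : "=m" (*p) : "r" (v) : "memory");
#elif defined(PENTIUM4_CPU)
	/* SMP on P4-class CPUs: MOV followed by MFENCE. */
	__asm__ __volatile__("movl %1, %0\n\tmfence"
	    : "=m" (*p) : "r" (v) : "memory");
#else
	/* SMP lowest common denominator: XCHG with a memory operand,
	 * which carries an implied LOCK. */
	__asm__ __volatile__("xchgl %1, %0"
	    : "+m" (*p), "+r" (v) : : "memory");
#endif
}

/* Usage: store a value and read it back on the same thread. */
static inline uint32_t
sketch_roundtrip(uint32_t v)
{
	volatile uint32_t slot = 0;

	sketch_store_rel_32(&slot, v);
	assert(slot == v);
	return (slot);
}
```

Built with no extra defines this takes the UP (plain MOV) path; adding -DSMP selects XCHG, and -DSMP -DPENTIUM4_CPU selects MOV plus MFENCE. Which of these is actually faster is exactly what the benchmarks mentioned in the thread would have to show.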