On Wed, 17 Jan 2007 15:50:41 +1100 (EST)
Bruce Evans <bde_at_zeta.org.au> wrote:

> AXP: (my 5 year old system with a newer CPU): movq through MMX is 60%
> faster than movsl for cached moves, but movdqa through XMM is only 4%
> faster. movnt with block prefetch is 155% faster than movsl with no
> prefetch, and 73% faster with no prefetch for both.

> A64 in 32-bit mode: in between P4 and AXP (closer to AXP). movsl doesn't
> lose by so much, and prefetchnta actually works so block prefetch is
> not needed and there is a better chance of prefetching helping more
> than benchmarks.

This PDF is somewhat dated, but perhaps some of it still applies today:

http://cdrom.amd.com/devconn/events/AMD_block_prefetch_paper.pdf

--
Ricardo Nabinger Sanchez <rnsanchez_at_{gmail.com,wait4.org}>
Powered by FreeBSD

"Left to themselves, things tend to go from bad to worse."

Received on Wed Jan 17 2007 - 14:41:17 UTC