On Jan 18, 2007, at 2:28 PM, Maxim Sobolev wrote:
>> Unfortunately, there are simply different tradeoffs between
>> mechanisms for copying depending on whether you want to use or
>> avoid using/thrashing the L1/L2 caches, whether the data is
>> cache-aligned, and so forth; the CPU can't infer what you want to
>> occur-- you have to tell it.  I find it interesting that some of
>> the architectures (PA-RISC,
>
> Well, of course there are some special cases, but in general there
> should be some baseline suitable for most of uses. That's why we
> (and most other operating systems) only provide single version for
> the mem*(3) APIs.

Well, a truly generic version is in lib/libc/string/bcopy.c; it's
architecture-neutral (i.e., it's pure C code) and it handles all kinds
of things like overlapping source and destination addresses,
non-aligned access, and so forth.  The downside is that it's slower
than using movl/movsl, much less some of the fancier variants that
Bruce and Matt have been discussing (in considerable, interesting
detail) earlier:

  http://now.cs.berkeley.edu/Td/bcopy.html

If you're only moving, say, 5 bytes, the overhead of fancy loop
unrolling and prefetching and so forth isn't going to help compared
with a simple movb/movl combination, so it really depends.

-- 
-Chuck

Received on Thu Jan 18 2007 - 21:47:57 UTC
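
For illustration only: a minimal sketch of the kind of architecture-neutral, overlap-safe byte copy the message describes. This is not the code in lib/libc/string/bcopy.c. The name sketch_copy is hypothetical, and the sketch copies one byte at a time, so it leaves out the word-at-a-time alignment handling a real implementation would likely include.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not the libc implementation.
 * Copies len bytes from src to dst and tolerates overlapping regions
 * by choosing the copy direction from the relative addresses.
 */
static void *
sketch_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (len == 0 || d == s)
		return (dst);

	if ((uintptr_t)d < (uintptr_t)s) {
		/* dst is below src: a forward copy never clobbers unread bytes. */
		while (len--)
			*d++ = *s++;
	} else {
		/*
		 * dst is above src: copy backward so that overlapping
		 * source bytes are read before they are overwritten.
		 */
		d += len;
		s += len;
		while (len--)
			*--d = *--s;
	}
	return (dst);
}
```

Being pure C, a loop like this runs anywhere, but it moves a single byte per iteration, which is one reason a generic version loses to movl/movsl on larger copies. For a 5-byte move the difference hardly matters, which matches the point made in the message.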