At Tue, 04 Mar 2008 11:14:15 -0800,
Jason Evans wrote:
>
> gnn_at_freebsd.org wrote:
> > One of the folks I'm working with found this. The following code,
> > which yes, is just an example, is 1/2 as fast on 7.0-RELEASE as on
> > 6.3. Where should I look to find out why?
>
> There is a definite performance problem in arena_run_alloc(), but I'm
> happy to report that it was fixed in -current a while back. I plan to
> MFC malloc to RELENG_7 within the next few weeks.
>

Great!

> In a nutshell, the arena_run_alloc() performance problem is due to
> using a linear search to find sufficiently large runs of mapped (but
> currently unused) pages. There are caching mechanisms that speed up
> the searches to some degree, but there are still some linear aspects
> to the algorithm, so as memory usage increases, the searches take
> progressively longer. In -current, this problem is solved by
> maintaining red-black trees, so that arena_run_alloc() does an O(lg
> n) tree search, rather than an O(n) iterative search.
>
> It's worth mentioning that the benchmark is of marginal use, due to
> a simple (but common) flaw. At a minimum, a malloc benchmark should
> touch all allocated memory at least once. Otherwise, the benchmark
> is IMO too far removed from reality to measure anything of value,
> since memory access patterns look nothing like those of an actual
> application that dynamically allocates memory. Both phkmalloc and
> jemalloc use data structures that are mostly disjunct from the
> allocations (no headers), so the benchmark never even faults most
> pages in. This is especially true for phkmalloc, so jemalloc is
> unjustly penalized. If we were to include, say, dlmalloc in this
> comparison, it would be even more heavily penalized due to touching
> the pages while modifying allocation headers.

Fair enough, I'll pass that on.

Best,
George

Received on Thu Mar 06 2008 - 18:22:16 UTC
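
As a rough illustration of the point above about touching all allocated memory, a benchmark's inner routine might look something like the sketch below. The original benchmark code is not shown in this thread, so the function name alloc_touch_free() and its parameters are hypothetical, chosen only for this example. The writes go through a volatile pointer on the assumption that this is enough to keep the compiler from discarding stores to memory that is freed right afterwards.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the original benchmark): allocate
 * nblocks blocks of blocksize bytes each, write to every byte so that
 * each page is actually faulted in, then free everything.
 *
 * Returns the total number of bytes touched, or 0 if any allocation
 * failed (or if nblocks is 0).
 */
size_t
alloc_touch_free(size_t nblocks, size_t blocksize)
{
	void **blocks;
	size_t i, j, nalloc, touched;
	int failed;

	if (nblocks == 0)
		return (0);

	blocks = calloc(nblocks, sizeof(*blocks));
	if (blocks == NULL)
		return (0);

	touched = 0;
	failed = 0;
	for (nalloc = 0; nalloc < nblocks; nalloc++) {
		volatile unsigned char *p;

		blocks[nalloc] = malloc(blocksize);
		if (blocks[nalloc] == NULL) {
			failed = 1;
			break;
		}

		/* Touch the memory; volatile keeps the stores from being elided. */
		p = blocks[nalloc];
		for (j = 0; j < blocksize; j++)
			p[j] = 0xa5;
		touched += blocksize;
	}

	/* Release only the blocks that were successfully allocated. */
	for (i = 0; i < nalloc; i++)
		free(blocks[i]);
	free(blocks);

	return (failed ? 0 : touched);
}
```

A timing harness would call, for example, alloc_touch_free(100000, 4096) between two clock readings. Writing every byte is more than strictly needed to fault pages in; one write per page would do, but that requires assuming a page size.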