Re: Differences in malloc between 6 and 7?

From: Jason Evans <jasone_at_freebsd.org>
Date: Tue, 04 Mar 2008 11:14:15 -0800

gnn_at_freebsd.org wrote:
> One of the folks I'm working with found this.  The following code,
> which yes, is just an example, is 1/2 as fast on 7.0-RELEASE as on
> 6.3.  Where should I look to find out why?

There is a definite performance problem in arena_run_alloc(), but I'm
happy to report that it was fixed in -current a while back.  I plan to
MFC malloc to RELENG_7 within the next few weeks.

In a nutshell, the arena_run_alloc() performance problem is due to using
a linear search to find sufficiently large runs of mapped (but
currently unused) pages.  There are caching mechanisms that speed up the
searches to some degree, but there are still some linear aspects to the
algorithm, so as memory usage increases, the searches take progressively
longer.  In -current, this problem is solved by maintaining red-black
trees, so that arena_run_alloc() does an O(lg n) tree search, rather than
an O(n) iterative search.
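
To make the difference concrete, here is a rough sketch of the two
lookup strategies.  It is not the jemalloc code: the function names are
made up for illustration, free runs are reduced to bare page counts, and
a sorted array with binary search stands in for the red-black tree (an
array gives the same O(lg n) lookup, but not the O(lg n) insert and
remove that a tree provides).

```c
#include <stddef.h>

/*
 * O(n) lookup: scan an unordered list of free run sizes (in pages) and
 * return the index of the first run that is large enough, or -1.
 * Hypothetical helper, for illustration only.
 */
long find_run_linear(const size_t *runs, size_t n, size_t need)
{
    for (size_t i = 0; i < n; i++) {
        if (runs[i] >= need)
            return (long)i;
    }
    return -1;
}

/*
 * O(lg n) lookup: the run sizes are kept sorted in ascending order, so a
 * lower-bound binary search finds the smallest run that is large enough.
 * Returns its index, or -1 if no run fits.
 * Hypothetical helper; the sorted array is a stand-in for a balanced tree.
 */
long find_run_sorted(const size_t *sorted, size_t n, size_t need)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (sorted[mid] < need)
            lo = mid + 1;   /* too small: answer is to the right */
        else
            hi = mid;       /* fits: answer is here or to the left */
    }
    return lo < n ? (long)lo : -1;
}
```

With many free runs, the first function visits every entry in the worst
case, while the second halves the candidate range on each step.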

It's worth mentioning that the benchmark is of marginal use, due to a 
simple (but common) flaw.  At a minimum, a malloc benchmark should touch 
all allocated memory at least once.  Otherwise, the benchmark is IMO too 
far removed from reality to measure anything of value, since memory 
access patterns look nothing like those of an actual application that 
dynamically allocates memory.  Both phkmalloc and jemalloc use data 
structures that are mostly disjunct from the allocations (no headers), 
so the benchmark never even faults most pages in.  This is especially 
true for phkmalloc, so jemalloc is unjustly penalized.  If we were to 
include, say, dlmalloc in this comparison, it would be even more heavily 
penalized due to touching the pages while modifying allocation headers.
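
As a minimal sketch of what "touching all allocated memory" could look
like in a benchmark loop: the function name and the 4096-byte stride are
my assumptions, not part of the original test program, and timing is
left to the caller.

```c
#include <stdlib.h>

/* Assumed page size; writing one byte per stride faults each page in. */
#define TOUCH_STRIDE 4096

/*
 * Allocate `count` blocks of `size` bytes, write to every page of each
 * block, then free everything.  Returns the number of blocks that were
 * successfully allocated and touched.  Hypothetical helper.
 */
size_t alloc_touch_free(size_t count, size_t size)
{
    void **blocks = calloc(count, sizeof(*blocks));
    size_t done = 0;

    if (blocks == NULL)
        return 0;

    for (size_t i = 0; i < count; i++) {
        /* volatile keeps the compiler from dropping the writes */
        volatile unsigned char *p = malloc(size);

        if (p == NULL)
            break;
        for (size_t off = 0; off < size; off += TOUCH_STRIDE)
            p[off] = 1;
        blocks[i] = (void *)p;
        done++;
    }

    for (size_t i = 0; i < done; i++)
        free(blocks[i]);
    free(blocks);
    return done;
}
```

Timing a call such as alloc_touch_free(100000, 4096) on both releases
would include the page-fault cost that a real application pays, which an
allocate-and-free-only loop leaves out.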

Jason
Received on Tue Mar 04 2008 - 18:34:09 UTC
