Re: [RFC] how to get the size of a malloc(9) block ?

From: Luigi Rizzo <rizzo_at_iet.unipi.it>
Date: Fri, 29 Nov 2013 11:17:01 -0800

On Thu, Nov 28, 2013 at 7:13 AM, jb <jb.1234abcd_at_gmail.com> wrote:

> Luigi Rizzo <rizzo <at> iet.unipi.it> writes:
>
> > ...
> > But I don't understand why you find ksize()/malloc_usable_size()
> > dangerous.
> > ...
>
> The original crime is committed when *usable size* (an implementation
> detail) is exported (leaked) to the caller.
> To be blunt, when a caller requests memory of a certain size, and its
> request is satisfied, then it is not its business to learn details
> beyond that (and they should not be offered either).
> The API should be sanitized, in kernel and user space.
> Otherwise, all kinds of charlatans will try to play hair-raising games
> with it.
> If the caller wants to track the *requested size* programmatically, it
> is its business to do it and it can be done very easily.
>

There is a difference between applications peeking into
implementation details that should be hidden, and an allocator
providing limited and specific information through a well-defined API.

In general (not in the specific code I am handling,
and not something I personally need),
what the caller might want to do is optimize its requests
according to how the system behaves, and it cannot do that
without some help from the layer below.

I have seen the following types of comments in this thread:

- "you should get it right the first time and never realloc"
  Maybe, but then the offending API is realloc(), not ksize().

- "build your own allocator"
  Yes, I do it when it makes sense,
  but sometimes it is either overkill or a bad idea (as it loses
  opportunities for global optimizations, duplicates code,
  ties up memory in subsystem-specific freelists...)

- "what if ksize()/malloc_usable_size() lies ?"
  Well, that would be a bug in the allocator: if it says
  the memory is usable, it must be usable, period.

- "rather than ksize() I'll give you a fix for one use case"
  (the NO_REALLOC flag to realloc()).
  This, I think, would be a mistake -- it acknowledges the need
  for exposing some information but then only provides a
  specific fix for one use case.
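To make the last point concrete, here is a minimal userland sketch of
how a caller could answer the "can I grow without a copy?" question
from a size query alone. It assumes malloc_usable_size() as found in
glibc (<malloc.h>) or FreeBSD (<malloc_np.h>); in the kernel the
equivalent would be the ksize() under discussion. fits_in_place() is a
hypothetical helper name, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef __FreeBSD__
#include <malloc_np.h>  /* malloc_usable_size() on FreeBSD */
#else
#include <malloc.h>     /* malloc_usable_size() on glibc */
#endif

/*
 * Hypothetical helper: returns 1 if the block behind `buf` already
 * has room for `want` bytes, so growing to `want` needs no new block
 * and no copy; returns 0 otherwise.
 */
static int
fits_in_place(void *buf, size_t want)
{
    assert(buf != NULL);
    return (want <= malloc_usable_size(buf));
}
```

The same query serves the other use cases listed in this message,
which a single-purpose flag would not.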

I'll just restate that there are multiple situations where
an application might use some information on actual allocation
sizes:

- when it needs to extend memory and has a choice between
  a cheap realloc() (if extra space is available),
  chaining blocks (when the memcpy would be too expensive),
  or giving up and living with whatever space is available.

- when it has freedom in picking the block size
  and so it wants to optimize its requests based on
  what the underlying allocator does.
  As an example, long ago FreeBSD was really suboptimal
  when you allocated blocks whose size was a power of 2,
  because the metadata was inline.
  These days, there is a different issue:
  powers of 2 are ok, but blocks of 2049 bytes
  and above seem to be padded to a multiple of 2048,
  leading to a huge overhead in some cases.
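Both situations can be sketched in userland C, again assuming
malloc_usable_size() from glibc (<malloc.h>) or FreeBSD
(<malloc_np.h>). The names plan_growth(), alloc_slack() and the
copy_limit threshold are hypothetical, chosen only for illustration.
Note that the usable size reports the room inside the block; it may
not reveal every byte of allocator overhead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef __FreeBSD__
#include <malloc_np.h>  /* malloc_usable_size() on FreeBSD */
#else
#include <malloc.h>     /* malloc_usable_size() on glibc */
#endif

/* Hypothetical names for the choices a growing buffer has. */
enum grow_plan { GROW_IN_PLACE, GROW_REALLOC, GROW_CHAIN };

/*
 * used:       bytes of live data a moving realloc() would have to copy
 * want:       total size needed after growing
 * copy_limit: caller-chosen threshold above which a copy is too costly
 */
static enum grow_plan
plan_growth(void *buf, size_t used, size_t want, size_t copy_limit)
{
    if (want <= malloc_usable_size(buf))
        return GROW_IN_PLACE;   /* the block already has the room */
    if (used <= copy_limit)
        return GROW_REALLOC;    /* a copy, if one happens, is affordable */
    return GROW_CHAIN;          /* too costly to move: chain (or give up) */
}

/*
 * Extra usable bytes the allocator hands out on top of a request of
 * `request` bytes, or -1 if the allocation fails.
 */
static long
alloc_slack(size_t request)
{
    void *p = malloc(request);
    size_t usable;

    if (p == NULL)
        return -1;
    usable = malloc_usable_size(p);
    assert(usable >= request);  /* what is reported usable must be usable */
    free(p);
    return (long)(usable - request);
}
```

Calling alloc_slack() over a range of sizes (for example 2048 and
2049) shows where a given allocator rounds up, which is the kind of
information a caller with freedom in its block size could use. The
numbers depend entirely on the allocator in use.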

cheers
luigi