What parts of UMA are part of the stable ABI?

From: Ryan Stone <rysto32_at_gmail.com>
Date: Fri, 13 Mar 2015 19:48:38 -0400
In this freebsd-hackers thread[1], a user reported that 10.1-RELEASE
crashes during boot on a system with 3TB of RAM.  As it turns out, when you
have that much RAM ZFS autotunes itself to allocate a 6GB hash table.  This
triggers a nasty 32-bit integer truncation bug in malloc(9).  malloc()
calls uma_large_malloc(), but uma_large_malloc() accepts an int instead of
a size_t and all kinds of hilarity can ensure from there.  The user has
confirmed that the page in [2] fixed the kernel from instantly panicking
once zfs.ko was loaded.  I'm a bit concerned about whether the patch as
written is an MFC candidate though.

uma_large_malloc() calls page_alloc() to actuallly allocate the memory, and
page_alloc() also accepts an int size parameter.  This is where things get
tricky.  The signature for page_alloc() is governed by the uma_alloc()
typedef, as uma also uses it internally for allocating memory for
uma_zones.  There is even a uma_zone_set_allocf() API for overriding the
default allocation function.  So there's definitely an argument to be made
the the signature of page_alloc() being a part of the stable ABI.

I have no hesitation in saying that uma_large_malloc() is not a stable API
and changing it is fair game.  If uma_alloc() is a part of the stable API,
then it's simple enough to commit a 64-bit safe allocation function for
uma_large_malloc() to call and changing page_alloc() to call it instead.
That commit can be MFC'ed, and a follow-up commit could convert the UMA
APIs to use size_t everywhere.

While I am at this, I'd like to also change the uma init/fini/ctor/dtor to
also use size_t.  I'm a little torn on this because this will definitely
cause a lot of churn, both in the tree and for downstream consumers, and
there's not necessarily going to be a big benefit to it.  However, I
suppose that the existence of machines where 4GB is less than 1% of system
memory may mean that allocating 4GB at a time may not that outlandish.  I
can definitely be talked out of this though.

[1]
https://lists.freebsd.org/pipermail/freebsd-hardware/2015-March/007602.html
[2] http://people.freebsd.org/~rstone/patches/vm_64bit_malloc.diff
Received on Fri Mar 13 2015 - 22:48:39 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:56 UTC