In this freebsd-hackers thread[1], a user reported that 10.1-RELEASE crashes during boot on a system with 3TB of RAM. As it turns out, when you have that much RAM ZFS autotunes itself to allocate a 6GB hash table. This triggers a nasty 32-bit integer truncation bug in malloc(9). malloc() calls uma_large_malloc(), but uma_large_malloc() accepts an int instead of a size_t and all kinds of hilarity can ensure from there. The user has confirmed that the page in [2] fixed the kernel from instantly panicking once zfs.ko was loaded. I'm a bit concerned about whether the patch as written is an MFC candidate though. uma_large_malloc() calls page_alloc() to actuallly allocate the memory, and page_alloc() also accepts an int size parameter. This is where things get tricky. The signature for page_alloc() is governed by the uma_alloc() typedef, as uma also uses it internally for allocating memory for uma_zones. There is even a uma_zone_set_allocf() API for overriding the default allocation function. So there's definitely an argument to be made the the signature of page_alloc() being a part of the stable ABI. I have no hesitation in saying that uma_large_malloc() is not a stable API and changing it is fair game. If uma_alloc() is a part of the stable API, then it's simple enough to commit a 64-bit safe allocation function for uma_large_malloc() to call and changing page_alloc() to call it instead. That commit can be MFC'ed, and a follow-up commit could convert the UMA APIs to use size_t everywhere. While I am at this, I'd like to also change the uma init/fini/ctor/dtor to also use size_t. I'm a little torn on this because this will definitely cause a lot of churn, both in the tree and for downstream consumers, and there's not necessarily going to be a big benefit to it. However, I suppose that the existence of machines where 4GB is less than 1% of system memory may mean that allocating 4GB at a time may not that outlandish. I can definitely be talked out of this though. [1] https://lists.freebsd.org/pipermail/freebsd-hardware/2015-March/007602.html [2] http://people.freebsd.org/~rstone/patches/vm_64bit_malloc.diffReceived on Fri Mar 13 2015 - 22:48:39 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:56 UTC