Re: panic: UMA: Increase vm.boot_pages with 32 CPUs

From: Jim Harris <jim.harris_at_gmail.com>
Date: Tue, 13 Aug 2013 18:03:59 -0700
On Tue, Aug 13, 2013 at 3:05 PM, Jeff Roberson <jroberson_at_jroberson.net> wrote:

> On Mon, 12 Aug 2013, Colin Percival wrote:
>
>> Hi all,
>>
>> A HEAD_at_254238 kernel fails to boot in EC2 with
>>
>>> panic: UMA: Increase vm.boot_pages
>>>
>> on 32-CPU instances.  Instances with up to 16 CPUs boot fine.
>>
>> I know there has been some mucking about with VM recently -- anyone want
>> to claim this, or should I start doing a binary search?
>>
>
> It's not any one commit really, just creeping demand for more pages before
> the VM can get started.  I would suggest making boot pages scale with
> MAXCPU.  Or just raising it as the panic suggests.  We could rewrite the
> way that the vm gets these early pages but it's a lot of work and typically
> people just bump it and forget about it.
>
>
I ran into this problem today when enabling hyperthreading on my
dual-socket Xeon E5 system.

It looks like r254025 is actually the culprit.  Specifically, the new
mallocinit()/kmeminit() now invokes the new vmem_init() before
uma_startup2(), and vmem_init() allocates 16 zones out of the boot pages,
if I am reading this correctly.  Because all of this happens before
uma_startup2() is called, the boot pages run out and we hit the panic.

With fewer than 28 CPUs, the zone size (uma_zone + uma_cache *
(mp_maxid + 1)) is <= PAGE_SIZE and we boot successfully.  At 32 CPUs,
each zone needs two boot pages, which pushes total consumption past the
default 64 boot pages.  The sizes of these structures do not appear to
have changed materially any time recently.
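
To make the arithmetic concrete, here is a rough userland sketch in C of
the per-zone calculation.  The helper names are hypothetical, and the
structure sizes are passed in as parameters because I am not quoting the
real sizeof(struct uma_zone) or sizeof(struct uma_cache) here; any
numbers you plug in are assumptions, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes needed for one zone: a zone header plus one per-CPU cache for
 * every CPU ID from 0 through mp_maxid.  Both sizes are caller-supplied
 * assumptions, not the real kernel structure sizes.
 */
static size_t
zone_bytes(size_t zone_hdr, size_t cache_sz, unsigned mp_maxid)
{
    return zone_hdr + cache_sz * ((size_t)mp_maxid + 1);
}

/* Whole pages needed to hold `bytes`, rounding up. */
static size_t
pages_for(size_t bytes, size_t page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;
}
```

With placeholder sizes of 512 bytes for the zone header and 128 bytes per
cache, and a 4096-byte page, mp_maxid = 15 yields one page per zone while
mp_maxid = 31 yields two, so 16 zones would take 32 boot pages instead
of 16.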

Scaling with MAXCPU seems to be an OK solution, but should it be based
directly on the size of (uma_zone + uma_cache * MAXCPU)?  I am not very
familiar with UMA startup, but these zones seem to be the primary
consumers of the boot pages, so the UMA_BOOT_PAGES default should be based
directly on that size.
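
As a sketch of what that could look like, the function below derives a
boot page count from MAXCPU.  This is not existing FreeBSD code:
boot_pages_for_maxcpu(), the base_pages term (pages consumed by
everything other than these zones) and nzones are all hypothetical
parameters, and the structure sizes are again supplied by the caller
rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical estimate of a boot page default that scales with MAXCPU.
 * Each of the nzones early zones is assumed to need
 * zone_hdr + cache_sz * maxcpu bytes, rounded up to whole pages;
 * base_pages covers all other early consumers.
 */
static size_t
boot_pages_for_maxcpu(size_t base_pages, size_t nzones, size_t zone_hdr,
    size_t cache_sz, size_t maxcpu, size_t page_size)
{
    size_t zone_sz, per_zone;

    assert(page_size > 0);
    zone_sz = zone_hdr + cache_sz * maxcpu;
    per_zone = (zone_sz + page_size - 1) / page_size;
    return base_pages + nzones * per_zone;
}
```

With placeholder inputs (base_pages = 48, nzones = 16, a 512-byte zone
header, 128-byte caches, 4096-byte pages), MAXCPU = 16 gives 64 pages and
MAXCPU = 32 gives 80.  The point is only that the default would follow
the zone size rather than being a fixed constant.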

Regards,

-Jim
Received on Tue Aug 13 2013 - 23:04:02 UTC
