UMA initialization failure with 48-core ARM64

From: Michał Stanek <mst_at_semihalf.com>
Date: Fri, 15 May 2015 20:30:35 +0200

Hi,

I am experiencing an early failure of UMA on an ARM64 platform with 48
cores enabled. I get a kernel panic during initialization of VM. Here is
the boot log (lines with 'MST:' are my own debug printfs).

Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #333 52fd91e(smp_48)-dirty: Fri May 15 18:26:56 CEST 2015
    mst@arm64-prime:/usr/home/mst/freebsd_v8/obj_kernel/arm64.aarch64/usr/home/mst/freebsd_v8/kernel/sys/THUNDER-88XX arm64
FreeBSD clang version 3.6.0 (tags/RELEASE_360/final 230434) 20150225
MST: in vm_mem_init()
MST: in vmem_init() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kmem_arena
panic: mtx_lock() of spin mutex (null) @ /usr/home/mst/freebsd_v8/kernel/sys/kern/subr_vmem.c:1165
cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      0xffffff80001f4f80:

The kernel boots fine when MAXCPU is set to 30 or lower, but the error
above always appears when it is set to a higher value.

The panic is triggered by a KASSERT in __mtx_lock_flags(), which is reached
through the VMEM_LOCK(vm) macro in vmem_xalloc(). In the unmodified
subr_vmem.c this is line 1143 (the log shows a different line number
because of my added printfs).
It looks like the lock belongs to 'kmem_arena', which is still
uninitialized at this point (kmeminit() has not been called yet).
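
To illustrate why locking a not-yet-initialized arena could be reported as
"spin mutex (null)", here is a minimal user-space sketch. Everything in it
(toy_mtx, toy_mtx_init(), toy_mtx_lock_check(), the class numbering) is a
hypothetical stand-in and not the kernel's real definitions. It only assumes
that a zero-filled lock has no name and that its class bits decode to the
first class, which in this toy model is the spin class.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lock classes; index 0 is what a zero-filled lock decodes to. */
enum toy_class { TOY_CLASS_SPIN = 0, TOY_CLASS_SLEEP = 1 };

#define TOY_CLASS_MASK 0x0fu

/* Hypothetical stand-in for a mutex embedded in an arena. */
struct toy_mtx {
	const char *name;	/* NULL until toy_mtx_init() runs */
	unsigned    flags;	/* low bits hold the class index */
};

/* Models lock initialization: sets a name and the sleep class. */
static void
toy_mtx_init(struct toy_mtx *m, const char *name)
{
	assert(m != NULL && name != NULL);
	m->name = name;
	m->flags = TOY_CLASS_SLEEP;
}

/*
 * Models the class check done before taking a sleep mutex.
 * Returns 0 if the lock looks like a sleep mutex, -1 otherwise
 * (the case in which the real kernel assertion would panic).
 */
static int
toy_mtx_lock_check(const struct toy_mtx *m)
{
	enum toy_class c = (enum toy_class)(m->flags & TOY_CLASS_MASK);

	return (c == TOY_CLASS_SLEEP ? 0 : -1);
}
```

In this model a zeroed toy_mtx fails the check and has a NULL name, while
the same lock passes once toy_mtx_init() has run. That matches the shape of
the panic text above, but the real lock layout is an assumption here.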

While debugging, I tried modifying the VM code as a quick workaround. I
replaced the number of cores with 1 wherever mp_ncpus, mp_maxid or MAXCPU
(and others) are read. This, I believe, limits the UMA per-CPU caches to
just one, while the rest of the OS (scheduler, etc.) sees all 48 cores.
In addition, I changed UMA_BOOT_PAGES in sys/vm/uma_int.h to 512 (the
default was 64).
With these tweaks, I got a successful (but not really stable) boot with 48
cores. Of course, these are dirty hacks and a proper solution is needed.
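
As a rough illustration of why both tweaks could matter together, the sketch
below estimates how a per-CPU cache array inflates early memory needs. It is
only arithmetic under stated assumptions: zone_bytes() and
boot_pages_needed() are hypothetical helpers, the byte sizes are made up, and
the model simply packs zones back to back, which the real allocator need not
do.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes for one zone: a fixed header plus one cache slot per CPU.
 * All sizes are caller-supplied assumptions, not real kernel values.
 */
static size_t
zone_bytes(size_t header_bytes, size_t cache_bytes, unsigned ncpus)
{
	assert(ncpus > 0);
	return (header_bytes + cache_bytes * (size_t)ncpus);
}

/* Pages needed to hold nzones zones of per_zone bytes, rounded up. */
static size_t
boot_pages_needed(size_t per_zone, size_t nzones, size_t page_bytes)
{
	assert(page_bytes > 0);
	return ((per_zone * nzones + page_bytes - 1) / page_bytes);
}
```

With made-up figures of a 256-byte header, 64-byte cache slots, 100 zones
and 4096-byte pages, this model needs 8 pages for one CPU and 82 pages for
48 CPUs, so a fixed 64-page budget would be exceeded only in the second
case. Whether this is what actually happens on this platform is not
established by the log above.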

I am a bit surprised that the kernel fails with MAXCPU==48 as the amd64
arch has this value set to '256' and I have read posts that other platforms
with even more cores have worked fine. Perhaps I need to tweak some other
VM parameters, apart from UMA_BOOT_PAGES (AKA vm.boot_pages), but I am not
sure how.

I included a full stacktrace and a more verbose log (with UMA_DEBUG macros
enabled) in the attachment. There is also a diff of the hacks I used while
debugging.

Best regards,
Michal Stanek

Received on Fri May 15 2015 - 16:30:44 UTC
