New NUMA support coming to CURRENT

From: Jeff Roberson <jroberson_at_jroberson.net>
Date: Tue, 9 Jan 2018 09:46:54 -1000 (HST)

Hello folks,

I am working on merging improved NUMA support with policy implemented by 
cpuset(2) over the next week.  This work has been supported by Dell/EMC's 
Isilon product division and Netflix.  You can see some discussion of these 
changes here:

https://reviews.freebsd.org/D13403
https://reviews.freebsd.org/D13289
https://reviews.freebsd.org/D13545

The work has been done in user/jeff/numa if you want to look at the svn 
history or experiment with the branch.  It has been tested by Peter Holm 
on i386 and amd64, and it has been verified to work on arm at various 
points.

We are working towards compatibility with libnuma and Linux mbind.  These 
commits will bring improved NUMA support into the kernel.  There are new 
domain-specific allocation functions available to the kernel for UMA, 
malloc, kmem_, and vm_page*.  bus_dmamem consumers will automatically be 
placed in the correct domain, bringing automatic performance improvements 
to some devices.
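
To give a flavor of the kernel side, here is a minimal sketch of a 
domain-aware allocation.  The malloc_domain() name and signature follow 
the reviews above but should be treated as illustrative; details may 
still change before the merge.

/*
 * Sketch: allocate a buffer backed by memory from a specific NUMA
 * domain, using a malloc_domain() variant of malloc(9) as discussed in
 * the reviews.  Names and signatures may change before this lands.
 */
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>

MALLOC_DEFINE(M_EXAMPLE, "example", "NUMA-aware example buffers");

void *
example_alloc_on_domain(size_t size, int domain)
{

        /* Request pages from the given memory domain. */
        return (malloc_domain(size, M_EXAMPLE, domain, M_WAITOK | M_ZERO));
}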

cpuset will be able to constrain processes, groups of processes, jails, 
etc. to subsets of the system's memory domains, just as it can with sets 
of cpus.  It can set a default policy for any of the above.  Threads can 
use cpusets to set a policy that specifies a subset of their visible 
domains.
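
As a userland illustration, a process could restrict itself to a subset 
of domains with something like the following.  This uses the 
cpuset_setdomain(2) interface and DOMAINSET_* constants from the reviews; 
the exact names are provisional until the merge is complete.

/*
 * Sketch: restrict the current process to memory domains 0 and 1 with a
 * round-robin policy.  Based on the proposed cpuset_setdomain(2)
 * syscall; names and constants may change before commit.
 */
#include <sys/param.h>
#include <sys/cpuset.h>
#include <sys/domainset.h>

#include <err.h>

int
main(void)
{
        domainset_t mask;

        DOMAINSET_ZERO(&mask);
        DOMAINSET_SET(0, &mask);
        DOMAINSET_SET(1, &mask);

        if (cpuset_setdomain(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
            sizeof(mask), &mask, DOMAINSET_POLICY_ROUNDROBIN) != 0)
                err(1, "cpuset_setdomain");

        /* Memory allocated from here on follows the new domain policy. */
        return (0);
}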

Available policies are first-touch (local in Linux terms), round-robin 
(similar to Linux interleave), and preferred.  For now, the default is 
round-robin.  You can achieve a fixed-domain policy by using round-robin 
with a bitmask of a single domain.  As the scheduler and VM become more 
sophisticated, we may switch the default to first-touch as Linux does.
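
Continuing the sketch above (same headers and the same provisional 
names), the fixed-domain trick is just a round-robin policy over a mask 
containing a single domain:

/*
 * Round-robin over a single-domain mask behaves as a fixed-domain
 * policy for the calling thread.
 */
static void
pin_thread_to_domain(int domain)
{
        domainset_t mask;

        DOMAINSET_ZERO(&mask);
        DOMAINSET_SET(domain, &mask);

        if (cpuset_setdomain(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
            sizeof(mask), &mask, DOMAINSET_POLICY_ROUNDROBIN) != 0)
                err(1, "cpuset_setdomain");
}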

Currently these features are enabled with the VM_NUMA_ALLOC and MAXMEMDOM 
kernel options.  These will eventually become NUMA/MAXMEMDOM to match 
SMP/MAXCPU.  The current NUMA syscalls and VM_NUMA_ALLOC code were 
'experimental' and will be deprecated.  numactl will continue to be 
supported, although cpuset should be preferred going forward, as it 
supports the full feature set of the new API.
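
For reference, enabling this today looks something like the following in 
a kernel configuration file.  MAXMEMDOM must be at least the number of 
memory domains in the machine; 2 here is only an example for a two-socket 
box.

options         VM_NUMA_ALLOC
options         MAXMEMDOM=2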

Thank you for your patience as I deal with the inevitable fallout of such 
sweeping changes.  If you do find bugs, please file them in Bugzilla or 
reach out to me directly.  I don't always have time to catch up on all of 
my mailing list mail, and regrettably things slip through the cracks when 
they are not addressed directly to me.

Thanks,
Jeff