numa involved in instability and swap usage despite RAM free?

From: Alexander Leidinger <Alexander_at_leidinger.net>
Date: Sun, 24 Jun 2018 12:03:29 +0200
Hi,

I don't have hard evidence, but there is enough "smell" to open up a  
discussion...

Short:
Can it be that enabling NUMA in the kernel is the reason why some
people see instability with ZFS and swap usage while a lot of RAM is
still free?

Long:
I have a dual-socket Xeon system (E5620 + L5630... yes, not the same,
but compatible enough to be able to run together) with 64 GB RAM. I
run -current on it (currently at r333966, as it was for all the
tests below).

What I see with NUMA enabled and no ZFS patches is that at some point
half of the RAM is free, swap is in use, and after a lot of port
compiles in different jails ZFS comes to a halt (sometimes I can
unblock it by killing a compile, sometimes I can't even kill anything
and the only way out is a power cycle). I've seen this around twice a
week.

When I keep NUMA enabled and apply this ZFS patch
https://reviews.freebsd.org/D7538 the behavior changes. After a while
half of the RAM is free, swap is in use, and after enough port
compiles in jails I get a panic (unfortunately there is not enough
debug info in the textdump to know exactly what the problem is).

For the past 2 weeks I have had NUMA compiled out of the kernel (with
the ZFS patch still applied). The system is down to 17 GB free and NO
swap is in use. I'm compiling ports in 16 jails (one of them with
parts of KDE5 = currently about 700 ports compiled) and have not seen
a single issue like the ones above.

For everyone with swap issues or ZFS issues similar to the ones I
see... do you have NUMA enabled, and can you please try without it
and report back?
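
For reports, it may help to state how many memory domains the running
kernel actually uses. A minimal sketch, assuming the vm.ndomains
sysctl is present on your kernel and assuming (not verified here)
that it reads 1 when NUMA is compiled out or the machine has a single
domain. The helper names are made up for illustration:

```python
import subprocess


def parse_ndomains(sysctl_output: str) -> int:
    """Parse the output of `sysctl vm.ndomains`, with or without -n."""
    text = sysctl_output.strip()
    # Without -n the output looks like "vm.ndomains: 2"; with -n just "2".
    value = text.split(":", 1)[1] if ":" in text else text
    return int(value)


def numa_domains() -> int:
    """Ask the running kernel for its memory domain count (FreeBSD only)."""
    out = subprocess.run(
        ["sysctl", "-n", "vm.ndomains"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_ndomains(out)
```

Calling numa_domains() on the affected machine and including the
number in the report, together with free RAM and swap usage, should
be enough to compare setups.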

Can it be that if a memory request cannot be fulfilled from one NUMA
domain, there is no fallback to another NUMA domain for all the
various kinds of memory allocation we have in the kernel
(contigmem/no-sleep/...)?
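
To make the suspicion concrete, here is a toy model of it. This is
NOT how the FreeBSD VM works and none of these names exist in the
kernel; it is a sketch under the assumption that some allocation
paths only ever look at their preferred domain. With two equal
domains and all requests preferring domain 0, the strict variant
starts failing while half of the total memory is still free:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Domain:
    """One memory domain, reduced to a free-page counter (toy model)."""
    free_pages: int


@dataclass
class ToyAllocator:
    domains: List[Domain]
    fallback: bool  # True: try the other domains if the preferred one is short

    def alloc(self, preferred: int, pages: int) -> Optional[int]:
        """Return the index of the domain that served the request, or None."""
        order = [preferred]
        if self.fallback:
            order += [i for i in range(len(self.domains)) if i != preferred]
        for i in order:
            if self.domains[i].free_pages >= pages:
                self.domains[i].free_pages -= pages
                return i
        return None

    def total_free(self) -> int:
        return sum(d.free_pages for d in self.domains)


def run(fallback: bool):
    """Issue 12 one-page requests that all prefer domain 0."""
    alloc = ToyAllocator([Domain(8), Domain(8)], fallback=fallback)
    failures = 0
    for _ in range(12):
        if alloc.alloc(preferred=0, pages=1) is None:
            failures += 1
    return failures, alloc.total_free()


if __name__ == "__main__":
    print("strict  :", run(fallback=False))  # (4, 8): 4 failures, 8 of 16 free
    print("fallback:", run(fallback=True))   # (0, 4): no failures
```

In the strict case a real system would presumably react to such
failures by paging or blocking, which would look a lot like "swap in
use with half the RAM free". Whether any kernel allocation path
really behaves like the strict variant is exactly the open question.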

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander_at_Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netchild_at_FreeBSD.org  : PGP 0x8F31830F9F2772BF

Received on Sun Jun 24 2018 - 08:03:49 UTC
