On 27 Mar, Andriy Gapon wrote:
> On 24/03/2018 01:21, Bryan Drewery wrote:
>> On 3/20/2018 12:07 AM, Peter Jeremy wrote:
>>>
>>> On 2018-Mar-11 10:43:58 -1000, Jeff Roberson <jroberson_at_jroberson.net> wrote:
>>>> Also, if you could try going back to r328953 or r326346 and let me know if
>>>> the problem exists in either.  That would be very helpful.  If anyone is
>>>> willing to debug this with me contact me directly and I will send some
>>>> test patches or debugging info after you have done the above steps.
>>>
>>> I ran into this on 11-stable and tracked it to r326619 (MFC of r325851).
>>> I initially got around the problem by reverting that commit, but either
>>> it or something very similar is still present in 11-stable r331053.
>>>
>>> I've seen it on my main server (32GB RAM) but haven't managed to reproduce
>>> it in smaller VBox guests - one difficulty I faced was artificially
>>> filling the ARC.
>
> First, it looks like several different issues are being discussed and
> possibly conflated in this thread.  I see reports related to ZFS, and I see
> reports where ZFS is not used at all.  Some people report problems that
> appeared very recently, while others chime in with "yes, yes, I've always
> had this problem".  This does not help to differentiate between the
> problems or to analyze them.
>
>> Looking at the ARC change you referred to from r325851
>> https://reviews.freebsd.org/D12163, I am convinced that ARC backpressure
>> is completely broken.
>
> Does your being convinced come from code review or from experiments?
> If the former, could you please share your analysis?
>
>> On my 78GB RAM system with ARC limited to 40GB and
>> doing a poudriere build of all LLVM and GCC packages at once in tmpfs, I
>> can get swap up near 50GB and yet the ARC remains at 40GB through it
>> all.  It's always been slow to give up memory for package builds, but it
>> really seems broken right now.
>
> Well, there are multiple angles.  Maybe it's the ARC that does not react
> properly, or maybe it's not being asked properly.
>
> It would be useful to monitor the system during its transition to the state
> that you reported.  There are some interesting DTrace probes in the ARC;
> arc-available_memory and arc-needfree are the first that come to mind.
> Their parameters and how frequently they are called are of interest.  VM
> parameters could be of interest as well.
>
> A rant.
>
> Basically, posting some numbers and jumping to conclusions does not help at
> all.  Monitoring, graphing, etc. does help.
> The ARC is a complex dynamic system.
> The VM (pagedaemon, UMA caches) is a complex dynamic system.
> They interact in complex, dynamic ways.
> Sometimes a change in the ARC is incorrect and requires a fix.
> Sometimes a change in the VM is incorrect and requires a fix.
> Sometimes a change in the VM requires a change in the ARC.
> These three kinds of problems can all appear as a "problem with the ARC".
>
> For instance, when vm.lowmem_period was introduced you wouldn't find any
> mention of ZFS/ARC, but it does affect the period between arc_lowmem()
> calls.
>
> Also, pinpointing a specific commit requires proper bisecting and proper
> testing to correctly attribute systemic behavior changes to code changes.

I just upgraded my main package build box (12.0-CURRENT, 8 cores, 32 GB RAM)
from r327616 to r331716.  Before the upgrade I was seeing higher swap usage
and larger ARC sizes than I remember from the distant past, but the ARC was
still at least somewhat responsive to memory pressure and I didn't notice
any performance issues.
After the upgrade, the ARC size seems to be pretty unresponsive to memory
demand.  Currently the machine is near the end of a poudriere run to build
my usual set of ~1800 ports.  The only build currently running is chromium,
and the machine is paging heavily.  Settings of interest are:

    USE_TMPFS="wrkdir data localbase"
    ALLOW_MAKE_JOBS=yes

last pid: 96239;  load averages:  1.86,  1.76,  1.83   up 3+14:47:00  12:38:11
108 processes: 3 running, 105 sleeping
CPU: 18.6% user,  0.0% nice,  2.4% system,  0.0% interrupt, 79.0% idle
Mem: 129M Active, 865M Inact, 61M Laundry, 29G Wired, 1553K Buf, 888M Free
ARC: 23G Total, 8466M MFU, 10G MRU, 5728K Anon, 611M Header, 3886M Other
     17G Compressed, 32G Uncompressed, 1.88:1 Ratio
Swap: 40G Total, 17G Used, 23G Free, 42% Inuse, 4756K In

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAN
96239 nobody      1  76    0   140M 93636K CPU5    5   0:01  82.90% clang-
96238 nobody      1  75    0   140M 92608K CPU7    7   0:01  80.81% clang-
 5148 nobody      1  20    0   590M   113M swread  0   0:31   0.29% clang-
57290 root        1  20    0 12128K  2608K zio->i  7   8:11   0.28% find
78958 nobody      1  20    0   838M   299M swread  0   0:23   0.19% clang-
97840 nobody      1  20    0   698M   140M swread  4   0:27   0.13% clang-
96066 nobody      1  20    0   463M   104M swread  1   0:32   0.12% clang-
11050 nobody      1  20    0   892M   154M swread  2   0:39   0.12% clang-

Pre-upgrade I was running r327616, which is newer than either of the commits
that Jeff mentioned above, so it looks like there has been a regression
since then.  I also don't recall seeing this problem on my Ryzen box, though
it has 2x the core count and 2x the RAM.  The last testing I did on it was
with r329844.
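While the machine is in this state I plan to watch the two ARC DTrace probes
that Andriy mentioned.  A minimal sketch of what I intend to run is below;
it only counts how often each probe fires per interval, since I haven't yet
checked arc.c to confirm what the probe arguments mean:

    # Count firings of the ARC SDT probes, reported every 10 seconds.
    dtrace -n '
        sdt:::arc-available_memory { @fires["available_memory"] = count(); }
        sdt:::arc-needfree         { @fires["needfree"]         = count(); }
        tick-10s { printa(@fires); trunc(@fires); }'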
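So that there is something to graph rather than a single point-in-time
snapshot, I'm also going to log the ARC and VM counters for the whole
poudriere run with a loop along these lines (the sysctl names are the ones
present on my 12.0-CURRENT box; double-check them on yours):

    #!/bin/sh
    # Append a timestamped sample of ARC size/target and page-queue
    # counters every 10 seconds for later graphing.
    while :; do
        date +%s
        sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c \
               vfs.zfs.arc_max vm.stats.vm.v_free_count \
               vm.stats.vm.v_inactive_count vm.stats.vm.v_laundry_count
        sleep 10
    done >> /var/tmp/arc-vm.log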