spurious out of swap kills

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Thu, 12 Sep 2019 16:00:17 -0700 (PDT)

My poudriere machine is running 13.0-CURRENT and gets updated to the
latest version of -CURRENT periodically.  At least in the last week or
so, I've been seeing occasional port build failures when building my
default set of ports, and I finally had some time to do some
investigation.

It's a 16-thread Ryzen machine, with 64 GB of RAM and 40 GB of swap.
Poudriere is configured with
  USE_TMPFS="wrkdir data localbase"
and I have
  .if ${.CURDIR:M*/www/chromium}
  MAKE_JOBS_NUMBER=16
  .else
  MAKE_JOBS_NUMBER=7
  .endif
in /usr/local/etc/poudriere.d/make.conf, since this gives me the best
overall build time for my set of ports.  This hits memory pretty hard,
especially when chromium, firefox, libreoffice, and both versions of
openoffice are all building at the same time.  While these large ports
are building, the tmpfs space consumed by /wrkdir grows large.  There is
not enough RAM to hold it all, so some of
the older data spills over to swap.  Swap usage peaks at about 10 GB,
leaving about 30 GB of free swap.  Nevertheless, I see these errors,
with rustc being the usual victim:

Sep 11 23:21:43 zipper kernel: pid 16581 (rustc), jid 43, uid 65534, was killed: out of swap space
Sep 12 02:48:23 zipper kernel: pid 1209 (rustc), jid 62, uid 65534, was killed: out of swap space
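
To tally kills like these across a longer log (for example
/var/log/messages), a small parser along the following lines could help.
This is only a minimal Python sketch: the regular expression is modeled
solely on the two messages quoted above, so other kernel versions may
need a different pattern, and the names parse_oom_kills and
victims_by_command are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the two kernel messages quoted above (assumption:
# other FreeBSD versions may word or order the fields differently).
OOM_KILL_RE = re.compile(
    r"kernel: pid (?P<pid>\d+) \((?P<comm>[^)]+)\), "
    r"jid (?P<jid>\d+), uid (?P<uid>\d+), "
    r"was killed: out of swap space"
)


def parse_oom_kills(lines):
    """Return one dict per 'out of swap space' kill found in lines."""
    kills = []
    for line in lines:
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append({
                "pid": int(match.group("pid")),
                "comm": match.group("comm"),
                "jid": int(match.group("jid")),
                "uid": int(match.group("uid")),
            })
    return kills


def victims_by_command(kills):
    """Count kills per command name, e.g. to see how often rustc is hit."""
    return Counter(kill["comm"] for kill in kills)


# The two log lines from this report, used as sample input.
SAMPLE = [
    "Sep 11 23:21:43 zipper kernel: pid 16581 (rustc), jid 43, "
    "uid 65534, was killed: out of swap space",
    "Sep 12 02:48:23 zipper kernel: pid 1209 (rustc), jid 62, "
    "uid 65534, was killed: out of swap space",
]

if __name__ == "__main__":
    for comm, count in victims_by_command(parse_oom_kills(SAMPLE)).most_common():
        print(f"{comm}: {count}")
```

Grouping by the jid field instead of the command name would show which
poudriere builder jails were affected.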

Top shows the size of rustc being about 2 GB, so I doubt that it
suddenly needs an additional 30 GB of swap.

I'm wondering whether a transient kmem shortage might be causing a
malloc(..., M_NOWAIT) failure in the swap allocation path, and whether
that failure is what triggers these kills.
Received on Thu Sep 12 2019 - 21:00:19 UTC
