Re: After update to r357104 build of poudriere jail fails with 'out of swap space'

From: Cy Schubert <Cy.Schubert_at_cschubert.com>
Date: Tue, 28 Jan 2020 11:33:30 -0800
On January 27, 2020 2:25:59 PM PST, Mark Millard <marklmi_at_yahoo.com> wrote:
>
>
>On 2020-Jan-27, at 12:48, Cy Schubert <Cy.Schubert at cschubert.com> wrote:
>
>> In message <BA0CE7D8-CFA1-40A3-BEFA-21D0C230B082_at_yahoo.com>, Mark Millard
>> writes:
>>> 
>>> 
>>> 
>>> On 2020-Jan-27, at 10:20, Cy Schubert <Cy.Schubert at cschubert.com> wrote:
>>> 
>>>> On January 27, 2020 5:09:06 AM PST, Cy Schubert <Cy.Schubert_at_cschubert.com>
>>>> wrote:
>>>>>> . . . 
>>>>> 
>>>>> Setting a lower arc_max at boot is unlikely to help. Rust was building on
>>>>> the 8 GB and 5 GB 4 core machines last night. It completed successfully on
>>>>> the 8 GB machine, while using 12 MB of swap. ARC was at 1307 MB.
>>>>> 
>>>>> On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was
>>>>> used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.
>>>> 
>>>> MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4 core machine. ARC is
>>>> at 534 MB with 12 MB swap used.
>>> 
>>> If you increase vm.pageout_oom_seq to, say, 10 times what you now use,
>>> does MAKE_JOBS_NUMBER=4 complete --or at least go notably longer before
>>> getting OOM behavior from the system? (The default is 12 last I checked.
>>> So that might be what you are now using.)
>> 
>> It's already 4096 (default is 12).
>
>Wow. Then the count of tries to get free RAM above the threshold
>does not seem likely to be the source of the OOM kills.
>
>>> 
>>> Have you tried also having: vm.pfault_oom_attempts="-1" (Presuming
>>> you are not worried about actually running out of swap/page space,
>>> or can tolerate a deadlock if it does run out.) This setting presumes
>>> head, not release or stable. (Last I checked anyway.)
>> 
>> Already there.
>
>Then page-out delay does not seem likely to be the source of the OOM
>kills.
>
>> The box is a sandbox with remote serial console access so deadlocks
>> are ok.
>> 
>>> 
>>> It would be interesting to know what difference those two settings
>>> together might make for your context: it seems to be a good context
>>> for testing in this area. (But you might already have set them.
>>> If so, it would be good to report the figures in use.)
>>> 
>>> Of course, my experiment ideas need not be your actions.
>> 
>> It's a sandbox machine. We already know 8 GB works with 4 threads on as
>> many cores. And, 5 GB works with 3 threads on 4 cores.
>
>It would be nice to find out what category of issue in the kernel
>is driving the OOM kills for your 5GB context with MAKE_JOBS_NUMBER=4.
>Too bad the first kill does not report a backtrace spanning the
>code choosing to do the kill (or otherwise report the type of issue
>leading to the kill).
>
>Your report is consistent with what the small arm board folks have been
>reporting recently: contexts that were doing buildworld and the like fine
>under somewhat older kernels have started getting OOM kills, despite the
>two settings.
>
>At the moment I'm not sure how to find the category or categories of
>issue driving these OOM kills.
>
>Thanks for reporting what settings you were using.
>
>===
>Mark Millard
>marklmi at yahoo.com
>( dsl-only.net went
>away in early 2018-Mar)

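For reference, the knobs discussed above boil down to something like the following (a rough sketch only; both sysctls should also be settable at boot via loader.conf, and the poudriere make.conf path is an assumption -- adjust for your layout):

    # /etc/sysctl.conf (or /boot/loader.conf) -- settings discussed above
    # number of unsuccessful page-out passes before the OOM killer fires (default 12)
    vm.pageout_oom_seq=4096
    # disable OOM kills triggered by page-fault stalls; may deadlock if swap truly runs out
    vm.pfault_oom_attempts=-1

    # /usr/local/etc/poudriere.d/make.conf -- cap parallel compile jobs (e.g. for rust)
    MAKE_JOBS_NUMBER=3

    # check the values currently in effect
    sysctl vm.pageout_oom_seq vm.pfault_oom_attempts
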
I've been able to reproduce the problem at $JOB in a VirtualBox VM with 1 vCPU, 1.5 GB vRAM, and 2 GB swap while building graphics/graphviz: cc was killed with 'out of swap space'. The killed cc had an address space of ~500 MB, with only 43 MB of the 2 GB swap in use. Free memory is exhausted, yet swap use never exceeds a few tens of MB. Doubling the swap to 4 GB had no effect. The VM doesn't use ZFS.
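For anyone who wants to watch this happen, a rough sketch (the jail name 'head' is a placeholder; whether the build goes through poudriere or a plain ports make shouldn't matter for the monitoring side):

    # start the build that triggers the kills
    poudriere bulk -j head graphics/graphviz

    # in another session, watch free pages and swap while it runs
    while :; do
        sysctl vm.stats.vm.v_free_count vm.stats.vm.v_inactive_count
        swapinfo -h
        sleep 10
    done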

This appears to be a recent regression.
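The kill reason does at least land in the message buffer, so after the fact something like this (just a sketch) shows which processes were shot and why:

    dmesg | grep 'was killed:'
    grep 'was killed:' /var/log/messages
    # e.g. "pid 12345 (cc), uid 0, was killed: out of swap space"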


-- 
Pardon the typos and autocorrect, small keyboard in use. 
Cy Schubert <Cy.Schubert_at_cschubert.com>
FreeBSD UNIX: <cy_at_FreeBSD.org> Web: https://www.FreeBSD.org

The need of the many outweighs the greed of the few.

Sent from my Android device with K-9 Mail. Please excuse my brevity.