Re: After update to r357104 build of poudriere jail fails with 'out of swap space'

From: Cy Schubert <>
Date: Tue, 28 Jan 2020 11:33:30 -0800
On January 27, 2020 2:25:59 PM PST, Mark Millard <> wrote:
>On 2020-Jan-27, at 12:48, Cy Schubert <Cy.Schubert at>
>> In message <>, Mark writes:
>>> On 2020-Jan-27, at 10:20, Cy Schubert <Cy.Schubert at>
>>>> On January 27, 2020 5:09:06 AM PST, Cy Schubert
>>> wrote:
>>>>>> . . . 
>>>>> Setting a lower arc_max at boot is unlikely to help. Rust was
>>>>> built on the 8 GB and 5 GB 4 core machines last night. It
>>>>> completed on the 8 GB machine, while using 12 MB of swap. ARC
>>>>> was at 1307 MB. On the 5 GB 4 core machine the rust build died
>>>>> of OOM. 328 KB swap used. ARC was reported at 941 MB. arc_min
>>>>> on this machine is 489.2
>>>> MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4 core
>>>> machine. ARC is at 534 MB with 12 MB swap used.
>>> If you increase vm.pageout_oom_seq to, say, 10 times what you now
>>> use, does MAKE_JOBS_NUMBER=4 complete --or at least go notably
>>> longer before getting OOM behavior from the system? (The default
>>> is 12 last I checked. So that might be what you are now using.)
>> It's already 4096 (default is 12).
>Wow. Then the count of tries to get free RAM above the threshold
>does not seem likely to be the source of the OOM kills.
>>> Have you tried also having: vm.pfault_oom_attempts="-1" (Presuming
>>> you are not worried about actually running out of swap/page space,
>>> or can tolerate a deadlock if it does run out.) This setting is
>>> specific to head, not release or stable. (Last I checked anyway.)
>> Already there.
>Then page-out delay does not seem likely to be the source of the OOM
>kills.
>> The box is a sandbox with remote serial console access so deadlocks
>> are ok.
>>> It would be interesting to know what difference those two settings
>>> together might make for your context: it seems to be a good context
>>> for testing in this area. (But you might already have set them.
>>> If so, it would be good to report the figures in use.)
>>> Of course, my experiment ideas need not be your actions.
>> It's a sandbox machine. We already know 8 GB works with 4 threads on
>> many cores. And, 5 GB works with 3 threads on 4 cores.
>It would be nice to find out what category of issue in the kernel
>is driving the OOM kills for your 5GB context with MAKE_JOBS_NUMBER=4.
>Too bad the first kill does not report a backtrace spanning the
>code choosing to do the kill (or otherwise report the type of issue
>leading to the kill).
>Your report is consistent with the small arm board folks reporting that
>contexts that were doing buildworld and the like fine under somewhat
>older kernels have started getting OOM kills, despite the two settings.
>At the moment I'm not sure how to find the category(s) of issue(s) that
>is(are) driving these OOM kills.
>Thanks for reporting what settings you were using.
>Mark Millard
>marklmi at
>( went
>away in early 2018-Mar)
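[For reference, the two VM tunables discussed above can be set in
/boot/loader.conf. This is only a sketch using the values mentioned
in this thread; the commented arc_max value is illustrative, not
from the thread.]

```
# /boot/loader.conf
# Raise the number of passes the page daemon makes before declaring
# OOM (default is 12; the box in this thread uses 4096).
vm.pageout_oom_seq=4096
# Disable page-fault-driven OOM kills entirely (head only; risks a
# deadlock if swap truly runs out).
vm.pfault_oom_attempts="-1"
# Optionally cap the ZFS ARC in bytes (example value, not from the
# thread; not relevant on a non-ZFS VM).
#vfs.zfs.arc_max=2147483648
```

vm.pageout_oom_seq can also be changed at runtime with sysctl(8).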

I've been able to reproduce the problem at $JOB in a Virtualbox VM with 1 vCPU, 1.5 GB vRAM, and 2 GB swap while building graphics/graphviz: cc was killed "out of swap space". The killed cc had an address space of ~500 MB and was using only 43 MB of the 2 GB swap. Free memory is exhausted, but swap use never exceeds tens of MB. Doubling the swap to 4 GB had no effect. The VM doesn't use ZFS.
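[The MAKE_JOBS_NUMBER workaround mentioned earlier in the thread can
be applied per-port in the poudriere jail's make.conf. A sketch; the
path and the lang/rust example are illustrative assumptions:]

```
# /usr/local/etc/poudriere.d/make.conf
# Cap parallel build jobs for memory-hungry ports such as lang/rust,
# leaving the rest of the tree at full parallelism.
.if ${.CURDIR:M*/lang/rust}
MAKE_JOBS_NUMBER=3
.endif
```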

This appears to be a recent regression.

Pardon the typos and autocorrect, small keyboard in use. 
Cy Schubert <>
FreeBSD UNIX: <> Web:

The need of the many outweighs the greed of the few.

Sent from my Android device with K-9 Mail. Please excuse my brevity.
Received on Tue Jan 28 2020 - 18:34:00 UTC
