blubee blubeeme gurenchan at gmail.com wrote on Mon Aug 20 03:02:01 UTC 2018 :

> I am running current compiling LLVM60 and when it comes to linking
> basically all the processes on my computer gets killed; Chrome, Firefox and
> some of the LLVM threads as well
> . . .
> last pid: 20965;  load averages: 0.64, 5.79, 7.73  up 12+01:35:46  11:00:36
> 76 processes: 1 running, 75 sleeping
> CPU: 0.8% user, 0.5% nice, 1.0% system, 0.0% interrupt, 98.1% idle
> Mem: 10G Active, 3G Inact, 100M Laundry, 13G Wired, 6G Free
> ARC: 4G Total, 942M MFU, 1G MRU, 1M Anon, 43M Header, 2G Other
>      630M Compressed, 2G Uncompressed, 2.74:1 Ratio
> Swap: 2G Total, 1G Used, 739M Free, 63% Inuse
> . . .

The timing of that top output relative to the first (or any) OOM kill of a process is not clear. After? Just before? How long before? What things looked like leading up to the first kill is of interest.

Folks that deal with this are likely to want to know if you got console messages (or /var/log/messages content) such as:

pid 49735 (c++), uid 0, was killed: out of swap space

(Note: "out of swap space" can be a misnomer for having low free RAM for "too long" [vm.pageout_oom_seq based], even with swap unused or little used.)

And: were you also getting messages like:

swap_pager_getswapspace(4): failed

and/or:

swap_pager: out of swap space

(These indicate the "killed: out of swap space" is not necessarily a misnomer relative to swap space, even if low free RAM over time is what drives the process kills.)

How about messages like:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 28139, size: 65536

or any I/O error or retry reports?

Notes:

Mark Johnston published a patch used for some investigations of the OOM killing:

https://people.freebsd.org/~markj/patches/slow_swap.diff

But this is tied to the swap I/O latencies involved and whether they are driving some of the time frames. It just adds more reporting to the console (and /var/log/messages). It is not a fix, and it may not report much for your context.

vm.pageout_oom_seq controls "how long low free RAM is tolerated" (my phrasing), though the units are not directly time. In various arm contexts with small boards, going from the default of 12 to 120 allowed things to complete or get much farther. So:

sysctl vm.pageout_oom_seq=120

but 120 is not the limit: it is a C int parameter. I'll note that "low free RAM" is as FreeBSD classifies it, whatever the details are.

Most of the arm examples have been small-memory contexts, and many of them likely avoid ZFS and use UFS instead. ZFS, its ARC, and such add an additional complication to this type of issue. There are lots of reports around of the ARC growing too big. I do not know the status of -r336196 relative to ZFS/ARC memory management, or if more recent versions have improvements. (I do not use ZFS normally.) I've seen messages making suggestions for controlling the growth, but I'm no ZFS expert.

Just to give an idea of what is sufficient to build devel/llvm60: on a Pine64+ 2GB (so 2 GiBytes of RAM in an aarch64 context with 4 cores, 1 HW-thread per core) running -r337400, using UFS on a USB drive and a swap partition on that drive too, I have built devel/llvm60 2 times via poudriere-devel: just one builder allowed, but that builder allowed to use all 4 cores in parallel; about 14.5 hr each time. (Different USB media each time.) This did require the:

sysctl vm.pageout_oom_seq=120

Mark Johnston's slow_swap.diff patch code did not report any I/O latency problems in the swap subsystem.

I've also built lang/gcc8 2 times, about 12.5 hrs each time.

No ZFS, no ARC, no Chrome, no Firefox. Nothing else major going on beyond the devel/llvm60 build (or, later, the lang/gcc8 build) in each case.
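As an aside, and just as a sketch assuming the default syslogd setup (kernel messages ending up in /var/log/messages, and the search pattern here being only an illustration): after the fact you can look for evidence of the kills and of the swap_pager complaints with something like:

grep -E "was killed: out of swap|swap_pager" /var/log/messages

And if raising the threshold helps, a line in /etc/sysctl.conf makes the setting survive reboots:

# tolerate low free RAM for longer before OOM kills (default is 12)
vm.pageout_oom_seq=120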
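On the ZFS side, one common suggestion I've seen (treat it as an illustration only, since this is not my area, and the value below is just an example, not a recommendation tuned to your machine) is to cap the ARC via a loader tunable, i.e. a line in /boot/loader.conf such as:

vfs.zfs.arc_max="4G"

That at least keeps the ARC from competing for most of the RAM while the big llvm60 link steps are running, but whether such a cap is appropriate for your workload I cannot say.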
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)