Re: rpi2 hangup during poudriere build: lots of pfault wmesg status

From: Laurent Cimon <laurent_at_nuxi.ca>
Date: Wed, 6 Dec 2017 16:54:36 -0500
> On Dec 6, 2017, at 00:57, Mark Millard <markmi_at_dsl-only.net> wrote:
> 
> I tried to build some ports on a rpi2
> (via poudriere) but it hung up:
> Ethernet and normal console use both
> stopped working. (Note: the root file
> system is on a USB SSD and the swap
> partition is also on that USB SSD.)
> 
> But ~^b worked for getting to the db>
> prompt on the console.
> 
> From there a ps suggests that it got hung
> up in pfault activity. (Possibly insufficient
> RAM+swap-partition space?) But it is not
> clear to me that it should end up hung up
> vs. killing processes or other such.

Hi,

From what I know, the Raspberry Pis use the same controller for Ethernet and
the USB hub on which you're hosting the SSD. You're making very heavy use of
the USB ports: every resource poudriere needs except the CPU and the (very
limited) memory that isn't swapped out is attached to them. If you really
didn't have enough memory and swap, the linkers would have been killed outright.

I think it might just be swap death. Poudriere compiles and fetches a lot in
parallel, and Ethernet and disk I/O are slow because that shared bus is very
limited, so linking takes longer. You end up linking a few very big binaries
at the same time, and they all fight for memory, trying to page back in from
swap through page faults; but there are too many page faults, all too big,
demanding more I/O and CPU time than they are allowed.

This would explain why three of the four CPUs poudriere allows builds on are
busy with linkers waiting on a page fault, on top of the awk processes. It
would also explain why you had easy access to the debugger: it was already
resident in memory along with the kernel.
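
If you want to confirm this next time, you can watch swap use and paging
from another session while the build runs (standard FreeBSD tools, nothing
poudriere-specific):

    # how much swap is in use, human-readable
    swapinfo -h

    # paging statistics every 5 seconds; sustained non-zero
    # "pi"/"po" columns point at thrashing rather than a driver hang
    vmstat -w 5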

I’d advise you to disable parallel builds and see if it happens again, though
that will make building much slower. Allowing make jobs instead would help if
you can afford to watch the build. Otherwise, be patient: it should resolve
itself eventually, but it will take a while and it will happen again.
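
Concretely, something like this (a sketch, assuming poudriere's default
file locations; adjust to your setup):

    # /usr/local/etc/poudriere.conf
    # build one port at a time instead of one per CPU
    PARALLEL_JOBS=1

    # /usr/local/etc/poudriere.d/make.conf
    # let each build itself run parallel make jobs, so a single
    # port can still use the CPUs when you can watch the build
    ALLOW_MAKE_JOBS=yes
    MAKE_JOBS_NUMBER=4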

Good luck,

Laurent