Re: A reliable port cross-build failure (hangup) in my context (amd64->armv7 cross build, with native-tool speedup involved)

From: Mark Millard <marklmi_at_yahoo.com>
Date: Fri, 28 Dec 2018 12:12:06 -0800

On 2018-Dec-28, at 05:13, Michal Meloun <melounmichal at gmail.com> wrote:

> Mark,
> this is a known problem with qemu-user-static.
> Emulation of every single interruptible syscall is broken by design (it
> has signal-related races). These races cannot be solved without a major
> rewrite of the syscall emulation code.
> Unfortunately, nobody actively works on this, I think.
> 

Thanks for the note setting some expectations.

Based on the evidence that I have, I expect that more is going on than that:

A) The hang-up always happens and always in the same place. So
it would appear that no race is involved.

B) (A) holds even when varying the number of builders in parallel
(so other builds are also happening) and the number of jobs allowed
per builder. It also fails with only one builder that is allowed only
one process. (My traces come from that last kind of context.)

C) The problem started on the package-building servers for armv7
and armv6 without qemu-user-static having an update (FreeBSD and
cmake had updates, for example).

D) The problem is only observed when targeting armv7 and armv6, as
far as I can tell. I've never seen it for aarch64, neither in my
own builds nor in the package-building server history that I
looked at.

At least that is what got me started. (I've since learned that
qemu-user-static uses fork in place of a requested vfork.)

My ktrace/kdump experiment yesterday showed something odd for the
kevent that hangs in cmake:

93172 qemu-arm-static CALL  kevent(0x3,0x7ffffffe7d40,0x2,0x7ffffffd7d40,0x400,0)
93172 qemu-arm-static STRU  struct kevent[] = { { ident=6, filter=EVFILT_READ, flags=0x1<EV_ADD>, fflags=0, data=0, udata=0x0 }
             { ident=0x0, filter=<invalid=0>, flags=0, fflags=0x8, data=0x1ffff, udata=0x0 } }

Note the 0x2 argument to kevent and the apparently-odd 2nd entry in the struct
kevent[]. The kevent use is from cmake.
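
For anyone wanting to check a longer kdump for the same pattern, the
following is a minimal Python sketch, not a tested tool. It assumes the
CALL and STRU lines have the same shape as the two excerpts above. The
helper names (parse_kevent_call, parse_kevent_entries,
suspicious_entries) are hypothetical; the argument names come from the
kevent(2) prototype. By that prototype the 0x2 is nchanges, 0x400 is
nevents (1024), and the final 0 is a NULL timeout, which kevent(2)
documents as waiting indefinitely.

```python
import re

# Argument order per the kevent(2) prototype:
# kevent(kq, changelist, nchanges, eventlist, nevents, timeout)
KEVENT_ARGS = ("kq", "changelist", "nchanges", "eventlist", "nevents", "timeout")


def parse_kevent_call(line):
    """Map kevent argument names to ints for one kdump CALL line, else None."""
    m = re.search(r"CALL\s+kevent\(([^)]*)\)", line)
    if m is None:
        return None
    # int(tok, 0) accepts both "0x400" and plain "0".
    values = [int(tok, 0) for tok in m.group(1).split(",")]
    if len(values) != len(KEVENT_ARGS):
        return None
    return dict(zip(KEVENT_ARGS, values))


def parse_kevent_entries(text):
    """Return one dict of raw field strings per '{ ident=..., ... }' entry."""
    entries = []
    for body in re.findall(r"\{\s*(ident=[^{}]*)\}", text):
        fields = {}
        # Naive split: assumes no commas inside a field value (an assumption).
        for part in body.split(","):
            key, _, value = part.strip().partition("=")
            fields[key] = value
        entries.append(fields)
    return entries


def suspicious_entries(entries):
    """Indexes of entries whose filter kdump reported as <invalid=...>."""
    return [i for i, e in enumerate(entries)
            if e.get("filter", "").startswith("<invalid")]
```

Fed the two excerpts above, this reports nchanges as 2 and flags only
the second changelist entry, which matches what stands out by eye.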

So far I've not identified a signal being delivered at a time that would seem
to me to be likely to contribute. (But this is not familiar code so my judgment
is likely not the best.)

Note: I normally run FreeBSD using a non-debug kernel, even when using
head. (The kernel does have symbols.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Fri Dec 28 2018 - 19:22:28 UTC
