Re: FreeBSD head -r341836 amd64->aarch64 cross-build of -r484783 ports via poudriere: devel/qt5-testlib hung-up during "Checking for POSIX monotonic clock"

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 20 Dec 2018 12:28:51 -0800
[An amd64->armv7 cross build shows interesting hang-up behavior as
well, apparently highly repeatable in my current context.]

On 2018-Dec-19, at 16:21, Mark Millard <marklmi at yahoo.com> wrote:

> [I attached to the hung-up process with gdb and looked around a little.]
> 
> On 2018-Dec-19, at 13:58, Mark Millard <marklmi at yahoo.com> wrote:
> 
>> [Looks like a race or some such for devel/qt5-testlib: a retry of poudriere-devel
>> did not hang. The other hang-up seems to be repeating and I give some details.]
>> 
>> On 2018-Dec-19, at 12:20, Mark Millard <marklmi at yahoo.com> wrote:
>> 
>>> FYI: Based on FreeBSD head -r341836 (host and target) and ports -r484783. This
>>> was a rebuild after going from perl5.26 to perl5.28 without updating the ports
>>> tree, and after going from system clang 6 (in the prior FreeBSD-head context) to
>>> clang 7 this time. (I'm not attributing causes here.) poudriere was using
>>> amd64-native tools to speed up the cross-build.
>>> 
>>> # grep -r =perl5= /etc/ ~/src.configs/ /usr/local/etc/
>>> /etc/make.conf:DEFAULT_VERSIONS+=perl5=5.28 gcc=8
>>> /usr/local/etc/poudriere.d/make.conf:DEFAULT_VERSIONS+=perl5=5.28 gcc=8
>>> 
>>> There was also a "print/texinfo:configure/runaway" but I've not looked into
>>> it at all yet and it may be a while before I do. The other ports attempted
>>> built fine as far as I can tell so far.
>>> 
>>> 
>>> The devel/qt5-testlib failure looks like:
>>> 
>>> [00:00:13] Building 123 packages using 28 builders
>>> . . .
>>> [00:49:30] [10] [00:00:00] Building devel/qt5-testlib | qt5-testlib-5.11.2
>>> . . .
>>> [07:31:31] [10] [06:42:01] Saved devel/qt5-testlib | qt5-testlib-5.11.2 wrkdir to: /usr/local/poudriere/data/wrkdirs/FBSDFSSDjailCortexA57-default/default/qt5-testlib-5.11.2.tar
>>> [07:31:32] [10] [06:42:02] Finished devel/qt5-testlib | qt5-testlib-5.11.2: Failed: configure/runaway
>>> 
>>> With logs/errors/qt5-testlib-5.11.2.log showing:
>>> 
>>> Checking for POSIX monotonic clock... 
>>> + cd /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.tests/clock-monotonic && /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/bin/qmake "CONFIG -= qt debug_and_release app_bundle lib_bundle" "CONFIG += shared warn_off console single_arch" /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.tests/clock-monotonic
>>> + cd /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.tests/clock-monotonic && MAKEFLAGS= make
>>> =>> Killing runaway build after 21600 seconds with no output
>>> =>> Cleaning up wrkdir
>>> ===>  Cleaning for qt5-testlib-5.11.2
>>> Killed
>>> build of devel/qt5-testlib | qt5-testlib-5.11.2 ended at Wed Dec 19 06:45:42 PST 2018
>>> build time: 06:41:46
>>> !!! build failure encountered !!!
>>> 
>>> 
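As a rough cross-check of the log above, the sketch below converts the figures it reports ("21600 seconds with no output" and "build time: 06:41:46") to estimate when output stopped. The helper names are hypothetical, and the estimate assumes the no-output timer started at the last line written, so it is only approximate.

```python
def hms_to_seconds(text):
    """Convert an HH:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total):
    """Convert a number of seconds into an HH:MM:SS string."""
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Figures copied from the log: the no-output limit and the reported build time.
no_output_limit = 21600
build_time = hms_to_seconds("06:41:46")

print(seconds_to_hms(no_output_limit))               # 06:00:00
print(seconds_to_hms(build_time - no_output_limit))  # 00:41:46
```

So the limit is six hours, and the last output would have been written roughly 42 minutes into the build of the port.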
>>> # less /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.log
>>> . . .
>>> test config.qtbase_corelib.libraries.librt succeeded
>>> executing config test clock-monotonic
>>> + cd /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.tests/clock-monotonic && /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/bin/qmake "CONFIG -= qt debug_and_release app_bundle lib_bundle" "CONFIG += shared warn_off console single_arch" /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.tests/clock-monotonic
>>> + cd /wrkdirs/usr/ports/devel/qt5-testlib/work/qtbase-everywhere-src-5.11.2/config.tests/clock-monotonic && MAKEFLAGS= make
>>> 
>>> 
>>> Some supporting details of context:
>>> 
>>> # uname -apKU
>>> FreeBSD FBSDFSSD 13.0-CURRENT FreeBSD 13.0-CURRENT #5 r341836M: Tue Dec 11 16:37:42 PST 2018     markmi_at_FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG  amd64 amd64 1300005 1300005
>>> 
>>> # svnlite info /usr/ports/ | grep "Re[plv]"
>>> Relative URL: ^/head
>>> Repository Root: svn://svn.freebsd.org/ports
>>> Repository UUID: 35697150-7ecd-e111-bb59-0022644237b5
>>> Revision: 484783
>>> Last Changed Rev: 484783
>>> 
>> 
>> I started poudriere up again with just the 2 needing to be rebuilt (plus
>> what depends on the 2). devel/qt5-testlib quickly completed just fine:
>> 
>> [00:02:16] [02] [00:00:00] Building devel/qt5-testlib | qt5-testlib-5.11.2
>> [00:04:54] [02] [00:02:38] Finished devel/qt5-testlib | qt5-testlib-5.11.2: Success
>> 
>> 
>> In the prior build that had the hang-ups I looked, and for print/texinfo:
>> 
>> /wrkdirs/usr/ports/print/texinfo/work/texinfo-6.5/config.log shows for its
>> hang-up:
>> 
>> . . .
>> configure:6639: checking for alloca
>> configure:6676: /nxb-bin/usr/bin/cc -o conftest -O2 -pipe -mcpu=cortex-a57  -DLIBICONV_PLUG -g -fno-strict-aliasing  -mcpu=cortex-a57 -DLIBICONV_PLUG -D_THREAD_SAFE   conftest.c  >&5
>> configure:6676: $? = 0
>> configure:6684: result: yes
>> configure:6794: checking for C/C++ restrict keyword
>> configure:6821: /nxb-bin/usr/bin/cc -c -O2 -pipe -mcpu=cortex-a57  -DLIBICONV_PLUG -g -fno-strict-aliasing  -mcpu=cortex-a57 -DLIBICONV_PLUG -D_THREAD_SAFE conftest.c >&5
>> configure:6821: $? = 0
>> configure:6829: result: __restrict
>> configure:6844: checking whether // is distinct from /
>> 
>> 
>> In the poudriere re-run, print/texinfo seems not to be progressing:
>> 
>> root       87913    0.0  0.0  12920  3668  0  I    13:29       0:00.06 | |           `-- sh: poudriere[FBSDFSSDjailCortexA57-default][01]: build_pkg (texinfo-6.5_1,1) (sh)
>> root       88869    0.0  0.0  12920  3660  0  I    13:29       0:00.00 | |             `-- sh: poudriere[FBSDFSSDjailCortexA57-default][01]: build_pkg (texinfo-6.5_1,1) (sh)
>> root       88870    0.0  0.0  10412  1848  0  IJ   13:29       0:00.01 | |               `-- /usr/bin/make -C /usr/ports/print/texinfo configure
>> root       88974    0.0  0.0  10272  1812  0  IJ   13:30       0:00.00 | |                 `-- /bin/sh -e -c (cd /wrkdirs/usr/ports/print/texinfo/work/texinfo-6.5 &&  _LATE_CONFIGURE_ARGS="" ;  if [ 
>> root       89283    0.0  0.0  11160  2108  0  IJ   13:30       0:00.10 | |                   `-- /bin/sh ./configure --enable-nls --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --di
>> root       89692    0.0  0.0 227368 14504  0  IJ   13:30       0:00.03 | |                     `-- /usr/local/bin/qemu-aarch64-static wc //dev/null
>> root       89694    0.0  0.0 227424 14596  0  IJ   13:30       0:00.01 | |                       `-- /usr/local/bin/qemu-aarch64-static wc //dev/null
>> root       89695    0.0  0.0 227584 14720  0  IJ   13:30       0:00.01 | |                         `-- wc: zygote (qemu-aarch64-static)
>> 
>> 
>> So it appears that:
>> 
>> /usr/local/bin/qemu-aarch64-static wc //dev/null
>> 
>> is hanging-up (again).
>> 
>> 
>> Given that these are hang-ups, I'll note that this is a Ryzen
>> Threadripper 1950X context running under Hyper-V from
>> Windows 10's 1809 update. I gave it 28 logical processors and
>> set the virtual NUMA topology to match the topology of
>> the physical hardware: "Use Hardware Topology". (Processors
>> 28, NUMA nodes 2, Sockets 1, Hardware threads per core 2.)
> 
> Attaching to the stuck process via gdb and looking at the backtrace
> shows:
> 
> (gdb) attach 89695
> Attaching to program: /usr/local/bin/qemu-aarch64-static, process 89695
> [New LWP 101548 of process 89695]
> [Switching to LWP 100804 of process 89695]
> _pselect () at _pselect.S:3
> 3	PSEUDO(pselect)
> 
> (gdb) bt
> #0  _pselect () at _pselect.S:3
> #1  0x00000000601da57f in __thr_pselect (count=12, rfds=0x7ffffffe3650, wfds=0x0, efds=0x0, timo=0x0, mask=0x7ffffffe3600) at /usr/src/lib/libthr/thread/thr_syscalls.c:378
> #2  0x000000006004928d in do_freebsd_select (env=0x860edfb18, n=<optimized out>, rfd_addr=140736934698744, wfd_addr=<optimized out>, efd_addr=0, target_tv_addr=0)
>    at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-4ef7d07/bsd-user/freebsd/os-time.h:468
> #3  do_freebsd_syscall (cpu_env=0x860edfb18, num=93, arg1=12, arg2=140736934698744, arg3=0, arg4=0, arg5=0, arg6=274914043516, arg7=274913946564, arg8=6579811)
>    at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-4ef7d07/bsd-user/syscall.c:1106
> #4  0x000000006003903c in target_cpu_loop (env=0x860edfb18) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-4ef7d07/bsd-user/aarch64/target_arch_cpu.h:100
> #5  0x0000000060038e09 in cpu_loop (env=0xc) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-4ef7d07/bsd-user/main.c:121
> #6  0x0000000060039ecb in main (argc=<optimized out>, argv=0x7fffffffd360) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-4ef7d07/bsd-user/main.c:513
> 
> (gdb) up
> #1  0x00000000601da57f in __thr_pselect (count=12, rfds=0x7ffffffe3650, wfds=0x0, efds=0x0, timo=0x0, mask=0x7ffffffe3600) at /usr/src/lib/libthr/thread/thr_syscalls.c:378
> 378		ret = __sys_pselect(count, rfds, wfds, efds, timo, mask);
> (gdb) print *rfds
> $1 = {__fds_bits = {2048, 0 <repeats 15 times>}}
> 
> (gdb) info threads
>  Id   Target Id                   Frame 
> * 1    LWP 100804 of process 89695 _pselect () at _pselect.S:3
>  2    LWP 101548 of process 89695 _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 
> (gdb) thread 2 
> [Switching to thread 2 (LWP 101548 of process 89695)]
> #0  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 37	RSYSCALL_ERR(_umtx_op)
> 
> (gdb) bt
> #0  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1  0x00000000601d2ec0 in _thr_umtx_timedwait_uint (mtx=0x861027008, id=<optimized out>, clockid=<optimized out>, abstime=<optimized out>, shared=<optimized out>)
>    at /usr/src/lib/libthr/thread/thr_umtx.c:236
> #2  0x00000000601dc6f8 in cond_wait_user (cvp=<optimized out>, mp=0x860515b00, abstime=0x0, cancel=1) at /usr/src/lib/libthr/thread/thr_cond.c:307
> #3  cond_wait_common (cond=<optimized out>, mutex=<optimized out>, abstime=0x0, cancel=1) at /usr/src/lib/libthr/thread/thr_cond.c:367
> #4  0x00000000601438bc in qemu_futex_wait (ev=<optimized out>, val=4294967295) at util/qemu-thread-posix.c:350
> #5  qemu_event_wait (ev=0x62735d10 <rcu_call_ready_event>) at util/qemu-thread-posix.c:445
> #6  0x000000006014a92a in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:255
> #7  0x00000000601dc376 in thread_start (curthread=0x860518e00) at /usr/src/lib/libthr/thread/thr_create.c:291
> #8  0x0000000000000000 in ?? ()
> Backtrace stopped: Cannot access memory at address 0x7fffdfdfc000
> 
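
To read the "print *rfds" output above, here is a minimal sketch that lists which descriptors are set in the dumped fd_set. It assumes 64-bit words in __fds_bits (16 words of 64 bits would give the usual 1024-descriptor set on an amd64 host); the function name is hypothetical.

```python
def fds_in_set(fds_bits, bits_per_word=64):
    """Return the descriptor numbers whose bits are set in an fd_set dump."""
    fds = []
    for word_index, word in enumerate(fds_bits):
        bit = 0
        while word:
            if word & 1:
                fds.append(word_index * bits_per_word + bit)
            word >>= 1
            bit += 1
    return fds


# __fds_bits as printed by gdb: {2048, 0 <repeats 15 times>}
rfds = [2048] + [0] * 15
print(fds_in_set(rfds))  # [11]
```

Under that assumption, the main thread is blocked in pselect waiting only for descriptor 11 to become readable, with no timeout (timo=0x0), which fits count=12. What descriptor 11 refers to cannot be told from the backtrace alone.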


In a separate rebuild of 124 ports, print/texinfo repeated its
problem.

In a rebuild targeting armv7, multimedia/gstreamer1-qt hung up and timed
out. Another poudriere run also hung up:

root       33719    0.0  0.0  12920  3528  0  I    11:40       0:00.03 | |           `-- sh: poudriere[FBSDFSSDjailArmV7-default][02]: build_pkg (gstreamer1-qt5-1.2.0_14) (sh)
root       41551    0.0  0.0  12920  3520  0  I    11:43       0:00.00 | |             `-- sh: poudriere[FBSDFSSDjailArmV7-default][02]: build_pkg (gstreamer1-qt5-1.2.0_14) (sh)
root       41552    0.0  0.0  10340  1744  0  IJ   11:43       0:00.01 | |               `-- /usr/bin/make -C /usr/ports/multimedia/gstreamer1-qt FLAVOR=qt5 build
root       41566    0.0  0.0  10236  1796  0  IJ   11:43       0:00.00 | |                 `-- /bin/sh -e -c (cd /wrkdirs/usr/ports/multimedia/gstreamer1-qt/work-qt5/.build; if ! /usr/bin/env QT_SELE
root       41567    0.0  0.0  89976 12896  0  IJ   11:43       0:00.07 | |                   `-- /usr/local/bin/qemu-arm-static ninja -j28 -v all
root       41585    0.0  0.0 102848 25056  0  IJ   11:43       0:00.10 | |                     |-- /usr/local/bin/qemu-arm-static /usr/local/bin/cmake -E cmake_autogen /wrkdirs/usr/ports/multimedia/g
root       41586    0.0  0.0 102852 25072  0  IJ   11:43       0:00.11 | |                     `-- /usr/local/bin/qemu-arm-static /usr/local/bin/cmake -E cmake_autogen /wrkdirs/usr/ports/multimedia/g

or as top showed it:

41552 root          1  52    0    10M  1744K    0 wait    15   0:00   0.00% /usr/bin/make -C /usr/ports/multimedia/gstreamer1-qt FLAVOR=qt5 build
41566 root          1  52    0    10M  1796K    0 wait     1   0:00   0.00% /bin/sh -e -c (cd /wrkdirs/usr/ports/multimedia/gstreamer1-qt/work-qt5/.build; if ! /usr/bin/env QT_SELECT=qt5 QMAKEMODULES
41567 root          2  52    0    88M    13M    0 select   4   0:00   0.00% /usr/local/bin/qemu-arm-static ninja -j28 -v all
41585 root          2  52    0   100M    24M    0 kqread   8   0:00   0.00% /usr/local/bin/qemu-arm-static /usr/local/bin/cmake -E cmake_autogen /wrkdirs/usr/ports/multimedia/gstreamer1-qt/work-qt5/.
41586 root          2  52    0   100M    24M    0 kqread  22   0:00   0.00% /usr/local/bin/qemu-arm-static /usr/local/bin/cmake -E cmake_autogen /wrkdirs/usr/ports/multimedia/gstreamer1-qt/work-qt5/.

So: both cmake_autogen processes are waiting in kqread, with the parent
ninja process waiting in select. Repeated tries have gotten the same
result so far.
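
For repeated tries, a small sketch like the following could pull the wait states out of saved top output. The column positions are assumptions inferred from the five lines shown (the top header was not captured), the function name is hypothetical, and the command tails are trimmed here for brevity.

```python
# Lines copied from the top output above, with command tails trimmed.
TOP_OUTPUT = """\
41552 root          1  52    0    10M  1744K    0 wait    15   0:00   0.00% /usr/bin/make
41566 root          1  52    0    10M  1796K    0 wait     1   0:00   0.00% /bin/sh -e -c
41567 root          2  52    0    88M    13M    0 select   4   0:00   0.00% /usr/local/bin/qemu-arm-static ninja -j28 -v all
41585 root          2  52    0   100M    24M    0 kqread   8   0:00   0.00% /usr/local/bin/qemu-arm-static /usr/local/bin/cmake -E cmake_autogen
41586 root          2  52    0   100M    24M    0 kqread  22   0:00   0.00% /usr/local/bin/qemu-arm-static /usr/local/bin/cmake -E cmake_autogen
"""


def wait_states(top_text, state_col=8, cmd_col=12):
    """Return (pid, state, command) for each process line in top output.

    state_col and cmd_col are assumed positions, counted from zero.
    """
    rows = []
    for line in top_text.splitlines():
        # Split on whitespace, keeping the whole command as the last field.
        fields = line.split(None, cmd_col)
        if len(fields) <= cmd_col or not fields[0].isdigit():
            continue  # skip headers and anything that is not a process line
        rows.append((int(fields[0]), fields[state_col], fields[cmd_col]))
    return rows


# Keep only the emulated (qemu-user-static) processes.
qemu_waits = [(pid, state) for pid, state, cmd in wait_states(TOP_OUTPUT)
              if "qemu-" in cmd]
print(qemu_waits)
```

For the lines above this reports select for the ninja process (41567) and kqread for the two cmake processes (41585, 41586).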

If this keeps up, later I may be able to try a native FreeBSD boot
on the same machine instead of running under Hyper-V.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Thu Dec 20 2018 - 19:29:00 UTC
