Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork

From: Mark Millard <marklmi26-fbsd_at_yahoo.com>
Date: Sun, 18 Feb 2018 11:48:05 -0800

On 2018-Feb-18, at 10:08 AM, Mateusz Guzik <mjguzik at gmail.com> wrote:

> Can you please bisect this? There is another report stating that r329418 works fine.

I saw that Trond indicated an intent to test -r329418, but I've not seen
any report of results for -r329418 or of how much activity was used to
judge its status. But I can assume -r329418 is good if you want.

Bisecting is likely to be problematic for self-hosted updates: the builds
and installs themselves can crash, which makes the installs risky. I do not
have an alternate builder for amd64 set up.

Even without that, it is not clear how many hours of build-related activity
it takes to have a high probability that the problem is gone. (I've seen
widely variable amounts of activity between failures in -r329465 .) After a
failure it is obvious that an earlier version should be tried, but without
a failure it is not obvious when to move on to a later version.
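
As a rough sketch of that sizing question (an illustration, not a
measurement): the range from -r329418 to -r329465 has 47 candidate
revisions, so a bisection needs about 6 test steps in the worst case. If
one assumes, purely for illustration, that failures on a bad revision
arrive as a Poisson process, the failure-free run time needed before
treating a revision as good can be estimated as below. The function names,
the 4-hour mean time between failures, and the 95% confidence level are
all hypothetical; given the widely variable intervals I have seen, the
Poisson assumption itself may well be wrong.

```python
import math


def bisect_steps(good_rev: int, bad_rev: int) -> int:
    """Worst-case number of revisions to test when bisecting.

    The first bad revision is one of good_rev+1 .. bad_rev, and each
    test halves the candidate set.
    """
    candidates = bad_rev - good_rev
    if candidates < 1:
        raise ValueError("bad_rev must be later than good_rev")
    # (n - 1).bit_length() equals ceil(log2(n)) for integers n >= 1.
    return (candidates - 1).bit_length()


def clean_hours_needed(mean_hours_between_failures: float,
                       confidence: float) -> float:
    """Failure-free hours needed before calling a revision good.

    ASSUMPTION: failures on a bad revision follow a Poisson process, so
    P(no failure in t hours | bad) = exp(-t / mean). Solve for the t
    that makes this probability equal to 1 - confidence.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -mean_hours_between_failures * math.log(1.0 - confidence)


if __name__ == "__main__":
    steps = bisect_steps(329418, 329465)
    hours = clean_hours_needed(4.0, 0.95)  # hypothetical inputs
    print(f"bisection steps (worst case): {steps}")
    print(f"clean hours per 'good' verdict: {hours:.1f}")
    print(f"worst-case total hours: {steps * hours:.1f}")
```

Under those assumptions each "good" verdict costs roughly three times the
mean time between failures, which is what makes the bisection expensive
when the failure interval is long or poorly known.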

My FreeBSD time is also rather limited (compared to the last few years),
so the activity could be spread over parts of various weekends, depending
on how it goes.

>> On Sun, Feb 18, 2018 at 6:35 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>> 
>> On 2018-Feb-17, at 6:10 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>> 
>> > [Some more information added, from /usr/libexec/kgdb use.]
>> >
>> > On 2018-Feb-17, at 5:39 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>> >
>> >> This is for FreeBSD running under Hyper-V on a Windows 10 Pro machine.
>> >> The FreeBSD "disk" bindings are to SSDs, not the insides of NTFS files.
>> >> 29 logical processors assigned to FreeBSD (on a 32-thread Ryzen
>> >> Threadripper 1950X). No other Hyper-V use.
>> 
>> Trond's report seems to be for a "4 core" Intel i7 context (as seen
>> by FreeBSD in VirtualBox). So Ryzen seems to be non-essential for
>> reproduction.
>> 
>> Both of our reports are from some form of using FreeBSD in a virtual
>> machine (Hyper-V and VirtualBox). I do not know if that is a required
>> type of context or not.
>> 
>> >> This happened during:
>> >>
>> >> # ~/sys_build_scripts.amd64-host/make_powerpc64vtsc_nodebug_clang_altbinutils-amd64-host.sh check-old DESTDIR=/usr/obj/DESTDIRs/clang-powerpc64-installworld_altbinutils
>> >> Script started, output file is /root/sys_typescripts/typescript_make_powerpc64vtsc_nodebug_clang_altbinutils-amd64-host-2018-02-17:15:56:20
>> >> >>> Checking for old files
>> >>
>> 
>> I got another example but during a buildworld:
>> 
>> >>> Deleting stale files in build tree...
>> cd /usr/src; MACHINE_ARCH=powerpc64  MACHINE=powerpc  CPUTYPE= BUILD_TOOLS_META=.NOMETA CC="cc -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/" CXX="c++  -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/"  CPP="cpp -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/"  AS="/usr/local/powerpc64-unknown-freebsd12.0/bin/as" AR="/usr/local/powerpc64-unknown-freebsd12.0/bin/ar" LD="/usr/local/powerpc64-unknown-freebsd12.0/bin/ld" LLVM_LINK=""  NM=/usr/local/powerpc64-unknown-freebsd12.0/bin/nm OBJCOPY="/usr/local/powerpc64-unknown-freebsd12.0/bin/objcopy"  RANLIB=/usr/local/powerpc64-unknown-freebsd12.0/bin/ranlib STRINGS=/usr/local/bin/powerpc64-unknown-freebsd12.0-strings  SIZE="/usr/local/powerpc64-unknown-freebsd12.0/bin/size"  INSTALL="sh /usr/src/tools/install.sh"  PATH=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/usr/sbin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/usr/bin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/bin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/sbin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin  SYSROOT=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp make  -f Makefile.inc1  BWPHASE=worldtmp  DESTDIR=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -DBATCH_DELETE_OLD_FILES  delete-old delete-old-libs >/dev/null
>> 
>> load: 0.68  cmd: make 62180 [select] 25.15r 0.00u 0.00s 0% 1468k
>> make: Working in: /usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64
>> packet_write_wait: Connection to 192.168.1.165 port 22: Broken pipe
>> 
>> 
>> (I noticed the long pause and got the ^T in before the panic.)
>> 
>> Yet again it is xargs-related fork activity that hits the problem (from core.txt.1 ):
>> 
>>   561 Thread 100836 (PID=69982: xargs)  fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:840
>> . . .
>> * 559 Thread 100811 (PID=62304: xargs)  doadump (textdump=-2122191464) at pcpu.h:230
>> 
>> spin lock 0xffffffff81b3cf00 (sched lock 24) held by 0xfffff806aa6d5000 (tid 100836) too long
>> panic: spin lock held too long
>> cpuid = 24
>> time = 1518974055
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f11304d0
>> vpanic() at vpanic+0x18d/frame 0xfffffe00f1130530
>> panic() at panic+0x43/frame 0xfffffe00f1130590
>> _mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x71/frame 0xfffffe00f11305a0
>> thread_lock_flags_() at thread_lock_flags_+0xdb/frame 0xfffffe00f1130610
>> statclock_cnt() at statclock_cnt+0xdc/frame 0xfffffe00f1130650
>> handleevents() at handleevents+0x113/frame 0xfffffe00f11306a0
>> timercb() at timercb+0xa9/frame 0xfffffe00f11306f0
>> lapic_handle_timer() at lapic_handle_timer+0xa7/frame 0xfffffe00f1130730
>> timerint_u() at timerint_u+0x96/frame 0xfffffe00f1130810
>> thread_lock_flags_() at thread_lock_flags_+0xc1/frame 0xfffffe00f1130880
>> fork1() at fork1+0x1b9f/frame 0xfffffe00f1130930
>> sys_vfork() at sys_vfork+0x4c/frame 0xfffffe00f1130980
>> amd64_syscall() at amd64_syscall+0xa48/frame 0xfffffe00f1130ab0
>> fast_syscall_common() at fast_syscall_common+0x101/frame 0x7fffffffc5a0
>> 
> 


===
Mark Millard
marklmi at yahoo.com
( markmi at dsl-only.net is
going away in 2018-Feb, late)
Received on Sun Feb 18 2018 - 19:08:26 UTC