On Fri, 2014-06-06 at 10:12 -0400, Glen Barber wrote:
> Two machines in the cluster panicked last night with the same backtrace.
> It is unclear yet exactly what was happening on the systems, but both
> are port building machines using ports-mgmt/tinderbox.
>
> Any ideas or information on how to further debug this would be
> appreciated.
>

These machines were happily running r266621 prior to this update
yesterday. So, that gives us a bisection point.

sean

> Script started on Fri Jun 6 14:01:53 2014
> command: /bin/sh
> # uname -a
> FreeBSD redbuild04.nyi.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r267110: Thu Jun 5 15:57:43 UTC 2014 sbruno@redbuild04.nyi.freebsd.org:/usr/obj/usr/src/sys/REDBUILD amd64
> # kgdb ./kernel.debug /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> panic: deadlkres: possible deadlock detected on allproc_lock
>
> cpuid = 17
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1838702a20
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe1838702ad0
> panic() at panic+0x155/frame 0xfffffe1838702b50
> deadlkres() at deadlkres+0x42a/frame 0xfffffe1838702bb0
> fork_exit() at fork_exit+0x9a/frame 0xfffffe1838702bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1838702bf0
> --- trap 0, rip = 0, rsp = 0xfffffe1838702cb0, rbp = 0 ---
> KDB: enter: panic
>
> Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/zfs.ko.symbols
> Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> Loaded symbols for /boot/kernel/opensolaris.ko.symbols
> Reading symbols from /boot/kernel/ums.ko.symbols...done.
> Loaded symbols for /boot/kernel/ums.ko.symbols
> Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/linprocfs.ko.symbols
> Reading symbols from /boot/kernel/linux.ko.symbols...done.
> Loaded symbols for /boot/kernel/linux.ko.symbols
> #0  doadump (textdump=-946873840) at pcpu.h:219
> 219             __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) bt
> #0  doadump (textdump=-946873840) at pcpu.h:219
> #1  0xffffffff8034e865 in db_fncall (dummy1=<value optimized out>,
>     dummy2=<value optimized out>, dummy3=<value optimized out>,
>     dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:578
> #2  0xffffffff8034e54d in db_command (cmd_table=0x0)
>     at /usr/src/sys/ddb/db_command.c:449
> #3  0xffffffff8034e2c4 in db_command_loop ()
>     at /usr/src/sys/ddb/db_command.c:502
> #4  0xffffffff80350d20 in db_trap (type=<value optimized out>, code=0)
>     at /usr/src/sys/ddb/db_main.c:231
> #5  0xffffffff809a9bd9 in kdb_trap (type=3, code=0, tf=<value optimized out>)
>     at /usr/src/sys/kern/subr_kdb.c:656
> #6  0xffffffff80dc00e3 in trap (frame=0xfffffe1838702a00)
>     at /usr/src/sys/amd64/amd64/trap.c:551
> #7  0xffffffff80da29c2 in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff809a933e in kdb_enter (why=0xffffffff81039a72 "panic",
>     msg=<value optimized out>) at cpufunc.h:63
> #9  0xffffffff8096a8b5 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:749
> #10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> #11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 <deadlkres>,
>     arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
> ---Type <return> to continue, or q <return> to quit---
> #12 0xffffffff80da2efe in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:605
> #13 0x0000000000000000 in ?? ()
> Current language: auto; currently minimal
> (kgdb) fr 10
> #10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> 203                           panic("%s: possible deadlock detected on allproc_lock\n",
> (kgdb) l
> 198            * priority inversion problem leading to starvation.
> 199            * If the lock can't be held after 100 tries, panic.
> 200            */
> 201           if (!sx_try_slock(&allproc_lock)) {
> 202                   if (tryl > 100)
> 203                           panic("%s: possible deadlock detected on allproc_lock\n",
> 204                               __func__);
> 205                   tryl++;
> 206                   pause("allproc", sleepfreq * hz);
> 207                   continue;
> (kgdb) up
> #11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 <deadlkres>,
>     arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
> 977           callout(arg, frame);
> (kgdb) l
> 972            * cpu_set_fork_handler intercepts this function call to
> 973            * have this call a non-return function to stay in kernel mode.
> 974            * initproc has its own fork handler, but it does return.
> 975            */
> 976           KASSERT(callout != NULL, ("NULL callout in fork_exit"));
> 977           callout(arg, frame);
> 978
> 979           /*
> 980            * Check if a kernel thread misbehaved and returned from its main
> 981            * function.
> (kgdb) list *0xffffffff8090cd40
> 0xffffffff8090cd40 is in deadlkres (/usr/src/sys/kern/kern_clock.c:185).
> 180   static int blktime_threshold = 900;
> 181   static int sleepfreq = 3;
> 182
> 183   static void
> 184   deadlkres(void)
> 185   {
> 186           struct proc *p;
> 187           struct thread *td;
> 188           void *wchan;
> 189           int blkticks, i, slpticks, slptype, tryl, tticks;
> (kgdb) quit
> # ^D
> Script done on Fri Jun 6 14:03:30 2014
>
> Thanks.
>
> Glen
>

Received on Fri Jun 06 2014 - 12:23:52 UTC
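
For anyone reading the kern_clock.c listing in the quoted session: lines 201-207 are a bounded retry around a non-blocking lock attempt. The counter is checked before it is incremented, and with sleepfreq = 3 each failed try is followed by a pause of about three seconds, so the panic means the shared lock on allproc_lock could not be taken for roughly five minutes. The following is only a rough userspace sketch of that pattern, not the kernel code. The names acquire_with_limit, try_lock_fn, backoff_fn and countdown_try_lock are hypothetical and exist only for illustration, and the function returns -1 at the point where the kernel would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types for this sketch; not FreeBSD kernel APIs. */
typedef bool (*try_lock_fn)(void *lock);  /* true when the lock was taken */
typedef void (*backoff_fn)(void *ctx);    /* wait between tries; may be NULL */

/*
 * Retry try_lock until it succeeds. As in the quoted loop, the failure
 * counter is checked before it is incremented, so the caller gives up on
 * the failed attempt that finds the counter already above max_failures.
 * Returns the number of failed attempts before success, or -1 on giving
 * up (the point where the kernel code calls panic()).
 */
static int
acquire_with_limit(try_lock_fn try_lock, void *lock, int max_failures,
    backoff_fn backoff, void *ctx)
{
	int failures = 0;

	assert(try_lock != NULL);
	for (;;) {
		if (try_lock(lock))
			return (failures);
		if (failures > max_failures)
			return (-1);
		failures++;
		if (backoff != NULL)
			backoff(ctx);   /* stands in for pause("allproc", ...) */
	}
}

/* Toy lock for demonstration: fails until its countdown reaches zero. */
static bool
countdown_try_lock(void *lock)
{
	int *remaining = lock;

	if (*remaining > 0) {
		(*remaining)--;
		return (false);
	}
	return (true);
}
```

Calling acquire_with_limit(countdown_try_lock, &n, 100, NULL, NULL) with n = 3 succeeds after three failed attempts. A lock that never becomes free makes it give up on the 102nd failed attempt, which matches the "tryl > 100" check in the listing.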