Re: panic in deadlkres() on r267110

From: Sean Bruno <sbruno_at_ignoranthack.me>
Date: Fri, 06 Jun 2014 07:23:49 -0700

On Fri, 2014-06-06 at 10:12 -0400, Glen Barber wrote:
> Two machines in the cluster panicked last night with the same backtrace.
> It is unclear yet exactly what was happening on the systems, but both
> are port building machines using ports-mgmt/tinderbox.
> 
> Any ideas or information on how to further debug this would be
> appreciated.
> 
These machines were happily running r266621 prior to this update
yesterday.  So, that gives us a bisection point.
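For anyone wanting to act on that, here is a rough sketch of how the
bisection between r266621 (good) and r267110 (bad) might be planned.
The function names are hypothetical, and it assumes every revision
number in the range is worth testing, which overstates things since
many commits in that range will not touch the kernel at all.

```c
#include <assert.h>

/* Next revision to build and test: strictly between good and bad
 * whenever bad - good > 1. */
long bisect_midpoint(long good, long bad)
{
    assert(good < bad);
    return good + (bad - good) / 2;
}

/* Upper bound on the number of kernels to build, assuming the
 * larger half of the range always survives each test. */
int bisect_worst_case_steps(long good, long bad)
{
    int steps = 0;

    assert(good < bad);
    while (bad - good > 1) {
        long mid = bisect_midpoint(good, bad);

        if (bad - mid > mid - good)
            good = mid;     /* pretend mid tested good */
        else
            bad = mid;      /* pretend mid tested bad */
        steps++;
    }
    return steps;
}
```

With these two revisions the first kernel to try would be r266865, and
at most 9 builds would be needed to isolate the offending commit.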

sean


> Script started on Fri Jun  6 14:01:53 2014
> command: /bin/sh
> # uname -a
> FreeBSD redbuild04.nyi.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r267110: Thu Jun  5 15:57:43 UTC 2014     sbruno_at_redbuild04.nyi.freebsd.org:/usr/obj/usr/src/sys/REDBUILD  amd64
> # kgdb ./kernel.debug /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> panic: deadlkres: possible deadlock detected on allproc_lock
> 
> cpuid = 17
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1838702a20
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe1838702ad0
> panic() at panic+0x155/frame 0xfffffe1838702b50
> deadlkres() at deadlkres+0x42a/frame 0xfffffe1838702bb0
> fork_exit() at fork_exit+0x9a/frame 0xfffffe1838702bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1838702bf0
> --- trap 0, rip = 0, rsp = 0xfffffe1838702cb0, rbp = 0 ---
> KDB: enter: panic
> 
> Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/zfs.ko.symbols
> Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> Loaded symbols for /boot/kernel/opensolaris.ko.symbols
> Reading symbols from /boot/kernel/ums.ko.symbols...done.
> Loaded symbols for /boot/kernel/ums.ko.symbols
> Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/linprocfs.ko.symbols
> Reading symbols from /boot/kernel/linux.ko.symbols...done.
> Loaded symbols for /boot/kernel/linux.ko.symbols
> #0  doadump (textdump=-946873840) at pcpu.h:219
> 219             __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) bt
> #0  doadump (textdump=-946873840) at pcpu.h:219
> #1  0xffffffff8034e865 in db_fncall (dummy1=<value optimized out>, 
>     dummy2=<value optimized out>, dummy3=<value optimized out>, 
>     dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:578
> #2  0xffffffff8034e54d in db_command (cmd_table=0x0)
>     at /usr/src/sys/ddb/db_command.c:449
> #3  0xffffffff8034e2c4 in db_command_loop ()
>     at /usr/src/sys/ddb/db_command.c:502
> #4  0xffffffff80350d20 in db_trap (type=<value optimized out>, code=0)
>     at /usr/src/sys/ddb/db_main.c:231
> #5  0xffffffff809a9bd9 in kdb_trap (type=3, code=0, tf=<value optimized out>)
>     at /usr/src/sys/kern/subr_kdb.c:656
> #6  0xffffffff80dc00e3 in trap (frame=0xfffffe1838702a00)
>     at /usr/src/sys/amd64/amd64/trap.c:551
> #7  0xffffffff80da29c2 in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff809a933e in kdb_enter (why=0xffffffff81039a72 "panic", 
>     msg=<value optimized out>) at cpufunc.h:63
> #9  0xffffffff8096a8b5 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:749
> #10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> #11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 <deadlkres>, 
>     arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
> ---Type <return> to continue, or q <return> to quit---
> #12 0xffffffff80da2efe in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:605
> #13 0x0000000000000000 in ?? ()
> Current language:  auto; currently minimal
> (kgdb) fr 10
> #10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> 203                     panic("%s: possible deadlock detected on allproc_lock\n",
> (kgdb) l
> 198                      * priority inversion problem leading to starvation.
> 199                      * If the lock can't be held after 100 tries, panic.
> 200                      */
> 201                     if (!sx_try_slock(&allproc_lock)) {
> 202                             if (tryl > 100)
> 203                     panic("%s: possible deadlock detected on allproc_lock\n",
> 204                                         __func__);
> 205                             tryl++;
> 206                             pause("allproc", sleepfreq * hz);
> 207                             continue;
> (kgdb) up
> #11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 <deadlkres>, 
>     arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
> 977             callout(arg, frame);
> (kgdb) l
> 972              * cpu_set_fork_handler intercepts this function call to
> 973              * have this call a non-return function to stay in kernel mode.
> 974              * initproc has its own fork handler, but it does return.
> 975              */
> 976             KASSERT(callout != NULL, ("NULL callout in fork_exit"));
> 977             callout(arg, frame);
> 978     
> 979             /*
> 980              * Check if a kernel thread misbehaved and returned from its main
> 981              * function.
> (kgdb) list *0xffffffff8090cd40
> 0xffffffff8090cd40 is in deadlkres (/usr/src/sys/kern/kern_clock.c:185).
> 180     static int blktime_threshold = 900;
> 181     static int sleepfreq = 3;
> 182     
> 183     static void
> 184     deadlkres(void)
> 185     {
> 186             struct proc *p;
> 187             struct thread *td;
> 188             void *wchan;
> 189             int blkticks, i, slpticks, slptype, tryl, tticks;
> (kgdb) quit
> # ^D
> Script done on Fri Jun  6 14:03:30 2014
> 
> Thanks.
> 
> Glen
> 
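The retry logic in the kern_clock.c listing quoted above can be
modelled in userland roughly as follows. This is only a sketch: the
fake lock and the function names are made up for illustration, the
return value -1 stands in for the panic() call, and the
pause("allproc", sleepfreq * hz) between attempts is reduced to a
comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock that refuses a fixed number of try-lock calls. */
struct fake_lock {
    int failures_left;
};

bool fake_try_lock(void *ctx)
{
    struct fake_lock *l = ctx;

    if (l->failures_left > 0) {
        l->failures_left--;
        return false;
    }
    return true;
}

/* Same control flow as the quoted loop: returns the number of failed
 * attempts before the lock was taken, or -1 where the kernel would
 * have panicked. */
int acquire_with_retry_limit(bool (*try_lock)(void *), void *ctx,
    int max_tries)
{
    int tryl = 0;

    assert(try_lock != NULL);
    for (;;) {
        if (!try_lock(ctx)) {
            if (tryl > max_tries)
                return -1;      /* kernel: panic(...) */
            tryl++;
            /* kernel: pause("allproc", sleepfreq * hz); */
            continue;
        }
        return tryl;
    }
}
```

Because the check is tryl > 100 and the counter is only bumped after
the check, the panic fires on the 102nd consecutive failed attempt.
With sleepfreq = 3 from the listing, that is 101 pauses of 3 seconds,
so allproc_lock had been unavailable to this thread for about five
minutes.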
Received on Fri Jun 06 2014 - 12:23:52 UTC
