Re: panic on application core dump?

From: Sean Bruno <sbruno_at_ignoranthack.me>
Date: Sun, 22 Feb 2015 10:46:53 -0800
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 02/22/15 10:04, Konstantin Belousov wrote:
> On Sun, Feb 22, 2015 at 09:34:29AM -0800, Sean Bruno wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
>> 
>>> Err.  Is it easily reproducable in your setup ? The core file
>>> vnode is indeed unreferenced before notification is sent.
>>> 
>>> Try this.
>>> 
>>> diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c index 
>>> 41da3dd..57f66b0 100644
>> 
>> Restarted my non-deterministic test case.  Three instances of
>> qemu core dumped and the system did *not* panic.
>> 
>> However, this appears to be interfering with signal handling and 
>> reaping.  Applications seems to stall out and become 
>> unkillable/unreapable.  I have to reboot the system via
>> panic/reset.
>> 
> What applications ?  What is the (kernel) backtrace for the ?
> 

Most of the waiting applications are shells scripts:

root    40727    0.0  0.0 17180 5276  1  I+    9:20AM    0:00.18
/bin/sh /usr/local/share/poudriere/bulk.sh -j 11mips64 -p 11mips32 -ac

db> trace 47027
Tracing pid 47027 tid 100835 td 0xfffff80018cae4a0
sched_switch() at sched_switch+0x326/frame 0xfffffe046a1446f0
mi_switch() at mi_switch+0xde/frame 0xfffffe046a144730
sleepq_catch_signals() at sleepq_catch_signals+0xab/frame
0xfffffe046a1447b0
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfffffe046a1447e0
_cv_wait_sig() at _cv_wait_sig+0x1b0/frame 0xfffffe046a144830
seltdwait() at seltdwait+0x104/frame 0xfffffe046a144880
kern_select() at kern_select+0x963/frame 0xfffffe046a144a90
sys_select() at sys_select+0x54/frame 0xfffffe046a144ad0
amd64_syscall() at amd64_syscall+0x3e7/frame 0xfffffe046a144bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe046a144bf0

>> 1288 root          1  20    0 17180K  5336K zfs    14   1:13
>> 0.00% sh
> Is this the problem ^^^^ ?
> 
> If yes, do you have ddb compiled in ?  Show the output of 'show
> lockedvnods' and 'show alllocks'.

Hmm ... looks unrelated to signals (maybe).  This looks like a common
ZFS deadlock that is yet undiagnosed.  I do not have a show alllocks
command available in db> .  I will show each lock information below:

db> show lockedvnods
Locked vnodes

0xfffff801141a6588: tag zfs, type VDIR
    usecount 19, writecount 0, refcount 20 mountedhere 0
    flags (VV_ROOT|VI_ACTIVE)
    v_object 0xfffff80079be4500 ref 0 pages 0 cleanbuf 0 dirtybuf 0
    lock type zfs: EXCL by thread 0xfffff801ca10c4a0 (pid 75907, sh,
tid 101262)
 with exclusive waiters pending

0xfffff800184c8b10: tag zfs, type VDIR
    usecount 1, writecount 0, refcount 3 mountedhere 0
    flags (VI_ACTIVE)
    v_object 0xfffff80355409300 ref 0 pages 0 cleanbuf 0 dirtybuf 0
    lock type zfs: EXCL by thread 0xfffff8001863d000 (pid 94699, rm,
tid 100930)
 with exclusive waiters pending

0xfffff80404d47b10: tag zfs, type VDIR
    usecount 1, writecount 0, refcount 4 mountedhere 0
    flags (VI_ACTIVE)
    lock type zfs: EXCL by thread 0xfffff80013b29000 (pid 94698, rm,
tid 100772)
 with exclusive waiters pending

0xfffff802ec2b5000: tag zfs, type VDIR
    usecount 3, writecount 0, refcount 4 mountedhere 0
    flags (VI_ACTIVE)
    lock type zfs: EXCL by thread 0xfffff801ca106940 (pid 94700, mv,
tid 101021)
 with exclusive waiters pending

db> show lock 0xfffff801141a6588
 class: sleep mutex
 name: zfs
 flags: {DEF, DUPOK}
 state: {OWNED}
KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfffffe0456b0e050
kdb_reenter() at kdb_reenter+0x33/frame 0xfffffe0456b0e060
trap() at trap+0x54/frame 0xfffffe0456b0e270
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e270
- --- trap 0xc, rip = 0xffffffff809b2c37, rsp = 0xfffffe0456b0e330, rbp
= 0xfffffe0456b0e350 ---
db_show_mtx() at db_show_mtx+0x127/frame 0xfffffe0456b0e350
db_command() at db_command+0x27c/frame 0xfffffe0456b0e410
db_command_loop() at db_command_loop+0x64/frame 0xfffffe0456b0e420
db_trap() at db_trap+0xe0/frame 0xfffffe0456b0e4b0
kdb_trap() at kdb_trap+0x18e/frame 0xfffffe0456b0e540
trap() at trap+0x447/frame 0xfffffe0456b0e750
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e750
- --- trap 0x3, rip = 0xffffffff80a0ee87, rsp = 0xfffffe0456b0e810, rbp
= 0xfffffe0456b0e840 ---
kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame
0xfffffe0456b0e840
kdb_alt_break() at kdb_alt_break+0xb/frame 0xfffffe0456b0e850
uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfffffe0456b0e880
uart_intr() at uart_intr+0x111/frame 0xfffffe0456b0e8c0
intr_event_handle() at intr_event_handle+0x9b/frame 0xfffffe0456b0e910
intr_execute_handlers() at intr_execute_handlers+0x48/frame
0xfffffe0456b0e940
lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfffffe0456b0e960
Xapic_isr1() at Xapic_isr1+0xba/frame 0xfffffe0456b0e960
- --- interrupt, rip = 0xffffffff80384913, rsp = 0xfffffe0456b0ea20, rbp
= 0xfffffe0456b0ea70 ---
acpi_cpu_idle() at acpi_cpu_idle+0x2b3/frame 0xfffffe0456b0ea70
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe0456b0ea90
cpu_idle() at cpu_idle+0x90/frame 0xfffffe0456b0eab0
sched_idletd() at sched_idletd+0x40b/frame 0xfffffe0456b0ebb0
fork_exit() at fork_exit+0x9a/frame 0xfffffe0456b0ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0456b0ebf0
- --- trap 0, rip = 0, rsp = 0xfffffe0456b0ecb0, rbp = 0 ---
db> show lock 0xfffff800184c8b10
 class: sleep mutex
 name: zfs
 flags: {DEF, DUPOK}
 state: {OWNED}
KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfffffe0456b0e050
kdb_reenter() at kdb_reenter+0x33/frame 0xfffffe0456b0e060
trap() at trap+0x54/frame 0xfffffe0456b0e270
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e270
- --- trap 0xc, rip = 0xffffffff809b2c37, rsp = 0xfffffe0456b0e330, rbp
= 0xfffffe0456b0e350 ---
db_show_mtx() at db_show_mtx+0x127/frame 0xfffffe0456b0e350
db_command() at db_command+0x27c/frame 0xfffffe0456b0e410
db_command_loop() at db_command_loop+0x64/frame 0xfffffe0456b0e420
db_trap() at db_trap+0xe0/frame 0xfffffe0456b0e4b0
kdb_trap() at kdb_trap+0x18e/frame 0xfffffe0456b0e540
trap() at trap+0x447/frame 0xfffffe0456b0e750
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e750
- --- trap 0x3, rip = 0xffffffff80a0ee87, rsp = 0xfffffe0456b0e810, rbp
= 0xfffffe0456b0e840 ---
kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame
0xfffffe0456b0e840
kdb_alt_break() at kdb_alt_break+0xb/frame 0xfffffe0456b0e850
uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfffffe0456b0e880
uart_intr() at uart_intr+0x111/frame 0xfffffe0456b0e8c0
intr_event_handle() at intr_event_handle+0x9b/frame 0xfffffe0456b0e910
intr_execute_handlers() at intr_execute_handlers+0x48/frame
0xfffffe0456b0e940
lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfffffe0456b0e960
Xapic_isr1() at Xapic_isr1+0xba/frame 0xfffffe0456b0e960
- --- interrupt, rip = 0xffffffff80384913, rsp = 0xfffffe0456b0ea20, rbp
= 0xfffffe0456b0ea70 ---
acpi_cpu_idle() at acpi_cpu_idle+0x2b3/frame 0xfffffe0456b0ea70
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe0456b0ea90
cpu_idle() at cpu_idle+0x90/frame 0xfffffe0456b0eab0
sched_idletd() at sched_idletd+0x40b/frame 0xfffffe0456b0ebb0
fork_exit() at fork_exit+0x9a/frame 0xfffffe0456b0ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0456b0ebf0
- --- trap 0, rip = 0, rsp = 0xfffffe0456b0ecb0, rbp = 0 ---
db> show lock 0xfffff80404d47b10
 class: sleep mutex
 name: zfs
 flags: {DEF, DUPOK}
 state: {OWNED}
KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfffffe0456b0e050
kdb_reenter() at kdb_reenter+0x33/frame 0xfffffe0456b0e060
trap() at trap+0x54/frame 0xfffffe0456b0e270
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e270
- --- trap 0xc, rip = 0xffffffff809b2c37, rsp = 0xfffffe0456b0e330, rbp
= 0xfffffe0456b0e350 ---
db_show_mtx() at db_show_mtx+0x127/frame 0xfffffe0456b0e350
db_command() at db_command+0x27c/frame 0xfffffe0456b0e410
db_command_loop() at db_command_loop+0x64/frame 0xfffffe0456b0e420
db_trap() at db_trap+0xe0/frame 0xfffffe0456b0e4b0
kdb_trap() at kdb_trap+0x18e/frame 0xfffffe0456b0e540
trap() at trap+0x447/frame 0xfffffe0456b0e750
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e750
- --- trap 0x3, rip = 0xffffffff80a0ee87, rsp = 0xfffffe0456b0e810, rbp
= 0xfffffe0456b0e840 ---
kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame
0xfffffe0456b0e840
kdb_alt_break() at kdb_alt_break+0xb/frame 0xfffffe0456b0e850
uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfffffe0456b0e880
uart_intr() at uart_intr+0x111/frame 0xfffffe0456b0e8c0
intr_event_handle() at intr_event_handle+0x9b/frame 0xfffffe0456b0e910
intr_execute_handlers() at intr_execute_handlers+0x48/frame
0xfffffe0456b0e940
lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfffffe0456b0e960
Xapic_isr1() at Xapic_isr1+0xba/frame 0xfffffe0456b0e960
- --- interrupt, rip = 0xffffffff80384913, rsp = 0xfffffe0456b0ea20, rbp
= 0xfffffe0456b0ea70 ---
acpi_cpu_idle() at acpi_cpu_idle+0x2b3/frame 0xfffffe0456b0ea70
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe0456b0ea90
cpu_idle() at cpu_idle+0x90/frame 0xfffffe0456b0eab0
sched_idletd() at sched_idletd+0x40b/frame 0xfffffe0456b0ebb0
fork_exit() at fork_exit+0x9a/frame 0xfffffe0456b0ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0456b0ebf0
- --- trap 0, rip = 0, rsp = 0xfffffe0456b0ecb0, rbp = 0 ---
db> show lock 0xfffff802ec2b5000
 class: sleep mutex
 name: zfs
 flags: {DEF, DUPOK}
 state: {OWNED}
KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfffffe0456b0e050
kdb_reenter() at kdb_reenter+0x33/frame 0xfffffe0456b0e060
trap() at trap+0x54/frame 0xfffffe0456b0e270
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e270
- --- trap 0xc, rip = 0xffffffff809b2c37, rsp = 0xfffffe0456b0e330, rbp
= 0xfffffe0456b0e350 ---
db_show_mtx() at db_show_mtx+0x127/frame 0xfffffe0456b0e350
db_command() at db_command+0x27c/frame 0xfffffe0456b0e410
db_command_loop() at db_command_loop+0x64/frame 0xfffffe0456b0e420
db_trap() at db_trap+0xe0/frame 0xfffffe0456b0e4b0
kdb_trap() at kdb_trap+0x18e/frame 0xfffffe0456b0e540
trap() at trap+0x447/frame 0xfffffe0456b0e750
calltrap() at calltrap+0x8/frame 0xfffffe0456b0e750
- --- trap 0x3, rip = 0xffffffff80a0ee87, rsp = 0xfffffe0456b0e810, rbp
= 0xfffffe0456b0e840 ---
kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame
0xfffffe0456b0e840
kdb_alt_break() at kdb_alt_break+0xb/frame 0xfffffe0456b0e850
uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfffffe0456b0e880
uart_intr() at uart_intr+0x111/frame 0xfffffe0456b0e8c0
intr_event_handle() at intr_event_handle+0x9b/frame 0xfffffe0456b0e910
intr_execute_handlers() at intr_execute_handlers+0x48/frame
0xfffffe0456b0e940
lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfffffe0456b0e960
Xapic_isr1() at Xapic_isr1+0xba/frame 0xfffffe0456b0e960
- --- interrupt, rip = 0xffffffff80384913, rsp = 0xfffffe0456b0ea20, rbp
= 0xfffffe0456b0ea70 ---
acpi_cpu_idle() at acpi_cpu_idle+0x2b3/frame 0xfffffe0456b0ea70
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe0456b0ea90
cpu_idle() at cpu_idle+0x90/frame 0xfffffe0456b0eab0
sched_idletd() at sched_idletd+0x40b/frame 0xfffffe0456b0ebb0
fork_exit() at fork_exit+0x9a/frame 0xfffffe0456b0ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0456b0ebf0
- --- trap 0, rip = 0, rsp = 0xfffffe0456b0ecb0, rbp = 0 ---


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQF8BAEBCgBmBQJU6iQYXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5k/FoH/RmoZu8qhcXYbGgsrWhVerdb
5yuEwokj83FsP/JJU3ABa5cC+xkI06/BAkgn4jQjCJOfLR/1kwh1eRWSSBTcL3qj
kDF3thQN6GP/p2l3UNUCOY3YFMsxMfa5tguAvgPFVBjEKa8rnqOlvYUbMXbo5UoB
2xBP+FOV808fV7a3UuATOn1w6FFg0e9n44BWnj9blGA6d0wKkdwmAd6KgDckyH6R
LPPx9atnm/7xtQ4ZM2NMR4IbPUw8h38Yzb/RQhCPvUR+lbrSzPD3tlD57ATKWBAA
p70JVaMa/tZW9Faad2mDQCCmlxQYQ8oGdDqkuh8qtBCLHTKUiE5N6Tu+wXePR3g=
=gk/L
-----END PGP SIGNATURE-----
Received on Sun Feb 22 2015 - 17:46:57 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:55 UTC