Re: dblfault panic r277611

From: Dmitry Chagin <dchagin_at_freebsd.org>
Date: Sun, 25 Jan 2015 17:15:59 +0300
2015-01-25 15:06 GMT+03:00 Konstantin Belousov <kostikbel_at_gmail.com>:

> On Sat, Jan 24, 2015 at 10:42:45PM +0300, Chagin Dmitry wrote:
> > On Sat, Jan 24, 2015 at 12:35:19PM +0200, Konstantin Belousov wrote:
> > > On Sat, Jan 24, 2015 at 12:46:33PM +0300, Chagin Dmitry wrote:
> > > > Hi,
> > > >
> > > >
> > > > dchagin.static.corbina.net dumped core - see /var/crash/vmcore.7
> > > >
> > > > Sat Jan 24 01:02:20 MSK 2015
> > > >
> > > > FreeBSD dchagin.static.corbina.net 11.0-CURRENT FreeBSD
> 11.0-CURRENT #2 r277611+c41ef74(lemul): Sat Jan 24 00:53:45 MSK 2015
>  root_at_dchagin.static.corbina.net:/home/rootobj/home/git/head/sys/YOY
> amd64
> > > >
> > > > panic: double fault
> > > >
> > > > GNU gdb 6.1.1 [FreeBSD]
> > > > Copyright 2004 Free Software Foundation, Inc.
> > > > GDB is free software, covered by the GNU General Public License, and
> you are
> > > > welcome to change it and/or distribute copies of it under certain
> conditions.
> > > > Type "show copying" to see the conditions.
> > > > There is absolutely no warranty for GDB.  Type "show warranty" for
> details.
> > > > This GDB was configured as "amd64-marcel-freebsd"...
> > > >
> > > > Unread portion of the kernel message buffer:
> > > > ffs_syncvnode+0x3b0/frame 0xfffffe033c22bd50
> > > > ffs_truncate() at ffs_truncate+0xc6a/frame 0xfffffe033c22c150
> > > > ufs_direnter() at ufs_direnter+0xde5/frame 0xfffffe033c22c280
> > > > ufs_mkdir() at ufs_mkdir+0xb07/frame 0xfffffe033c22c4a0
> > > >
> > > > Fatal double fault
> > > > rip = 0xffffffff807a8d03
> > > > rsp = 0xfffffe033c228e60
> > > > rbp = 0xfffffe033c229000
> > > > cpuid = 5; apic id = 05
> > > > panic: double fault
> > > > cpuid = 5
> > > > KDB: enter: panic
> > > >
> > > > Reading symbols from /boot/kernel/if_tap.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/if_tap.ko.symbols
> > > > Reading symbols from /boot/kernel/if_bridge.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/if_bridge.ko.symbols
> > > > Reading symbols from /boot/kernel/bridgestp.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/bridgestp.ko.symbols
> > > > Reading symbols from /boot/kernel/usb.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/usb.ko.symbols
> > > > Reading symbols from /boot/kernel/xhci.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/xhci.ko.symbols
> > > > Reading symbols from /boot/kernel/vmm.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/vmm.ko.symbols
> > > > Reading symbols from /boot/kernel/nmdm.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/nmdm.ko.symbols
> > > > Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/linprocfs.ko.symbols
> > > > Reading symbols from /boot/kernel/pseudofs.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/pseudofs.ko.symbols
> > > > Reading symbols from /boot/kernel/linux_common.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/linux_common.ko.symbols
> > > > Reading symbols from /boot/kernel/procfs.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/procfs.ko.symbols
> > > > Reading symbols from /boot/kernel/ukbd.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/ukbd.ko.symbols
> > > > Reading symbols from /boot/kernel/ums.ko.symbols...done.
> > > > Loaded symbols for /boot/kernel/ums.ko.symbols
> > > > #0  doadump (textdump=771179792)
> > > >     at /home/git/head/sys/kern/kern_shutdown.c:262
> > > > 262               dumptid = curthread->td_tid;
> > > > (kgdb) #0  doadump (textdump=771179792)
> > > >     at /home/git/head/sys/kern/kern_shutdown.c:262
> > > > #1  0xffffffff803c1b08 in db_fncall_generic (addr=-2139713712,
> > > >     rv=0xfffffe032df744a0, nargs=0, args=0xfffffe032df744b0)
> > > >     at /home/git/head/sys/ddb/db_command.c:568
> > > > #2  0xffffffff803c17d3 in db_fncall (dummy1=-2185367173664, dummy2=0,
> > > >     dummy3=0, dummy4=0xfffffe032df745e0 "\n")
> > > >     at /home/git/head/sys/ddb/db_command.c:616
> > > > #3  0xffffffff803c142b in db_command (last_cmdp=0xffffffff810fd6a8,
> > > >     cmd_table=0x0, dopager=1) at
> /home/git/head/sys/ddb/db_command.c:440
> > > > #4  0xffffffff803c0f9d in db_command_loop ()
> > > >     at /home/git/head/sys/ddb/db_command.c:493
> > > > #5  0xffffffff803c58d9 in db_trap (type=3, code=0)
> > > >     at /home/git/head/sys/ddb/db_main.c:251
> > > > #6  0xffffffff807cc704 in kdb_trap (type=3, code=0,
> tf=0xfffffe032df74cc0)
> > > >     at /home/git/head/sys/kern/subr_kdb.c:654
> > > > #7  0xffffffff80c94e1d in trap (frame=0xfffffe032df74cc0)
> > > >     at /home/git/head/sys/amd64/amd64/trap.c:546
> > > > #8  0xffffffff80c9655f in trap_check (frame=0xfffffe032df74cc0)
> > > >     at /home/git/head/sys/amd64/amd64/trap.c:645
> > > > #9  0xffffffff80c691a2 in calltrap ()
> > > >     at /home/git/head/sys/amd64/amd64/exception.S:235
> > > > #10 0xffffffff807cbf15 in breakpoint () at cpufunc.h:63
> > > > #11 0xffffffff807cbaff in kdb_enter (why=0xffffffff80dcd635 "panic",
> > > >     msg=0xffffffff80dcd635 "panic") at
> /home/git/head/sys/kern/subr_kdb.c:443
> > > > #12 0xffffffff80769768 in vpanic (fmt=0xffffffff80e24597 "double
> fault",
> > > >     ap=0xfffffe032df74ec0) at
> /home/git/head/sys/kern/kern_shutdown.c:740
> > > > #13 0xffffffff80769820 in panic (fmt=0xffffffff80e24597 "double
> fault")
> > > >     at /home/git/head/sys/kern/kern_shutdown.c:676
> > > > #14 0xffffffff80c9667d in dblfault_handler (frame=0xfffffe032df74f40)
> > > >     at /home/git/head/sys/amd64/amd64/trap.c:912
> > > > #15 0xffffffff80c6929c in Xdblfault ()
> > > >     at /home/git/head/sys/amd64/amd64/exception.S:291
> > > > #16 0xffffffff807a8d03 in cpu_search_lowest (cg=Cannot access memory
> at address 0xfffffe033c228ec8
> > > > )
> > > >     at /home/git/head/sys/kern/sched_ule.c:764
> > > > #17 0xffffffff807a9094 in cpu_search_lowest (cg=0xffffffff8128a6e8,
> > > >     low=0xfffffe033c2292f8) at
> /home/git/head/sys/kern/sched_ule.c:690
> > > > #18 0xffffffff807a9094 in cpu_search_lowest (cg=0xffffffff8128a6b0,
> > > >     low=0xfffffe033c229380) at
> /home/git/head/sys/kern/sched_ule.c:690
> > > > #19 0xffffffff807b0f56 in sched_lowest (cg=0xffffffff8128a6b0, mask=
> > > >         {__bits = {255, 0, 0, 0}}, pri=121, maxload=2147483647,
> prefer=5)
> > > >     at /home/git/head/sys/kern/sched_ule.c:796
> > > > #20 0xffffffff807abcdb in sched_pickcpu (td=0xfffff80009e5a9a0,
> flags=0)
> > > >     at /home/git/head/sys/kern/sched_ule.c:1276
> > > > #21 0xffffffff807ace35 in sched_add (td=0xfffff80009e5a9a0, flags=0)
> > > >     at /home/git/head/sys/kern/sched_ule.c:2395
> > > > #22 0xffffffff807acac9 in sched_wakeup (td=0xfffff80009e5a9a0)
> > > >     at /home/git/head/sys/kern/sched_ule.c:2029
> > > > #23 0xffffffff8077d6a8 in setrunnable (td=0xfffff80009e5a9a0)
> > > >     at /home/git/head/sys/kern/kern_synch.c:544
> > > > #24 0xffffffff807e4e98 in sleepq_resume_thread
> (sq=0xfffff80009e55d80,
> > > >     td=0xfffff80009e5a9a0, pri=0)
> > > >     at /home/git/head/sys/kern/subr_sleepqueue.c:776
> > > > #25 0xffffffff807e306a in sleepq_timeout (arg=0xfffff80009e5a9a0)
> > > >     at /home/git/head/sys/kern/subr_sleepqueue.c:915
> > > > #26 0xffffffff80791b40 in softclock_call_cc (c=0xfffff80009e5ad38,
> > > >     cc=0xffffffff813a4200, direct=1)
> > > >     at /home/git/head/sys/kern/kern_timeout.c:724
> > > > #27 0xffffffff807913bd in callout_process (now=740683739317)
> > > >     at /home/git/head/sys/kern/kern_timeout.c:499
> > > > #28 0xffffffff80ce346a in handleevents (now=740683739317, fake=0)
> > > >     at /home/git/head/sys/kern/kern_clocksource.c:212
> > > > #29 0xffffffff80ce3fd6 in timercb (et=0xffffffff8137df68, arg=0x0)
> > > >     at /home/git/head/sys/kern/kern_clocksource.c:345
> > > > #30 0xffffffff80d376e3 in lapic_handle_timer
> (frame=0xfffffe033c229c50)
> > > >     at /home/git/head/sys/x86/x86/local_apic.c:883
> > > > #31 0xffffffff80c69cfc in Xtimerint () at apic_vector.S:109
> > > > #32 0xffffffff80c745ef in write_rflags (rf=642) at cpufunc.h:382
> > > > #33 0xffffffff80c6f225 in intr_restore (rflags=642) at cpufunc.h:775
> > > > #34 0xffffffff80c71ce8 in spinlock_exit ()
> > > >     at /home/git/head/sys/amd64/amd64/machdep.c:2177
> > > > #35 0xffffffff8074335c in __mtx_unlock_spin_flags
> (c=0xffffffff8119ec80,
> > > >     opts=0, file=0xffffffff80dc3d2b
> "/home/git/head/sys/kern/kern_cons.c",
> > > >     line=530) at /home/git/head/sys/kern/kern_mutex.c:305
> > > > #36 0xffffffff806df9fc in cnputs (p=0xfffffe033c22a402 "\"<\003ЧЪЪ")
> > > >     at /home/git/head/sys/kern/kern_cons.c:530
> > > > #37 0xffffffff807d76ae in putbuf (c=10, ap=0xfffffe033c22a3b8)
> > > >     at /home/git/head/sys/kern/subr_prf.c:427
> > > > #38 0xffffffff807d60d6 in putchar (c=10, arg=0xfffffe033c22a3b8)
> > > >     at /home/git/head/sys/kern/subr_prf.c:471
> > > > #39 0xffffffff807d43e3 in kvprintf (fmt=0xffffffff80d77b33 "",
> > > >     func=0xffffffff807d6010 <putchar>, arg=0xfffffe033c22a3b8,
> radix=10,
> > > >     ap=0xfffffe033c22a510) at /home/git/head/sys/kern/subr_prf.c:720
> > > > #40 0xffffffff807d6569 in _vprintf (level=-1, flags=5,
> > > >     fmt=0xffffffff80d77b31 "%c", ap=0xfffffe033c22a510)
> > > >     at /home/git/head/sys/kern/subr_prf.c:271
> > > > #41 0xffffffff807d68dd in vprintf (fmt=0xffffffff80d77b31 "%c",
> > > >     ap=0xfffffe033c22a510) at /home/git/head/sys/kern/subr_prf.c:388
> > > > #42 0xffffffff807d689b in printf (fmt=0xffffffff80d77b31 "%c")
> > > >     at /home/git/head/sys/kern/subr_prf.c:377
> > > > #43 0xffffffff803c5d55 in db_putc (c=10)
> > > >     at /home/git/head/sys/ddb/db_output.c:156
> > > > #44 0xffffffff803c5b21 in db_putchar (c=10, arg=0xfffffe033c22aad8)
> > > >     at /home/git/head/sys/ddb/db_output.c:128
> > > > #45 0xffffffff807d3b65 in kvprintf (fmt=0xffffffff80d8090f "",
> > > >     func=0xffffffff803c5af0 <db_putchar>, arg=0xfffffe033c22aad8,
> radix=16,
> > > >     ap=0xfffffe033c22aac0) at /home/git/head/sys/kern/subr_prf.c:645
> > > > #46 0xffffffff803c5ad8 in db_printf (fmt=0xffffffff80d8090e "\n")
> > > >     at /home/git/head/sys/ddb/db_output.c:340
> > > > #47 0xffffffff80c67f73 in db_print_stack_entry (
> > > >     name=0xffffffff815c8262 "ufs_mkdir", narg=0, argnp=0x0,
> > > >     argp=0xfffffe033c22c4b0, callpc=18446744071574694567,
> > > >     frame=0xfffffe033c22c4a0) at
> /home/git/head/sys/amd64/amd64/db_trace.c:260
> > > > #48 0xffffffff80c66f3b in db_backtrace (td=0xfffff801ad926000,
> tf=0x0,
> > > >     frame=0xfffffe033c22c4a0, pc=18446744071574694567, count=1005)
> > > >     at /home/git/head/sys/amd64/amd64/db_trace.c:462
> > > > #49 0xffffffff80c66bdf in db_trace_self ()
> > > >     at /home/git/head/sys/amd64/amd64/db_trace.c:498
> > > > #50 0xffffffff803c568e in db_trace_self_wrapper ()
> > > >     at /home/git/head/sys/ddb/db_main.c:268
> > > > #51 0xffffffff807cbcd8 in kdb_backtrace ()
> > > >     at /home/git/head/sys/kern/subr_kdb.c:370
> > > > #52 0xffffffff807fe924 in _witness_debugger (cond=1,
> > > >     msg=0xffffffff80dd6e29 "witness_checkorder")
> > > >     at /home/git/head/sys/kern/subr_witness.c:2904
> > > > #53 0xffffffff807fe2de in witness_checkorder
> (lock=0xfffff80193effd50,
> > > >     flags=9, file=0xffffffff80ddfb99
> "/home/git/head/sys/kern/vfs_subr.c",
> > > >     line=2164, interlock=0xfffff80193effd80)
> > > >     at /home/git/head/sys/kern/subr_witness.c:1365
> > > > #54 0xffffffff80730d65 in __lockmgr_args (lk=0xfffff80193effd50,
> > > >     flags=524544, ilk=0xfffff80193effd80, wmesg=0x0, pri=0, timo=0,
> > > >     file=0xffffffff80ddfb99 "/home/git/head/sys/kern/vfs_subr.c",
> line=2164)
> > > >     at /home/git/head/sys/kern/kern_lock.c:756
> > > > #55 0xffffffff80bf1438 in _lockmgr_args (lk=0xfffff80193effd50,
> flags=524544,
> > > >     ilk=0xfffff80193effd80, wmesg=0x0, prio=0, timo=0,
> > > >     file=0xffffffff80ddfb99 "/home/git/head/sys/kern/vfs_subr.c",
> line=2164)
> > > >     at lockmgr.h:98
> > > > #56 0xffffffff80bef677 in ffs_lock (ap=0xfffffe033c22b7c8)
> > > >     at /home/git/head/sys/ufs/ffs/ffs_vnops.c:385
> > > > #57 0xffffffff80d47cd4 in VOP_LOCK1_APV (vop=0xffffffff810cd328,
> > > >     a=0xfffffe033c22b7c8) at vnode_if.c:2082
> > > > #58 0xffffffff808ac2f3 in VOP_LOCK1 (vp=0xfffff80193effce8,
> flags=524544,
> > > >     file=0xffffffff80ddfb99 "/home/git/head/sys/kern/vfs_subr.c",
> line=2164)
> > > >     at vnode_if.h:859
> > > > #59 0xffffffff808aa122 in _vn_lock (vp=0xfffff80193effce8,
> flags=524544,
> > > >     file=0xffffffff80ddfb99 "/home/git/head/sys/kern/vfs_subr.c",
> line=2164)
> > > >     at /home/git/head/sys/kern/vfs_vnops.c:1531
> > > > #60 0xffffffff8088d636 in vget (vp=0xfffff80193effce8, flags=524544,
> > > >     td=0xfffff801ad926000) at /home/git/head/sys/kern/vfs_subr.c:2164
> > > > #61 0xffffffff8087884f in vfs_hash_get (mp=0xfffff80009db2000,
> hash=71269052,
> > > >     flags=524288, td=0xfffff801ad926000, vpp=0xfffffe033c22bb40,
> fn=0,
> > > >     arg=0x0) at /home/git/head/sys/kern/vfs_hash.c:89
> > > > #62 0xffffffff80be7969 in ffs_vgetf (mp=0xfffff80009db2000,
> ino=71269052,
> > > >     flags=524288, vpp=0xfffffe033c22bb40, ffs_flags=1)
> > > >     at /home/git/head/sys/ufs/ffs/ffs_vfsops.c:1636
> > > > #63 0xffffffff80bd1d02 in flush_pagedep_deps (pvp=0xfffff80193c8d588,
> > > >     mp=0xfffff80009db2000, diraddhdp=0xfffff80193769b58)
> > > >     at /home/git/head/sys/ufs/ffs/ffs_softdep.c:12929
> > > > #64 0xffffffff80bd182c in softdep_sync_buf (vp=0xfffff80193c8d588,
> > > >     bp=0xfffffe02d6a8d6d0, waitfor=1)
> > > >     at /home/git/head/sys/ufs/ffs/ffs_softdep.c:12621
> > > > #65 0xffffffff80bf0d40 in ffs_syncvnode (vp=0xfffff80193c8d588,
> waitfor=1,
> > > >     flags=0) at /home/git/head/sys/ufs/ffs/ffs_vnops.c:280
> > > > #66 0xffffffff80babd9a in ffs_truncate (vp=0xfffff80193c8d588,
> length=512,
> > > >     flags=2176, cred=0xfffff80009c52b00)
> > > >     at /home/git/head/sys/ufs/ffs/ffs_inode.c:339
> > > > #67 0xffffffff80bfd315 in ufs_direnter (dvp=0xfffff80193c8d588,
> > > >     tvp=0xfffff80193effce8, dirp=0xfffffe033c22c390,
> cnp=0xfffffe033c22c720,
> > > >     newdirbp=0xfffffe02d66d7db0, isrename=0)
> > > >     at /home/git/head/sys/ufs/ufs/ufs_lookup.c:1133
> > > > #68 0xffffffff80c0aaa7 in ufs_mkdir (ap=0xfffffe033c22c558)
> > > >     at /home/git/head/sys/ufs/ufs/ufs_vnops.c:1963
> > > > #69 0xffffffff80d460fd in VOP_MKDIR_APV (vop=0xffffffff810cddd8,
> > > >     a=0xfffffe033c22c558) at vnode_if.c:1607
> > > > #70 0xffffffff808a5979 in VOP_MKDIR (dvp=0xfffff80193c8d588,
> > > >     vpp=0xfffffe033c22c6f8, cnp=0xfffffe033c22c720,
> vap=0xfffffe033c22c768)
> > > >     at vnode_if.h:665
> > > > #71 0xffffffff808a585c in kern_mkdirat (td=0xfffff801ad926000,
> fd=-100,
> > > >     path=0x7fffffffe949 <Address 0x7fffffffe949 out of bounds>,
> > > >     segflg=UIO_USERSPACE, mode=511)
> > > >     at /home/git/head/sys/kern/vfs_syscalls.c:3747
> > > > #72 0xffffffff808a54c3 in sys_mkdir (td=0xfffff801ad926000,
> > > >     uap=0xfffffe033c22ca58) at
> /home/git/head/sys/kern/vfs_syscalls.c:3678
> > > > #73 0xffffffff80c97044 in syscallenter (td=0xfffff801ad926000,
> > > >     sa=0xfffffe033c22ca48) at subr_syscall.c:133
> > > > #74 0xffffffff80c9694a in amd64_syscall (td=0xfffff801ad926000,
> traced=0)
> > > >     at /home/git/head/sys/amd64/amd64/trap.c:986
> > > > #75 0xffffffff80c6948b in Xfast_syscall ()
> > > >     at /home/git/head/sys/amd64/amd64/exception.S:395
> > > > #76 0x0000000800946eca in ?? ()
> > > > Previous frame inner to this frame (corrupt stack?)
> > > > Current language:  auto; currently minimal
> > > > (kgdb)
> > > >
> > > >
> > >
> > > This is fun, for some definition of it.
> > >
> > > The process was in the guts of VFS from mkdir(2) syscall, witness
> > > triggered printing of the warning for dreaded buf->hashdir->buf
> non-real
> > > LOR. From the ddb stack backtrace routine, when cnputs released the
> > > console spinlock yet another time, timer interrupt fired and started
> > > proceeding callouts. One of the callout triggered and needs to wake
> > > up a thread sleeping with timeout. There, inside the scheduler,
> > > cpu_search_lowest() was called, recursed twice and finally
> > > overflown the stack.
> > >
> > > Is this yet another clang regression ?  The cpu_search_lowest() saga
> seems
> > > to never end. r268211 is uneffective, probably after clang 3.5 import.
> >
> > yes, you are right. building kernel without SSP fixes the panic.
> >
> How did you ensured that it is fixed ?  There must be very specific
> circumstances: timer interrupt fired while in witness, callout activated,
> etc to trigger the panic.
>
> Or, do you mean that you looked at the assembly for the cpu_search() and
> see that it is no longer recursive ?
>

ouch, sorry. panic was easily reproducible (every buildworld)
I disassembled the kernel:

ffffffff807a9210 <cpu_search_lowest>:
ffffffff807a9210:       55                      push   %rbp
ffffffff807a9211:       48 89 e5                mov    %rsp,%rbp
ffffffff807a9214:       48 81 ec a0 01 00 00    sub    $0x1a0,%rsp
ffffffff807a921b:       48 8b 04 25 f0 d4 29    mov
 0xffffffff8129d4f0,%rax
ffffffff807a9222:       81
ffffffff807a9223:       48 89 45 f8             mov    %rax,-0x8(%rbp)
the panic was here ^^^^
ffffffff807a9227:       48 89 bd c8 fe ff ff    mov    %rdi,-0x138(%rbp)
ffffffff807a922e:       48 89 b5 c0 fe ff ff    mov    %rsi,-0x140(%rbp)
ffffffff807a9235:       48 8b 85 c8 fe ff ff    mov    -0x138(%rbp),%rax

it seem to me that failed instruction from SSP prologue.
Received on Sun Jan 25 2015 - 13:16:01 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:55 UTC