ddb(4) spoils kernel stack in CURRENT?

From: Dmitry Pryanishnikov <dmitry_at_atlantis.dp.ua>
Date: Tue, 19 Dec 2006 19:37:17 +0200 (EET)
Hello!

   I'm facing a strange debugging problem under a fresh (19-Dec-2006)
CURRENT on a uniprocessor i386 machine. My kernel config is:

ident           LYNX
machine         i386
makeoptions     DEBUG=-g
options         INCLUDE_CONFIG_FILE
cpu             I686_CPU
options         SCHED_4BSD
options         ADAPTIVE_GIANT
options         PREEMPTION
device          apic
options         COMPAT_43
options         COMPAT_43TTY
options         COMPAT_FREEBSD4
options         COMPAT_FREEBSD5
options         COMPAT_FREEBSD6
options         SYSVSHM
options         SYSVSEM
options         SYSVMSG
options         KDB
options         KDB_TRACE
options         DDB
options         DDB_NUMSYM
options         SYSCTL_DEBUG
options         KTRACE
options         KTRACE_REQUEST_POOL=101
options         INVARIANTS
options         INVARIANT_SUPPORT
options         INET
options         FAST_IPSEC
options         IPSEC_FILTERGIF
device          ether
device          loop
device          bpf
device          ppp
options         PPP_BSDCOMP
options         PPP_DEFLATE
options         PPP_FILTER
options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_VERBOSE_LIMIT=100
options         IPFIREWALL_FORWARD
options         IPSTEALTH
options         FFS
options         SOFTUPDATES
options         UFS_EXTATTR
options         UFS_ACL
options         UFS_DIRHASH
options         QUOTA
options         _KPOSIX_PRIORITY_SCHEDULING
device          pty
device          crypto
device          pci
device          atkbdc
device          atkbd
device          psm
options         KBD_INSTALL_CDEV
device          vga
device          splash
device          sc
options         SC_HISTORY_SIZE=5000
options         SC_TWOBUTTON_MOUSE
device          ata
device          atadisk
options         ATA_STATIC_ID
device          scbus
device          da
device          cd
device          pass

To demonstrate the problem, I forced an artificial panic from single-user
mode with the 'sysctl debug.kdb.panic=1' command and then typed 'panic' at
the ddb prompt. Running kgdb against the resulting kernel dump fails to print
a complete backtrace:

(kgdb) where
#0  doadump () at pcpu.h:166
#1  0xc04aad4c in boot (howto=260)
     at /mnt3/usr/CURRENT/src/sys/kern/kern_shutdown.c:411
#2  0xc04aaff7 in panic (fmt=0xc05ffbbf "from debugger")
     at /mnt3/usr/CURRENT/src/sys/kern/kern_shutdown.c:567
#3  0xc044238e in db_panic (addr=-1068723113, have_addr=0, count=-1,
     modif=0xe4b4795c "") at /mnt3/usr/CURRENT/src/sys/ddb/db_command.c:433
#4  0xc0442327 in db_command (last_cmdp=0xc0660a04, cmd_table=0x0)
     at /mnt3/usr/CURRENT/src/sys/ddb/db_command.c:401
#5  0xc04423e2 in db_command_loop ()
     at /mnt3/usr/CURRENT/src/sys/ddb/db_command.c:453
#6  0xc044402d in db_trap (type=3, code=0)
     at /mnt3/usr/CURRENT/src/sys/ddb/db_main.c:222
#7  0xc04c96d1 in kdb_trap (type=3, code=0, tf=0x0)
     at /mnt3/usr/CURRENT/src/sys/kern/subr_kdb.c:502
#8  0xc05ddfc1 in trap (frame=0xe4b47aec)
     at /mnt3/usr/CURRENT/src/sys/i386/i386/trap.c:621
#9  0xc05ca84b in calltrap ()
     at /mnt3/usr/CURRENT/src/sys/i386/i386/exception.s:139
#10 0x00000000 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

However, if I turn ddb off with 'sysctl debug.debugger_on_panic=0' and
then obtain a kernel dump via 'sysctl debug.kdb.panic=1', kgdb unwinds the
stack completely from the resulting dump:

(kgdb) where
#0  doadump () at pcpu.h:166
#1  0xc04aad4c in boot (howto=260)
     at /mnt3/usr/CURRENT/src/sys/kern/kern_shutdown.c:411
#2  0xc04aaff7 in panic (fmt=0xc060d1bd "kdb_sysctl_panic")
     at /mnt3/usr/CURRENT/src/sys/kern/kern_shutdown.c:567
#3  0xc04c929e in kdb_sysctl_panic (oidp=0xc06435c0, arg1=0x0, arg2=0,
     req=0xe4b78ba4) at /mnt3/usr/CURRENT/src/sys/kern/subr_kdb.c:182
#4  0xc04b3073 in sysctl_root (oidp=0x0, arg1=0x0, arg2=0, req=0xe4b78ba4)
     at /mnt3/usr/CURRENT/src/sys/kern/kern_sysctl.c:1282
#5  0xc04b3244 in userland_sysctl (td=0x0, name=0xe4b78c24, namelen=3,
     old=0xe4b78ba4, oldlenp=0x0, inkernel=0, new=0xbfbfe4bc, newlen=0,
     retval=0xe4b78c20, flags=0)
     at /mnt3/usr/CURRENT/src/sys/kern/kern_sysctl.c:1381
#6  0xc04b30fb in __sysctl (td=0xc3cf91b0, uap=0xe4b78d00)
     at /mnt3/usr/CURRENT/src/sys/kern/kern_sysctl.c:1316
#7  0xc05de786 in syscall (frame=0xe4b78d38)
     at /mnt3/usr/CURRENT/src/sys/i386/i386/trap.c:1008
#8  0xc05ca8b0 in Xint0x80_syscall ()
     at /mnt3/usr/CURRENT/src/sys/i386/i386/exception.s:196
#9  0x48133747 in ?? ()
Previous frame inner to this frame (corrupt stack?)
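
For reference, the two procedures above can be written down as a small
dry-run script. This is only a sketch: the function name repro_steps and the
scenario labels "ddb" and "noddb" are made up for illustration, and the
script prints the commands instead of running them, since running them
panics the machine.

```shell
#!/bin/sh
# Dry-run helper: prints the steps for each scenario rather than executing
# them. The name repro_steps and the labels ddb/noddb are hypothetical.
repro_steps() {
    case "$1" in
    ddb)
        # Panic with ddb enabled, then type 'panic' at the ddb prompt.
        echo "sysctl debug.kdb.panic=1"
        echo "db> panic"
        ;;
    noddb)
        # Disable entering ddb on panic, then force the panic.
        echo "sysctl debug.debugger_on_panic=0"
        echo "sysctl debug.kdb.panic=1"
        ;;
    *)
        echo "usage: repro_steps ddb|noddb" >&2
        return 1
        ;;
    esac
}
```

Calling 'repro_steps ddb' lists the steps that lead to the truncated
backtrace, and 'repro_steps noddb' lists the steps that lead to the one
reaching the syscall entry.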

I tried to reproduce this under RELENG_6 as of 30-Oct (removing only the
kernel options COMPAT_43TTY, COMPAT_FREEBSD6, INVARIANTS and INVARIANT_SUPPORT,
and using 'debug.kdb.enter' instead of 'debug.kdb.panic'). There kgdb unwinds
the stack even when the dump is obtained by typing 'panic' from ddb:

(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc049eb46 in boot (howto=260)
     at /usr/RELENG_6/src/sys/kern/kern_shutdown.c:409
#2  0xc049ee0c in panic (fmt=0xc05fabc0 "from debugger")
     at /usr/RELENG_6/src/sys/kern/kern_shutdown.c:565
#3  0xc043fcd5 in db_panic (addr=-1068795853, have_addr=0, count=-1,
     modif=0xe526ba2c "") at /usr/RELENG_6/src/sys/ddb/db_command.c:438
#4  0xc043fc6c in db_command (last_cmdp=0xc06483a4, cmd_table=0x0,
     aux_cmd_tablep=0xc061ac3c, aux_cmd_tablep_end=0xc061ac40)
     at /usr/RELENG_6/src/sys/ddb/db_command.c:350
#5  0xc043fd34 in db_command_loop ()
     at /usr/RELENG_6/src/sys/ddb/db_command.c:458
#6  0xc0441949 in db_trap (type=3, code=0)
     at /usr/RELENG_6/src/sys/ddb/db_main.c:222
#7  0xc04b7aaf in kdb_trap (type=3, code=0, tf=0xe526bb6c)
     at /usr/RELENG_6/src/sys/kern/subr_kdb.c:473
#8  0xc05da99c in trap (frame=
       {tf_fs = -450494456, tf_es = -1068826584, tf_ds = -1067450328, tf_edi =
    -450446332, tf_esi = 0, tf_ebp = -450446420, tf_isp = -450446440, tf_ebx =
    -450446332, tf_edx = 0, tf_ecx = -1056878592, tf_eax = 35, tf_trapno = 3,
    tf_err = 0, tf_eip = -1068795853, tf_cs = 32, tf_eflags = 646, tf_esp =
    -450446400, tf_ss =
    -1068796128}) at /usr/RELENG_6/src/sys/i386/i386/trap.c:594
#9  0xc05c915a in calltrap ()
     at /usr/RELENG_6/src/sys/i386/i386/exception.s:139
#10 0xc04b7833 in kdb_enter (msg=0x23 <Address 0x23 out of bounds>)
     at cpufunc.h:60
#11 0xc04b7720 in kdb_sysctl_enter (oidp=0xc062cf60, arg1=0x0, arg2=0,
     req=0xe526bc04) at /usr/RELENG_6/src/sys/kern/subr_kdb.c:175
#12 0xc04a6bb3 in sysctl_root (oidp=0x0, arg1=0x0, arg2=0, req=0xe526bc04)
     at /usr/RELENG_6/src/sys/kern/kern_sysctl.c:1281
#13 0xc04a6db0 in userland_sysctl (td=0x23, name=0xe526bc74, namelen=3,
     old=0xe526bc04, oldlenp=0x0, inkernel=0, new=0xbfbfe4ac, newlen=35,
     retval=0xe526bc70, flags=35)
     at /usr/RELENG_6/src/sys/kern/kern_sysctl.c:1380
#14 0xc04a6c53 in __sysctl (td=0xc4d0ea80, uap=0xe526bd04)
     at /usr/RELENG_6/src/sys/kern/kern_sysctl.c:1315
#15 0xc05db1f3 in syscall (frame=
       {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 3, tf_esi = 0, tf_ebp =
    -1077943272, tf_isp = -450445980, tf_ebx = 1209310688, tf_edx = 0, tf_ecx =
    -1077941056, tf_eax = 202, tf_trapno = 12, tf_err = 2, tf_eip =
    1209155743, tf_cs = 51, tf_eflags = 658, tf_esp = -1077943332, tf_ss = 59})
     at /usr/RELENG_6/src/sys/i386/i386/trap.c:983
#16 0xc05c91af in Xint0x80_syscall ()
     at /usr/RELENG_6/src/sys/i386/i386/exception.s:200
#17 0x00000033 in ?? ()

So it looks like a regression in CURRENT vs RELENG_6 (either ddb 'spoils' the 
stack somehow, or kgdb fails to unwind it).
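
One way to compare such listings frame by frame is to reduce each 'where'
output to the bare function names and diff the results. A minimal sketch,
assuming the kgdb output has been saved to text files; the function name
frames and the file names in the comments are hypothetical:

```shell
#!/bin/sh
# Print one function name per frame from a saved kgdb 'where' listing.
# Frame lines look like "#1  0xc04aad4c in boot (howto=260)" or
# "#0  doadump () at pcpu.h:166"; continuation lines are skipped.
frames() {
    awk '/^#[0-9]+ / {
        # With an address present the name follows "in", otherwise it is
        # the second field.
        name = ($3 == "in") ? $4 : $2
        print name
    }'
}

# Possible usage (file names are placeholders):
#   frames < current-where.txt > current.frames
#   frames < releng6-where.txt > releng6.frames
#   diff current.frames releng6.frames
```

On the listings above, this makes it easy to see that the CURRENT trace
stops right after calltrap, while the RELENG_6 trace continues into
kdb_enter and the sysctl path.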


Sincerely, Dmitry
-- 
Atlantis ISP, System Administrator
e-mail:  dmitry_at_atlantis.dp.ua
nic-hdl: LYNX-RIPE
Received on Tue Dec 19 2006 - 16:52:37 UTC
