Re: DTrace panic while probing syscall::open (and possibly many others)

From: Thomas Backman <serenity_at_exscape.org>
Date: Thu, 21 May 2009 18:53:56 +0200
On May 13, 2009, at 03:19 PM, Thomas Backman wrote:

> OK, so I first posted a thread on the forums about this in 7.2- 
> RELEASE:
> http://forums.freebsd.org/showthread.php?t=3834
> Then filed a PR, kern/134408:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=134408
>
> The very same bug remains in 8-CURRENT/amd64 as of May 13, ~10(am)
> GMT+2.
>
> Steps to reproduce:
> 1) Build DTrace capable kernel (I followed the wiki DTrace  
> instructions)
> 2) Reboot; kldload dtraceall
> 3) dtrace -n 'syscall::open:entry { self->path = arg0; }  
> syscall::open:return { printf("%s\n", copyinstr(self->path)); }'
> 4) Crash.
>
> Backtrace:
> [root_at_vmware /usr/obj/usr/src/sys/DTRACE]# kgdb kernel.debug /var/ 
> crash/vmcore.3
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and  
> you are
> welcome to change it and/or distribute copies of it under certain  
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for  
> details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> panic: from debugger
> cpuid = 0
> Uptime: 3m10s
> Physical memory: 368 MB
> Dumping 81 MB: 66 50 34 18 2
>
> Reading symbols from /boot/kernel/dtraceall.ko...Reading symbols  
> from /boot/kernel/dtraceall.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/dtraceall.ko
> Reading symbols from /boot/kernel/profile.ko...Reading symbols from / 
> boot/kernel/profile.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/profile.ko
> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols  
> from /boot/kernel/opensolaris.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/opensolaris.ko
> Reading symbols from /boot/kernel/cyclic.ko...Reading symbols from / 
> boot/kernel/cyclic.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/cyclic.ko
> Reading symbols from /boot/kernel/dtrace.ko...Reading symbols from / 
> boot/kernel/dtrace.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/dtrace.ko
> Reading symbols from /boot/kernel/systrace.ko...Reading symbols  
> from /boot/kernel/systrace.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/systrace.ko
> Reading symbols from /boot/kernel/sdt.ko...Reading symbols from / 
> boot/kernel/sdt.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/sdt.ko
> Reading symbols from /boot/kernel/fbt.ko...Reading symbols from / 
> boot/kernel/fbt.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/fbt.ko
> Reading symbols from /boot/kernel/dtnfsclient.ko...Reading symbols  
> from /boot/kernel/dtnfsclient.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/dtnfsclient.ko
> Reading symbols from /boot/kernel/dtmalloc.ko...Reading symbols  
> from /boot/kernel/dtmalloc.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/dtmalloc.ko
> #0  doadump () at pcpu.h:223
> 223		__asm __volatile("movq %%gs:0,%0" : "=r" (td));
> (kgdb) bt
> #0  doadump () at pcpu.h:223
> #1  0xffffffff80566b23 in boot (howto=260) at /usr/src/sys/kern/ 
> kern_shutdown.c:420
> #2  0xffffffff80566fac in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:576
> #3  0xffffffff801d3ef7 in db_panic (addr=Variable "addr" is not  
> available.
> ) at /usr/src/sys/ddb/db_command.c:478
> #4  0xffffffff801d43a1 in db_command (last_cmdp=0xffffffff80bd3620,  
> cmd_table=Variable "cmd_table" is not available.
> ) at /usr/src/sys/ddb/db_command.c:445
> #5  0xffffffff801d45f0 in db_command_loop () at /usr/src/sys/ddb/ 
> db_command.c:498
> #6  0xffffffff801d6599 in db_trap (type=Variable "type" is not  
> available.
> ) at /usr/src/sys/ddb/db_main.c:229
> #7  0xffffffff80597135 in kdb_trap (type=10, code=0,  
> tf=0xfffffffe4e64e450) at /usr/src/sys/kern/subr_kdb.c:534
> #8  0xffffffff80843f81 in trap (frame=0xfffffffe4e64e450) at /usr/ 
> src/sys/amd64/amd64/trap.c:606
> #9  0xffffffff8081edc7 in calltrap () at /usr/src/sys/amd64/amd64/ 
> exception.S:223
> #10 0xffffffff8123c128 in dtrace_panic (format=Variable "format" is  
> not available.
> )
>    at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/ 
> opensolaris/uts/common/dtrace/dtrace.c:601
> #11 0xffffffff8123c200 in dtrace_copycheck  
> (uaddr=18446744071581326184, kaddr=Variable "kaddr" is not available.
> ) at dtrace_isa.c:527
> #12 0xffffffff8123c2bc in dtrace_copyinstr (uaddr=34365395808,  
> kaddr=18446744066201920856, size=256,
>    flags=0xffffffff8122f120) at dtrace_isa.c:558
> #13 0xffffffff81249e84 in dtrace_dif_emulate  
> (difo=0xffffff00026a2d80, mstate=0xfffffffe4e64ea00,
>    vstate=0xffffff0002548838, state=0xffffff0002548800)
>    at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/ 
> opensolaris/uts/common/dtrace/dtrace.c:3446
> #14 0xffffffff8124b20a in dtrace_probe (id=Variable "id" is not  
> available.
> )
>    at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/ 
> opensolaris/uts/common/dtrace/dtrace.c:6220
> #15 0xffffffff8137b155 in systrace_probe () from /boot/kernel/ 
> systrace.ko
> #16 0xffffffff80843c4d in syscall (frame=0xfffffffe4e64ec90) at /usr/ 
> src/sys/amd64/amd64/trap.c:990
> #17 0xffffffff8081f050 in Xfast_syscall () at /usr/src/sys/amd64/ 
> amd64/exception.S:364
> #18 0x00000008005411fc in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> Hope this helps to fix this bug - I assume syscall::open isn't the  
> only probe
> affected as it's simply the very first one I tried.
>
> Same panic on two computers (a "real" one, A64 3200+, nForce4, 2GB  
> RAM;
> and a Macbook Pro C2D running VMware Fusion). Same panic in 7.2 and  
> 8.0.

OK, a follow-up on this one...
Let me preface it: WTF?

OK, now that's out of my system. Well, not really.
So, I'm not a programmer. I program mostly high-level languages (such  
as Objective-C/Cocoa for OS X, some python, perl etc) and have NO clue  
what I'm doing in kernel modules....

BUT... Being curious, I tried to see if I could add some printf  
statements to dtrace_copycheck() to see what it'd print before  
crashing. What happens?
Well, this is what I changed (the types may well be screwed up - as I  
said, I'm not a competent C programmer! Heck, I can see right now that  
they are):
In /sys/cddl/dev/dtrace/amd64/dtrace_isa.c:
static int
dtrace_copycheck(uintptr_t uaddr, uintptr_t kaddr, size_t size)
{
     printf("in dtrace_copycheck(), pre-ASSERT:\n");
     printf("kaddr = %u, kernelbase = %u, size = %d, kaddr+size = %u\n 
\n",
        (unsigned int)kaddr, (unsigned int)kernelbase, (unsigned  
int)size, (unsigned int)(kaddr+size));
     ASSERT(kaddr >= kernelbase && kaddr + size >= kaddr);

     if (uaddr + size >= kernelbase || uaddr + size < uaddr) {
     ...

So, I added two printf statements. What happens? IT BECOMES STABLE.  
How the hell does this happen? I figured "oh, someone must have fixed  
the bug!", commented my lines, unloaded the modules, recompiled, re- 
loaded "my" modules without printf... PANIC! I tried it again with the  
printf, and... it works. This deserves a big "WTF" in my opinion. Any  
clues as to what's going on here?

In the meantime, I added
printf("%s", "");
above the ASSERT line, which fixes the bug! I'm as confused as one can  
get here, could I get some insights as to why this works?! :-)
Does having something above the ASSERT line make the assertion not  
execute?

Regards,
Thomas
Received on Thu May 21 2009 - 14:54:10 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:48 UTC