Re: Reproducable Panic on CURRENT and 6.0-RELEASE

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 19 Dec 2005 14:46:47 -0500
On Friday 16 December 2005 07:03 pm, Anish Mistry wrote:
> On Friday 16 December 2005 04:38 pm, you wrote:
> > On Friday 16 December 2005 03:27 pm, Anish Mistry wrote:
> > > On Friday 16 December 2005 03:11 pm, you wrote:
> > > > On Friday 16 December 2005 12:37 pm, Anish Mistry wrote:
> > > > > Here is the offending program/code.  The interesting program
> > > > > is avidemux_2.1_branch_anish/avidemux/avidemux2.
> > > > > (It is compiled for CURRENT, and I left all the object code
> > > > > stuff in so it's a bit large 21MB)
> > > > > http://am-productions.biz/docs/avidemux_2.1_branch_anish.tgz
> > > > >
> > > > > First you'll need to compile spidermonkey to be threadsafe so
> > > > > add the following to your lang/spidermonkey/Makefile before
> > > > > installing it: LIB_DEPENDS=    nspr4.1:${PORTSDIR}/devel/nspr
> > > > > MAKE_ARGS+=     JS_THREADSAFE=YES LDFLAGS="-L${LOCALBASE}/lib
> > > > > -lpthread -lm"
> > > > > CFLAGS+=        -I${LOCALBASE}/include/nspr
> > > > >
> > > > > Once a threadsafe spidermonkey is installed to kill the
> > > > > machine you'll need to:
> > > > > cd avidemux_2.1_branch_anish/avidemux
> > > > > ./avidemux2 --run new-features-test.js
> > > > >
> > > > > On CURRENT:
> > > > > kernel trap 12 with interrupts disabled
> > > > >
> > > > > Fatal trap 12: page fault while in kernel mode
> > > > > fault virtual address   = 0x68
> > > > > fault code              = supervisor read, page not present
> > > > > instruction pointer     = 0x20:0xc04e6f36
> > > > > stack pointer           = 0x28:0xcc9edb3c
> > > > > frame pointer           = 0x28:0xcc9edbb0
> > > > > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > > > >                         = DPL 0, pres 1, def32 1, gran 1
> > > > > processor eflags        = resume, IOPL = 0
> > > > > current process         = 798 (gdb)
> > > > > trap number             = 12
> > > > > panic: page fault
> > > > >
> > > > > #0  doadump () at pcpu.h:165
> > > > > #1  0xc04bb7eb in boot (howto=260)
> > > > > at /usr/src/sys/kern/kern_shutdown.c:399
> > > > > #2  0xc04bb353 in panic (fmt=0xc06069a7 "%s")
> > > > >     at /usr/src/sys/kern/kern_shutdown.c:555
> > > > > #3  0xc05e91ba in trap_fatal (frame=0xcc9edafc, eva=104)
> > > > >     at /usr/src/sys/i386/i386/trap.c:862
> > > > > #4  0xc05e96d9 in trap (frame=
> > > > >       {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi =
> > > > > -1032878460, tf_esi = 1, tf_ebp = -862004304, tf_isp =
> > > > > -862004440, tf_ebx = -1033297504, tf_edx = -1033987232,
> > > > > tf_ecx = 4, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip =
> > > > > -1068601546, tf_cs = 32, tf_eflags = 65687, tf_esp =
> > > > > -1032878356, tf_ss = -1067380424}) at
> > > > > /usr/src/sys/i386/i386/trap.c:273
> > > > > #5  0xc05db6fa in calltrap ()
> > > > > at /usr/src/sys/i386/i386/exception.s:137
> > > > > #6  0xc04e6f36 in kern_ptrace (td=0xc25e9b60, req=10, pid=1,
> > > > > addr=0x0, data=17)
> > > > >     at /usr/src/sys/kern/sys_process.c:802
> > > >
> > > > On HEAD this is:
> > > > 				p->p_xthread->td_flags &= ~TDF_XSIG;
> > > >
> > > > If two threads called kern_ptrace() with the same PID, this
> > > > could happen. Hmm, in fact I have no idea how p_xthread is
> > > > supposed to avoid being racy here. It would be helpful to know
> > > > what ptrace action it is trying to do, and maybe to have a KTR
> > > > trace of the various ptrace events leading up to this
> > > > condition. I also have no idea what thread you are supposed to
> > > > act on if p_xthread is NULL.
> > >
> > > How would I do this?  My kdb/ddb skills are pretty much limited
> > > to getting a backtrace.
> >
> > You could add some new KTR tracepoints to log each request into
> > kern_ptrace() and then do a 'show ktr' at the ddb prompt.
>
> I put a KTR_GEN tracepoint in kern_ptrace and only got 1 entry in the
> log:
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x68
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc04ed896
> stack pointer           = 0x28:0xcc9a9b3c
> frame pointer           = 0x28:0xcc9a9bb0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 697 (gdb)
> [thread pid 697 tid 100073 ]
> Stopped at      kern_ptrace+0xef6:      movl    0x68(%eax),%ebx
> db> show ktr
> 0 (0xc2354b60): kern_ptrace: td=0xc2354b60 req=0xa pid=695 addr==0x0
> data==0x0

Ok, so it's doing a PT_ATTACH on pid 695.
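
For reference, the mapping from the logged req value to a request name can be
sketched as a small lookup table. This is only an illustration: the numbers
are the ones I believe FreeBSD's <sys/ptrace.h> uses, only a subset is listed,
and the helper names ptrace_req_name() and ptrace_req_number() are
hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subset of ptrace request numbers (assumed to match <sys/ptrace.h>). */
struct req_name {
	int req;
	const char *name;
};

static const struct req_name req_names[] = {
	{ 0,  "PT_TRACE_ME" },
	{ 7,  "PT_CONTINUE" },
	{ 8,  "PT_KILL" },
	{ 9,  "PT_STEP" },
	{ 10, "PT_ATTACH" },
	{ 11, "PT_DETACH" },
};

#define NREQS (sizeof(req_names) / sizeof(req_names[0]))

/* Hypothetical helper: request number -> name, or "unknown". */
const char *
ptrace_req_name(int req)
{
	size_t i;

	for (i = 0; i < NREQS; i++)
		if (req_names[i].req == req)
			return (req_names[i].name);
	return ("unknown");
}

/* Hypothetical helper: request name -> number, or -1 if not listed. */
int
ptrace_req_number(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < NREQS; i++)
		if (strcmp(req_names[i].name, name) == 0)
			return (req_names[i].req);
	return (-1);
}
```

With this table, the req=0xa in the KTR entry looks up as PT_ATTACH, and the
pid argument 0x2b7 in the backtrace is 695 in decimal.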

> --- End of trace buffer ---
> db>
>
> The full alltrace:
> http://am-productions.biz/docs/ktr-trace.txt.gz
> From the alltrace, the results for pid 695 are:
> db> bt
> Tracing pid 697 tid 100073 td 0xc2354b60
> kern_ptrace(c2354b60,a,2b7,0,11) at kern_ptrace+0xef6
> ptrace(c2354b60,cc9a9d04,4,0,23) at ptrace+0x40
> syscall(3b,3b,3b,81e9438,2b7) at syscall+0x19a
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (26, FreeBSD ELF32, ptrace), eip = 0x282c1a6b, esp =
> 0xbfbfe468, ebp = 0xbfbfe480 ---
> db> alltrace
>
> Tracing command gdb pid 697 tid 100073 td 0xc2354b60
> kern_ptrace(c2354b60,a,2b7,0,11) at kern_ptrace+0xef6
> ptrace(c2354b60,cc9a9d04,4,0,23) at ptrace+0x40
> syscall(3b,3b,3b,81e9438,2b7) at syscall+0x19a
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (26, FreeBSD ELF32, ptrace), eip = 0x282c1a6b, esp =
> 0xbfbfe468, ebp = 0xbfbfe480 ---
>
> Tracing command avidemux2 pid 695 tid 100080 td 0xc2635680
> sched_switch(c2635680,0,2,2ffd8312,ea5ba7fb) at sched_switch+0xb5
> mi_switch(2,0,c2635680,ac,c0619941) at mi_switch+0x259
> uio_yield(0,0,47000,0,c25f0074) at uio_yield+0x72
> vn_rdwr_inchunks(1,c2642840,89b1000,b37000,47000,0,0,101,c2640c00,0,0,c2635680) at vn_rdwr_inchunks+0xb4
> elf32_coredump(c2635680,c2642840,ffffffff,7fffffff) at
> elf32_coredump+0x132
> sigexit(c2635680,6,c2634294,8,c0618e65) at sigexit+0x8df
> kse_thr_interrupt(c2635680,cca0dd04,3,0,0) at kse_thr_interrupt+0x10c
> syscall(3b,3b,3b,20,0) at syscall+0x19a
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (382, FreeBSD ELF32, kse_thr_interrupt), eip = 0x28fe5603,
> esp = 0xbf8fdaec, ebp = 0xbf8fdb60 ---
>
> Tracing command avidemux2 pid 695 tid 100078 td 0xc26359c0
> sched_switch(c26359c0,c2635820,1,82b08812,3b03415f) at
> sched_switch+0xb5
> mi_switch(1,c2635820,0,c26359c0,cca13ba0) at mi_switch+0x259
> sleepq_switch(0,cca13bd0,c04c5896,c263422c,0) at sleepq_switch+0xc2
> sleepq_wait_sig(c263422c,0,100,c0618588,31f) at sleepq_wait_sig+0xc
> msleep(c263422c,c2634294,15c,c0620da6,0) at msleep+0x356
> kern_wait(c26359c0,2b8,cca13c28,0,0) at kern_wait+0x350
> wait4(c26359c0,cca13d04,4,0,0) at wait4+0x2d
> syscall(3b,3b,3b,94f1000,bfbfde90) at syscall+0x19a
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (7, FreeBSD ELF32, wait4), eip = 0x2903a067, esp =
> 0xbfbfdc04, ebp = 0xbfbfdc1c ---
>
> Tracing command avidemux2 pid 695 tid 100077 td 0xc2635b60
> sched_switch(c2635b60,0,1,cbf0f192,13141da1) at sched_switch+0xb5
> mi_switch(1,0,0,c2635b60,cca16c04) at mi_switch+0x259
> sleepq_switch(0,c2635b60,cca16c38,c04c595a,c26342b4) at
> sleepq_switch+0xc2
> sleepq_timedwait_sig(c26342b4) at sleepq_timedwait_sig+0xd
> msleep(c26342b4,c2634294,168,c0618e91,bb9) at msleep+0x41a
> kse_release(c2635b60,cca16d04,1,0,1) at kse_release+0xb8
> syscall(3b,3b,3b,81,97c3200) at syscall+0x19a
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (383, FreeBSD ELF32, kse_release), eip = 0x28fe55c3, esp =
> 0xbf9fef78, ebp = 0xbf9fefa8 ---

Given that one thread is doing a coredump, I bet someone tried to enter
single-threading mode. Single-threading mode sets P_STOPPED_SINGLE _without_
setting p_xthread, so P_SHOULDSTOP() is true but p_xthread is NULL. I guess
thread_single() should set both p_singlethread and p_xthread?
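
To make the suspected sequence concrete, here is a minimal userland model of
it. It is a sketch under stated assumptions, not the kernel code: the structs
are cut down to the fields named above, the flag values are illustrative
rather than the real ones from <sys/proc.h>, and model_thread_single() and
model_clear_xsig() are hypothetical stand-ins for thread_single() and the
faulting line in kern_ptrace(). The faulting instruction, movl 0x68(%eax),
with tf_eax = 0 and a fault address of 0x68, is consistent with reading a
field through a NULL p_xthread.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values only; the real ones live in <sys/proc.h>. */
#define P_STOPPED_SIG		0x1
#define P_STOPPED_TRACE		0x2
#define P_STOPPED_SINGLE	0x4
#define P_STOPPED	(P_STOPPED_SIG | P_STOPPED_TRACE | P_STOPPED_SINGLE)
#define P_SHOULDSTOP(p)	((p)->p_flag & P_STOPPED)
#define TDF_XSIG		0x1

/* Cut-down stand-ins for the kernel structures. */
struct thread {
	int td_flags;
};

struct proc {
	int p_flag;
	struct thread *p_xthread;
	struct thread *p_singlethread;
};

/*
 * Model of entering single-threading mode.  With also_set_xthread == 0
 * it mirrors the suspected current behaviour; with 1 it mirrors the
 * suggested change of setting p_xthread as well.
 */
void
model_thread_single(struct proc *p, struct thread *td, int also_set_xthread)
{
	assert(p != NULL && td != NULL);
	p->p_flag |= P_STOPPED_SINGLE;
	p->p_singlethread = td;
	if (also_set_xthread)
		p->p_xthread = td;
}

/*
 * Model of "p->p_xthread->td_flags &= ~TDF_XSIG".  Returns -1 where the
 * real code would dereference a NULL p_xthread, 0 otherwise.
 */
int
model_clear_xsig(struct proc *p)
{
	assert(p != NULL);
	if (!P_SHOULDSTOP(p))
		return (0);
	if (p->p_xthread == NULL)
		return (-1);
	p->p_xthread->td_flags &= ~TDF_XSIG;
	return (0);
}
```

In this model, calling model_thread_single() with also_set_xthread set to 0
leaves the process in the state described above, and model_clear_xsig() then
reports the NULL p_xthread instead of faulting. Whether setting p_xthread in
thread_single() is the right fix, rather than checking for NULL in
kern_ptrace(), is still an open question.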

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Mon Dec 19 2005 - 18:46:31 UTC