Re: panic on application core dump?

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Sat, 21 Feb 2015 23:17:12 +0200

On Sat, Feb 21, 2015 at 12:27:22PM -0800, Sean Bruno wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> Well, this is new.  It looks like current panic'd when trying to dump a
> core from a qemu crash?  I can leave this at the debugger for now as
> this is a machine doing mips package builds and is not "production".
> 
> sean
> 
> Thu Feb 19 18:50:59 UTC 2015
> 
> FreeBSD/amd64 (dirty.ysv.freebsd.org) (ttyu0)
> 
> login: Feb 20 08:06:05 dirty sshd[51311]: fatal: Read from socket
> failed: Connection reset by peer [preauth]
> Feb 20 16:47:29 dirty su: sbruno to root on /dev/pts/1
> Feb 21 02:15:44 dirty sshd[95051]: fatal: Read from socket failed:
> Connection reset by peer [preauth]
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 15; apic id = 35
> fault virtual address   = 0x380
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff809b2ed1
> stack pointer           = 0x28:0xfffffe046a3a30f0
> frame pointer           = 0x28:0xfffffe046a3a3170
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 42563 (qemu-mips64)
> [ thread pid 42563 tid 100956 ]
> Stopped at      __mtx_lock_sleep+0xb1:  movl    0x380(%rax),%ecx
> db> bt
> Tracing pid 42563 tid 100956 td 0xfffff80109a214a0
> __mtx_lock_sleep() at __mtx_lock_sleep+0xb1/frame 0xfffffe046a3a3170
> vref() at vref+0x6d/frame 0xfffffe046a3a31a0
> vn_fullpath1() at vn_fullpath1+0x62/frame 0xfffffe046a3a3200
> vn_fullpath_global() at vn_fullpath_global+0x6e/frame 0xfffffe046a3a3240
> sigexit() at sigexit+0xa22/frame 0xfffffe046a3a34f0
> sendsig() at sendsig+0x65e/frame 0xfffffe046a3a3960
> trapsignal() at trapsignal+0x2f7/frame 0xfffffe046a3a39e0
> trap() at trap+0x3ba/frame 0xfffffe046a3a3bf0
> calltrap() at calltrap+0x8/frame 0xfffffe046a3a3bf0
> - --- trap 0xc, rip = 0x600334bc, rsp = 0x7ffbffe19990, rbp =
> 0x7ffffffe4a20 ---
> db> p vref+0x6d
> ffffffff80a876cd

Err.  Is it easily reproducible in your setup?
The core file vnode is indeed closed, and so unreferenced, before the
devctl notification is sent, and the notification path then calls
vn_fullpath_global() on it.

Try this.

diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
index 41da3dd..57f66b0 100644
--- a/sys/kern/kern_sig.c
+++ b/sys/kern/kern_sig.c
@@ -3310,7 +3310,7 @@ coredump(struct thread *td)
 	    vattr.va_nlink != 1 || (vp->v_vflag & VV_SYSTEM) != 0) {
 		VOP_UNLOCK(vp, 0);
 		error = EFAULT;
-		goto close;
+		goto out;
 	}

 	VOP_UNLOCK(vp, 0);
@@ -3347,17 +3347,12 @@ coredump(struct thread *td)
 		VOP_ADVLOCK(vp, (caddr_t)p, F_UNLCK, &lf, F_FLOCK);
 	}
 	vn_rangelock_unlock(vp, rl_cookie);
-close:
-	error1 = vn_close(vp, FWRITE, cred, td);
-	if (error == 0)
-		error = error1;
-	else
-		goto out;
+
 	/*
 	 * Notify the userland helper that a process triggered a core dump.
 	 * This allows the helper to run an automated debugging session.
 	 */
-	if (coredump_devctl == 0)
+	if (error != 0 || coredump_devctl == 0)
 		goto out;
 	len = MAXPATHLEN * 2 + sizeof(comm_name) - 1 +
 	    sizeof(' ') + sizeof(core_name) - 1;
@@ -3377,6 +3372,9 @@ close:
 	strlcat(data, fullpath, len);
 	devctl_notify("kernel", "signal", "coredump", data);
 out:
+	error1 = vn_close(vp, FWRITE, cred, td);
+	if (error == 0)
+		error = error1;
 #ifdef AUDIT
 	audit_proc_coredump(td, name, error);
 #endif
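
For illustration only, the ordering the patch aims for can be modelled in
userland C.  This is a sketch, not kernel code: struct fake_vnode,
fake_close(), fake_notify() and fake_coredump() are hypothetical stand-ins
for the vnode, vn_close(), the vn_fullpath_global()/devctl_notify() step and
coredump().  The point is only that the notification runs while the
reference is still held, and that the close happens once, at the common
exit, on both the success and the error path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: nothing but a reference count. */
struct fake_vnode {
	int refs;
};

/* Stand-in for vn_close(): drops the reference taken when the file was opened. */
static int
fake_close(struct fake_vnode *vp)
{
	vp->refs--;
	return (0);
}

/*
 * Stand-in for the path lookup plus notification.  It is only safe while
 * the vnode is still referenced, so it reports whether that was the case.
 */
static bool
fake_notify(const struct fake_vnode *vp)
{
	return (vp->refs > 0);
}

/*
 * Model of the patched control flow: notify first (only on success and
 * only when enabled), then close at the single exit label.  *safe is set
 * to false if the notification ever ran on an unreferenced vnode.
 */
static int
fake_coredump(struct fake_vnode *vp, int error, int notify_enabled, bool *safe)
{
	int error1;

	*safe = true;
	if (error != 0 || notify_enabled == 0)
		goto out;
	*safe = fake_notify(vp);
out:
	error1 = fake_close(vp);
	if (error == 0)
		error = error1;
	return (error);
}
```

In this model, swapping the order so that fake_close() runs before
fake_notify() makes fake_notify() see refs == 0, which corresponds to the
situation in the backtrace above.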