Re: kernel panic caused by virtualbox(?)

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Mon, 8 Aug 2016 21:37:43 +0300
On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
> > Reposted to -current to get some more eyes on this ...
> > 
> > I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
> > The host is:
> > 	FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
> > The virtualbox version is:
> > 	virtualbox-ose-5.0.26
> > 	virtualbox-ose-kmod-5.0.26_1
> > 
> > The panic message is:
> > 
> > panic: Unregistered use of FPU in kernel
> > cpuid = 1
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe085a55d030
> > vpanic() at vpanic+0x182/frame 0xfffffe085a55d0b0
> > kassert_panic() at kassert_panic+0x126/frame 0xfffffe085a55d120
> > trap() at trap+0x7ae/frame 0xfffffe085a55d330
> > calltrap() at calltrap+0x8/frame 0xfffffe085a55d330
> > --- trap 0x16, rip = 0xffffffff827dd3a9, rsp = 0xfffffe085a55d408, rbp = 0xfffffe085a55d430 ---
> > g_pLogger() at 0xffffffff827dd3a9/frame 0xfffffe085a55d430
> > g_pLogger() at 0xffffffff8274e5c7/frame 0x3
> > KDB: enter: panic
> > 
> > Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
> > the trigger.
> > 
> > There are no symbols for the virtualbox kmods, possibly because I
> > installed them as an upgrade using packages (built with the same source
> > tree version) instead of by using PORTS_MODULES in make.conf, so ports
> > kgdb didn't have anything useful to say about what happened before the
> > trap.
> > 
> > This panic is very repeatable.  I just got another one when starting the
> > same VM, but this time the two calls before the trap were
> > null_bug_bypass().  Hmm, that symbol is in nullfs ...
> > 
> > I don't see this with a Windows 7 VM.
> > 
> > All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
> > -msoft-float -mno-aes -mno-avx
Your disassembly listed the fxrstor instruction as the one that fails, or
did I misremember? This is most likely some context switch code, either
by the virtual machine or erroneously executed guest code. It is not a
spontaneous use of the FPU, but more likely something different. Can you
confirm?

In either case, I do not remember any KBI changes around PCB layout or
fpu_enter() KPI recently.

> 
> I suspect head packages are quite likely built against a "wrong" KBI
> and are too fragile to use for kmods vs compiling from ports. :-/  I would
> try a built-from-ports kmod to see if the panics go away.

FWIW, I will commit the following change shortly. Since third-party
modules break the invariant, either due to bugs (ndis wrappers) or
possibly due to KBI breakage, it is worth having the detection enabled
for production kernels.

diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
index 1b85b32..04c5dcc 100644
--- a/sys/amd64/amd64/trap.c
+++ b/sys/amd64/amd64/trap.c
@@ -443,8 +443,8 @@ trap(struct trapframe *frame)
 			goto out;
 
 		case T_DNA:
-			KASSERT(!PCB_USER_FPU(td->td_pcb),
-			    ("Unregistered use of FPU in kernel"));
+			if (PCB_USER_FPU(td->td_pcb))
+				panic("Unregistered use of FPU in kernel");
 			fpudna();
 			goto out;
 
diff --git a/sys/i386/i386/trap.c b/sys/i386/i386/trap.c
index 40f7204..c540a49 100644
--- a/sys/i386/i386/trap.c
+++ b/sys/i386/i386/trap.c
@@ -540,8 +540,8 @@ trap(struct trapframe *frame)
 
 		case T_DNA:
 #ifdef DEV_NPX
-			KASSERT(!PCB_USER_FPU(td->td_pcb),
-			    ("Unregistered use of FPU in kernel"));
+			if (PCB_USER_FPU(td->td_pcb))
+				panic("Unregistered use of FPU in kernel");
 			if (npxdna())
 				goto out;
 #endif
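
For readers unfamiliar with why the patch swaps KASSERT for an explicit
test: KASSERT is only compiled in when the kernel is built with
"options INVARIANTS", so the old check disappears in production kernels,
while the new if/panic form is always present. The following is a rough
userland sketch of that difference, not kernel code. Every name in it
(fake_pcb, fake_panic, FAKE_KASSERT, FAKE_INVARIANTS, dna_check_old,
dna_check_new) is a hypothetical stand-in, and fake_panic only counts
and logs instead of halting so that the behavior can be observed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-thread PCB state tested in trap(). */
struct fake_pcb {
	bool user_fpu;	/* models PCB_USER_FPU(pcb) evaluating to true */
};

/* Counts violations instead of halting, so the sketch can be exercised. */
static int fake_panic_count;

static void
fake_panic(const char *msg)
{
	fake_panic_count++;
	fprintf(stderr, "panic: %s\n", msg);
}

/*
 * Modeled on KASSERT: the check exists only when the (hypothetical)
 * FAKE_INVARIANTS macro is defined at build time, otherwise it
 * compiles to nothing.
 */
#ifdef FAKE_INVARIANTS
#define	FAKE_KASSERT(exp, msg) do { if (!(exp)) fake_panic(msg); } while (0)
#else
#define	FAKE_KASSERT(exp, msg) do { } while (0)
#endif

/* Old form: detection depends on the build option. Returns panics raised. */
static int
dna_check_old(const struct fake_pcb *pcb)
{
	int before = fake_panic_count;

	(void)pcb;	/* unused when FAKE_KASSERT compiles to nothing */
	FAKE_KASSERT(!pcb->user_fpu, "Unregistered use of FPU in kernel");
	return (fake_panic_count - before);
}

/* New form: detection is unconditional. Returns panics raised. */
static int
dna_check_new(const struct fake_pcb *pcb)
{
	int before = fake_panic_count;

	if (pcb->user_fpu)
		fake_panic("Unregistered use of FPU in kernel");
	return (fake_panic_count - before);
}
```

Built without -DFAKE_INVARIANTS, dna_check_old() stays silent for a pcb
with user_fpu set, while dna_check_new() reports it. Built with
-DFAKE_INVARIANTS, both report it.
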
Received on Mon Aug 08 2016 - 16:37:49 UTC
