Re: kernel panic caused by virtualbox(?)

From: Jung-uk Kim <jkim_at_FreeBSD.org> Date: Wed, 10 Aug 2016 19:40:31 -0400

On 08/09/16 05:12 AM, Konstantin Belousov wrote:
> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
>> On  8 Aug, Konstantin Belousov wrote:
>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>>>> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>>>>> Reposted to -current to get some more eyes on this ...
>>>>>
>>>>> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>>>>> The host is:
>>>>> 	FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>>>>> The virtualbox version is:
>>>>> 	virtualbox-ose-5.0.26
>>>>> 	virtualbox-ose-kmod-5.0.26_1
>>>>>
>>>>> The panic message is:
>>>>>
>>>>> panic: Unregistered use of FPU in kernel
>>>>> cpuid = 1
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe085a55d030
>>>>> vpanic() at vpanic+0x182/frame 0xfffffe085a55d0b0
>>>>> kassert_panic() at kassert_panic+0x126/frame 0xfffffe085a55d120
>>>>> trap() at trap+0x7ae/frame 0xfffffe085a55d330
>>>>> calltrap() at calltrap+0x8/frame 0xfffffe085a55d330
>>>>> --- trap 0x16, rip = 0xffffffff827dd3a9, rsp = 0xfffffe085a55d408, rbp = 0xfffffe085a55d430 ---
>>>>> g_pLogger() at 0xffffffff827dd3a9/frame 0xfffffe085a55d430
>>>>> g_pLogger() at 0xffffffff8274e5c7/frame 0x3
>>>>> KDB: enter: panic
>>>>>
>>>>> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>>>>> the trigger.
>>>>>
>>>>> There are no symbols for the virtualbox kmods, possibly because I
>>>>> installed them as an upgrade using packages (built with the same source
>>>>> tree version) instead of by using PORTS_MODULES in make.conf, so ports
>>>>> kgdb didn't have anything useful to say about what happened before the
>>>>> trap.
>>>>>
>>>>> This panic is very repeatable.  I just got another one when starting the
>>>>> same VM, but this time the two calls before the trap were
>>>>> null_bug_bypass().  Hmm, that symbol is in nullfs ...
>>>>>
>>>>> I don't see this with a Windows 7 VM.
>>>>>
>>>>> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>>>>> -msoft-float -mno-aes -mno-avx
>>> Your disassembly showed the fxrstor instruction as the one failing, or am
>>> I misremembering? This is most likely some context switch code, either
>>> by the virtual machine or erroneously executed guest code. It is not a
>>> spontaneous use of the FPU, but more likely something different. Can you
>>> confirm?
>>>
>>> In either case, I do not remember any KBI changes around PCB layout or
>>> fpu_enter() KPI recently.
>>>
>>>>
>>>> I suspect head packages are quite likely built against a "wrong" KBI
>>>> and are too fragile to use for kmods vs compiling from ports. :-/  I would
>>>> try a built-from-ports kmod to see if the panics go away.
>>>
>>> FWIW, I will commit the following change shortly. Since third-party
>>> modules break the invariant, either due to bugs (ndis wrappers) or
>>> possibly due to KBI breakage, it is worth having the detection enabled
>>> for production kernels.
>>
>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
>> GENERIC kernel and the guest seemed to operate properly.  Then I enabled
>> INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
>> across this before.
>>
> This is yet another reason to promote the KASSERT to a full panic.
> I expect that the vbox source lacks fpu_kern_enter() calls around the
> FPU state restoration.

Unfortunately, the code is in MI source as it is unnecessary for
supported OSes (read: FreeBSD is not supported) and it's not easy to
inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(
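
To illustrate the bracketing discussed above, here is a minimal userland
sketch of the invariant, not kernel code and not vbox code. Every mock_*
name and both restore_* helpers are hypothetical stand-ins invented for
this example; only fpu_kern_enter()/fpu_kern_leave() are names from this
thread, and the mock does not reproduce their real signatures. The model
counts violations instead of halting, so both paths can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-thread FPU bookkeeping. */
struct mock_thread {
	bool fpu_registered;	/* true between mock enter and leave */
	int  violations;	/* counted here; a real kernel would panic */
};

/* Models fpu_kern_enter(): declare that kernel code will use the FPU. */
static void
mock_fpu_kern_enter(struct mock_thread *td)
{
	td->fpu_registered = true;
}

/* Models fpu_kern_leave(): must pair with a preceding enter. */
static void
mock_fpu_kern_leave(struct mock_thread *td)
{
	assert(td->fpu_registered);
	td->fpu_registered = false;
}

/* Models an FPU state restore (e.g. fxrstor) executed in kernel mode. */
static void
mock_fpu_restore(struct mock_thread *td)
{
	if (!td->fpu_registered)
		td->violations++;	/* "Unregistered use of FPU in kernel" */
}

/* Restore without bracketing: the situation suspected in this thread. */
static int
restore_unbracketed(void)
{
	struct mock_thread td = { false, 0 };

	mock_fpu_restore(&td);
	return (td.violations);
}

/* Restore wrapped in enter/leave: the invariant holds. */
static int
restore_bracketed(void)
{
	struct mock_thread td = { false, 0 };

	mock_fpu_kern_enter(&td);
	mock_fpu_restore(&td);
	mock_fpu_kern_leave(&td);
	return (td.violations);
}
```

In this model restore_unbracketed() records one violation and
restore_bracketed() records none. The hard part in practice is finding a
place in the shared source where such a bracket can be added.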

Jung-uk Kim