On 11 Aug, Konstantin Belousov wrote:
> On Wed, Aug 10, 2016 at 04:47:15PM -0700, Don Lewis wrote:
>> On 10 Aug, Jung-uk Kim wrote:
>> > On 08/09/16 05:12 AM, Konstantin Belousov wrote:
>> >> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
>> >>> On 8 Aug, Konstantin Belousov wrote:
>> >>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>> >>>>> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>> >>>>>> Reposted to -current to get some more eyes on this ...
>> >>>>>>
>> >>>>>> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>> >>>>>> The host is:
>> >>>>>> FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>> >>>>>> The virtualbox version is:
>> >>>>>> virtualbox-ose-5.0.26
>> >>>>>> virtualbox-ose-kmod-5.0.26_1
>> >>>>>>
>> >>>>>> The panic message is:
>> >>>>>>
>> >>>>>> panic: Unregistered use of FPU in kernel
>> >>>>>> cpuid = 1
>> >>>>>> KDB: stack backtrace:
>> >>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe085a55d030
>> >>>>>> vpanic() at vpanic+0x182/frame 0xfffffe085a55d0b0
>> >>>>>> kassert_panic() at kassert_panic+0x126/frame 0xfffffe085a55d120
>> >>>>>> trap() at trap+0x7ae/frame 0xfffffe085a55d330
>> >>>>>> calltrap() at calltrap+0x8/frame 0xfffffe085a55d330
>> >>>>>> --- trap 0x16, rip = 0xffffffff827dd3a9, rsp = 0xfffffe085a55d408, rbp = 0xfffffe085a55d430 ---
>> >>>>>> g_pLogger() at 0xffffffff827dd3a9/frame 0xfffffe085a55d430
>> >>>>>> g_pLogger() at 0xffffffff8274e5c7/frame 0x3
>> >>>>>> KDB: enter: panic
>> >>>>>>
>> >>>>>> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>> >>>>>> the trigger.
>> >>>>>>
>> >>>>>> There are no symbols for the virtualbox kmods, possibly because I
>> >>>>>> installed them as an upgrade using packages (built with the same source
>> >>>>>> tree version) instead of by using PORTS_MODULES in make.conf, so ports
>> >>>>>> kgdb didn't have anything useful to say about what happened before the
>> >>>>>> trap.
>> >>>>>>
>> >>>>>> This panic is very repeatable. I just got another one when starting the
>> >>>>>> same VM., but this time the two calls before the trap were
>> >>>>>> null_bug_bypass(). Hmn, that symbol is in nullfs ...
>> >>>>>>
>> >>>>>> I don't see this with a Windows 7 VM.
>> >>>>>>
>> >>>>>> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>> >>>>>> -msoft-float -mno-aes -mno-avx
>> >>>> Your disassemble listed fxrstor instruction that failing, or did I
>> >>>> mis-remembered ? This is most likely some context switch code, either
>> >>>> by virtual machine or erronously executed guest code. It is not a
>> >>>> spontaneous use of FPU, but more likely something different. Can you
>> >>>> confirm ?
>> >>>>
>> >>>> In either case, I do not remember any KBI changes around PCB layout or
>> >>>> fpu_enter() KPI recently.
>> >>>>
>> >>>>>
>> >>>>> I suspect head packages are quite likely built against the a "wrong" KBI
>> >>>>> and are too fragile to use for kmods vs compiling from ports. :-/ I would
>> >>>>> try a built-from-ports kmod to see if the panics go away.
>> >>>>
>> >>>> FWIW, I will commit the following change shortly. Since third-party
>> >>>> modules break the invariant, either due to bugs (ndis wrappers) or
>> >>>> possibly due to KBI breakage, it is worth to have the detection enabled
>> >>>> for production kernels.
>> >>>
>> >>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
>> >>> GENERIC kernel and the guest seemed to operate properly. Then I enabled
>> >>> INVARIANTS and got the panic. I suspect that is why nobody has stumbled
>> >>> across this before.
>> >>>
>> >> This is yet another reason to promote KASSERT to the full panic.
>> >> I expect that the vbox source lacks fpu_kern_enter() calls around the
>> >> FPU state restoration.
>> >
>> > Unfortunately, the code is in MI source as it is unnecessary for
>> > supported OSes (read: FreeBSD is not supported) and it's not easy to
>> > inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(
>>
>> It's a headache, but our ports can use patch files for that sort of
>> thing ...
>
> Note that it is, most likely, completely useless to wrap single
> FXRSTOR instruction into the fpu_kern_enter() braces. The purpose of
> the instruction is to load ('legacy', as they call it, no AVX+) FPU state
> into the machine context. If you put fpu_kern_leave() right after
> the instruction, the context is flushed.

Since it looks like the code is preparing to re-enter the guest,
calling fpu_kern_leave() at that point doesn't make sense.

> There must be some larger scope where the braces do make sense. And since
> some other OSes do require similar precautions around the in-kernel FPU
> access, I suspect that there should be some common place to put our KPI
> calls.

CPUMSetGuestXcr0() is the first stack frame. It wouldn't seem to make
sense to call fpu_kern_enter() unless ASMXRstor() is going to be called,
and the tests for that are right before the call. However, the comments
above this function say:

 * Will load additional state if the FPU state is already loaded (in ring-0 &
 * raw-mode context).

so it does look like something wasn't done before we got to this point.

Received on Thu Aug 11 2016 - 20:22:56 UTC
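
The point about bracket scope in the message above can be illustrated with a
small userland model. This is only a sketch under loud assumptions: every
name below (struct model_cpu, model_fpu_enter(), model_fpu_leave(),
model_fxrstor(), and the three scenario functions) is hypothetical and is
neither the FreeBSD fpu_kern KPI nor VirtualBox code. The FPU state is
reduced to a single int, and "leave" is modeled as simply putting back
whatever was live at "enter", which is a simplification of what a real
kernel does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; not the real FreeBSD or VirtualBox symbols. */
struct model_cpu {
    bool fpu_registered; /* true between model_fpu_enter() and model_fpu_leave() */
    int  fpu_regs;       /* stand-in for the live FPU register contents */
    int  saved_regs;     /* state captured at enter, put back at leave */
};

/* Model of "enter": remember the current state and register FPU use. */
static void model_fpu_enter(struct model_cpu *c)
{
    c->saved_regs = c->fpu_regs;
    c->fpu_registered = true;
}

/* Model of "leave": discard whatever was loaded since enter. */
static void model_fpu_leave(struct model_cpu *c)
{
    assert(c->fpu_registered); /* leave without enter is a caller bug */
    c->fpu_regs = c->saved_regs;
    c->fpu_registered = false;
}

/*
 * Model of the state load. Returns false where the kernel in the thread
 * would panic with "Unregistered use of FPU in kernel".
 */
static bool model_fxrstor(struct model_cpu *c, int guest_state)
{
    if (!c->fpu_registered)
        return false;
    c->fpu_regs = guest_state;
    return true;
}

/* No bracket at all: the load is rejected. */
bool unbracketed_restore_ok(int host_state, int guest_state)
{
    struct model_cpu c = { false, host_state, 0 };
    return model_fxrstor(&c, guest_state);
}

/* Bracket only around the load: leave flushes the guest state again. */
int narrow_bracket_guest_sees(int host_state, int guest_state)
{
    struct model_cpu c = { false, host_state, 0 };
    model_fpu_enter(&c);
    (void)model_fxrstor(&c, guest_state);
    model_fpu_leave(&c);
    return c.fpu_regs; /* what the guest would run with afterwards */
}

/* Bracket around load and guest run: the guest sees its own state. */
int wide_bracket_guest_sees(int host_state, int guest_state)
{
    struct model_cpu c = { false, host_state, 0 };
    int seen;
    model_fpu_enter(&c);
    (void)model_fxrstor(&c, guest_state);
    seen = c.fpu_regs; /* the guest would run here, inside the bracket */
    model_fpu_leave(&c);
    return seen;
}
```

In this model, a bracket placed only around the load hands the guest the host
state, while a bracket that also spans the guest run hands it the state that
was just loaded. Where such a wider scope would sit in the real VirtualBox
source is exactly the open question in the thread, and the model does not
answer it.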