Re: panic: Unregistered use of FPU in kernel

From: Alan Somers <asomers_at_freebsd.org>
Date: Thu, 26 Sep 2019 14:51:29 -0600

You're right, cem.  gdb and ddb show the same data.  Here it is from gdb:
0xffffffff8113b30e <sse42_crc32c+142>: 0xf2 0x48 0x0f 0x38 0xf1 0xde 0xf2 0x48
0xffffffff8113b316 <sse42_crc32c+150>: 0x0f 0x38 0xf1 0xc7 0x48 0x8b 0x32 0xf2
0xffffffff8113b31e <sse42_crc32c+158>: 0x4c 0x0f 0x38 0xf1 0xde 0x48 0x8d 0x72
0xffffffff8113b326 <sse42_crc32c+166>: 0x08 0x48 0x81 0xc2 0x08 0xff 0xff 0xff
0xffffffff8113b32e <sse42_crc32c+174>: 0x4c 0x39 0xca 0x72 0xcd 0x44 0x0f 0xb6
0xffffffff8113b336 <sse42_crc32c+182>: 0xc9 0x0f 0xb6 0xfd 0x89 0xca 0xc1 0xea
0xffffffff8113b33e <sse42_crc32c+190>: 0x10 0x0f 0xb6 0xd2 0xc1 0xe9 0x18 0x42
0xffffffff8113b346 <sse42_crc32c+198>: 0x33 0x1c 0x8d 0x80 0x11 0xf9 0x81 0x33
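
As a sanity check on the dump above, here is a minimal Python sketch that
decodes the crc32q encodings (F2 REX.W 0F 38 F1 /r) and the sign-extended
immediate of the add at +167 by hand.  The helper names are hypothetical and
not part of gdb, ddb, or the kernel; the sketch handles only the
register-to-register forms seen in this dump and is not a general
disassembler.

```python
# Standard x86-64 register numbering (index = 4-bit register number).
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

# Bytes from sse42_crc32c+142 through +173, copied from the gdb dump.
dump = bytes.fromhex(
    "f2480f38f1de"    # +142
    "f2480f38f1c7"    # +148
    "488b32"          # +154
    "f24c0f38f1de"    # +157
    "488d7208"        # +163
    "4881c208ffffff"  # +167
)

def decode_crc32q(code):
    """Decode a register-direct crc32q; return AT&T syntax or None."""
    if len(code) < 6:
        return None
    prefix, rex, op1, op2, op3, modrm = code[:6]
    if prefix != 0xF2 or (op1, op2, op3) != (0x0F, 0x38, 0xF1):
        return None
    if (rex & 0xF8) != 0x48:      # REX prefix with W=1 (0100 1xxx)
        return None
    if (modrm >> 6) != 0b11:      # register-direct operands only
        return None
    dst = (((rex >> 2) & 1) << 3) | ((modrm >> 3) & 7)  # REX.R extends reg
    src = ((rex & 1) << 3) | (modrm & 7)                # REX.B extends r/m
    return f"crc32q %{REGS64[src]},%{REGS64[dst]}"

def add_imm32_as_qword(code):
    """For 'REX.W 81 /0 id' (add r/m64, imm32), return the immediate
    sign-extended to 64 bits, or None if the bytes do not match."""
    if len(code) < 7 or code[0] != 0x48 or code[1] != 0x81:
        return None
    if (code[2] >> 6) != 0b11 or ((code[2] >> 3) & 7) != 0:  # /0 is add
        return None
    imm = int.from_bytes(code[3:7], "little", signed=True)
    return imm & 0xFFFFFFFFFFFFFFFF

for offset in (0, 6, 15):
    print(f"+{142 + offset}: {decode_crc32q(dump[offset:offset + 6])}")
print(f"+167: add ${add_imm32_as_qword(dump[25:32]):#x},%rdx")
```

The decoded operands agree with the gdb listing quoted below, which is
consistent with the text not being corrupted and with the truncated "add"
constant being only a display problem in ddb.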

Here are the last few console messages from before the panic:
virtio_pci1: <VirtIO PCI Console adapter> port 0xc000-0xc03f mem 0xfc098000-0xfc098fff,0xfebf4000-0xfebf7fff irq 10 at device 6.0 on pci0
virtio_pci2: <VirtIO PCI Block adapter> port 0xc040-0xc07f mem 0xfc099000-0xfc099fff,0xfebf8000-0xfebfbfff irq 11 at device 7.0 on pci0
vtblk0: <VirtIO Block Adapter> on virtio_pci2
vtblk0: 34816MB (71303296 512 byte sectors)
virtio_pci3: <VirtIO PCI Balloon adapter> port 0xc120-0xc13f mem 0xfebfc000-0xfebfffff irq 11 at device 8.0 on pci0
vtballoon0: <VirtIO Balloon Adapter> on virtio_pci3
acpi_syscontainer0: <System Container> on acpi0
acpi_syscontainer1: <System Container> port 0xaf00-0xaf0b on acpi0
acpi_syscontainer2: <System Container> port 0xafe0-0xafe3 on acpi0
acpi_syscontainer3: <System Container> port 0xae00-0xae13 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
panic: Unregistered use of FPU in kernel

On Thu, Sep 26, 2019 at 2:19 PM Conrad Meyer <cem_at_freebsd.org> wrote:

> This kinda just looks like ddb doesn't know how to disassemble crc32q?
>  Which might not be too surprising.
>
> Note that it also truncates the qword constant in "add" at +167/+0xa7.
> That one isn't corruption; just a DDB bug.
>
> Can you print the faulting %rip and dump a few bytes at that address
> in both ddb and gdb (assuming ddb can't disassemble crc32q)?
>
> Best,
> Conrad
>
> On Thu, Sep 26, 2019 at 1:12 PM Alan Somers <asomers_at_freebsd.org> wrote:
> >
> > On Thu, Sep 26, 2019 at 11:29 AM Konstantin Belousov <kostikbel_at_gmail.com> wrote:
> >
> > > On Thu, Sep 26, 2019 at 11:20:51AM -0600, Alan Somers wrote:
> > > > On Thu, Sep 26, 2019 at 11:02 AM Konstantin Belousov <kostikbel_at_gmail.com> wrote:
> > > >
> > > > > On Thu, Sep 26, 2019 at 09:45:43AM -0600, Alan Somers wrote:
> > > > > > The latest VM snapshot (FreeBSD-13.0-CURRENT-amd64-20190920-r352544.qcow2)
> > > > > > instapanics on boot:
> > > > > >
> > > > > > panic: Unregistered use of FPU in kernel
> > > > > >
> > > > > > stack trace:
> > > > > > ...
> > > > > > sse42_crc32c
> > > > > > readsuper
> > > > > > ffs_sbget
> > > > > > g_label_ufs_taste_common
> > > > > > g_label_taste
> > > > > > g_new_provider_event
> > > > > > g_run_events
> > > > > > fork_exit
> > > > > > ...
> > > > > >
> > > > > > Has anybody touched this area recently?  I'll try to narrow down the
> > > > > > commit range.
> > > > >
> > > > > Start with disassembling the faulting instruction.  I suspect that somehow
> > > > > vital compiler switches like -mno-sse got omitted in the build.
> > > > >
> > > >
> > > > No problem with compiler switches here.  The C file uses inline assembly to
> > > > generate a crc32q instruction, in crc32_sse42.c:257.  But why would that
> > > > generate a floating point exception?  The instruction doesn't appear to be
> > > > using any floating point registers.  This is on a Kaby Lake CPU.
> > > >
> > > > crc32q %rsi, %rbx
> > >
> > > No idea, this instruction does not generate #NM at all.
> > >
> > > Provide exact script of the panic and backtrace,
> > > together with the disassembly of the function which contained the faulted
> > > instruction.  Do disassemble from ddb, in case text was corrupted.
> > >
> >
> > Ok, here's the full stack trace:
> > #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> > #1  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:392
> > #2  0xffffffff804a1edb in db_dump (dummy=<optimized out>,
> >     dummy2=<optimized out>, dummy3=<unavailable>, dummy4=<unavailable>)
> >     at /usr/src/sys/ddb/db_command.c:575
> > #3  0xffffffff804a1c8f in db_command (last_cmdp=<optimized out>,
> >     cmd_table=<optimized out>, dopager=1)
> >     at /usr/src/sys/ddb/db_command.c:482
> > #4  0xffffffff804a1a04 in db_command_loop ()
> >     at /usr/src/sys/ddb/db_command.c:535
> > #5  0xffffffff804a4cbf in db_trap (type=<optimized out>, code=<optimized out>)
> >     at /usr/src/sys/ddb/db_main.c:252
> > #6  0xffffffff80c1e55c in kdb_trap (type=3, code=0, tf=<optimized out>)
> >     at /usr/src/sys/kern/subr_kdb.c:692
> > #7  0xffffffff811957df in trap (frame=0xfffffe00907e8d20)
> >     at /usr/src/sys/amd64/amd64/trap.c:621
> > #8  <signal handler called>
> >
> > Your guess about corrupted text was prescient.  Here is the disassembly
> > according to ddb:
> >
> > https://people.freebsd.org/~asomers/Screenshot_fbsd-head_2019-09-26_13%3A51%3A34.png
> > And here is the disassembly of the same section according to gdb:
> >    0xffffffff8113b2e0 <sse42_crc32c+96>: mov    %rsi,%r9
> >    0xffffffff8113b2e3 <sse42_crc32c+99>: sub    $0xffffffffffffff80,%r9
> >    0xffffffff8113b2e7 <sse42_crc32c+103>: add    $0x100,%rsi
> >    0xffffffff8113b2ee <sse42_crc32c+110>: mov    %r11,%rbx
> >    0xffffffff8113b2f1 <sse42_crc32c+113>: xor    %eax,%eax
> >    0xffffffff8113b2f3 <sse42_crc32c+115>: xor    %r11d,%r11d
> >    0xffffffff8113b2f6 <sse42_crc32c+118>: nopw   %cs:0x0(%rax,%rax,1)
> >    0xffffffff8113b300 <sse42_crc32c+128>: mov    %rsi,%rdx
> >    0xffffffff8113b303 <sse42_crc32c+131>: mov    -0x100(%rsi),%rsi
> >    0xffffffff8113b30a <sse42_crc32c+138>: mov    -0x80(%rdx),%rdi
> >    0xffffffff8113b30e <sse42_crc32c+142>: crc32q %rsi,%rbx
> >    0xffffffff8113b314 <sse42_crc32c+148>: crc32q %rdi,%rax
> >    0xffffffff8113b31a <sse42_crc32c+154>: mov    (%rdx),%rsi
> >    0xffffffff8113b31d <sse42_crc32c+157>: crc32q %rsi,%r11
> >    0xffffffff8113b323 <sse42_crc32c+163>: lea    0x8(%rdx),%rsi
> >    0xffffffff8113b327 <sse42_crc32c+167>: add    $0xffffffffffffff08,%rdx
> >    0xffffffff8113b32e <sse42_crc32c+174>: cmp    %r9,%rdx
> >    0xffffffff8113b331 <sse42_crc32c+177>: jb     0xffffffff8113b300 <sse42_crc32c+128>
> >    0xffffffff8113b333 <sse42_crc32c+179>: movzbl %cl,%r9d
> >    0xffffffff8113b337 <sse42_crc32c+183>: movzbl %ch,%edi
> >    0xffffffff8113b33a <sse42_crc32c+186>: mov    %ecx,%edx
> >
> > Care to guess what's causing the corruption?
> > -Alan