panic after ifioctl/if_clone_destroy

From: Roman Bogorodskiy <novel_at_freebsd.org>
Date: Sun, 5 Aug 2018 19:35:57 +0400
I am running -CURRENT r336863 on amd64 and get the following panic right
after (or during) boot:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address   = 0xdeadc2ff
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80bd7858
stack pointer           = 0x28:0xfffffe008b445580
frame pointer           = 0x28:0xfffffe008b4455c0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 903 (libvirtd)

Traceback is:

(kgdb) #0  doadump (textdump=0) at pcpu.h:230
#1  0xffffffff8043dc7b in db_dump (dummy=<value optimized out>,
    dummy2=<value optimized out>, dummy3=<value optimized out>,
    dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:574
#2  0xffffffff8043da49 in db_command (cmd_table=<value optimized out>)
    at /usr/src/sys/ddb/db_command.c:481
#3  0xffffffff8043d7c4 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:534
#4  0xffffffff804409ef in db_trap (type=<value optimized out>,
    code=<value optimized out>) at /usr/src/sys/ddb/db_main.c:252
#5  0xffffffff80bdd513 in kdb_trap (type=12, code=0, tf=<value optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:693
#6  0xffffffff810769f1 in trap_fatal (frame=0xfffffe008b4454c0, eva=3735929599)
    at /usr/src/sys/amd64/amd64/trap.c:884
#7  0xffffffff81076b12 in trap_pfault (frame=0xfffffe008b4454c0,
    usermode=<value optimized out>) at pcpu.h:230
#8  0xffffffff8107611a in trap (frame=0xfffffe008b4454c0)
    at /usr/src/sys/amd64/amd64/trap.c:427
#9  0xffffffff810518ac in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:230
#10 0xffffffff80bd7858 in epoch_block_handler_preempt (
    global=<value optimized out>, cr=0xfffffe00760c3a00,
    arg=<value optimized out>) at /usr/src/sys/kern/subr_epoch.c:256
#11 0xffffffff803994fd in ck_epoch_synchronize_wait (
    global=0xfffff800030c5680,
    cb=0xffffffff80bd77a0 <epoch_block_handler_preempt>, ct=0x0)
    at /usr/src/sys/contrib/ck/src/ck_epoch.c:407
#12 0xffffffff80bd7630 in epoch_wait_preempt (epoch=0xfffff800030c5680)
    at /usr/src/sys/kern/subr_epoch.c:389
#13 0xffffffff80c983bf in if_delgroup (ifp=0xfffff80003aab800,
    groupname=0xfffff80005ff5e00 "bridge") at /usr/src/sys/net/if.c:1514
#14 0xffffffff80c9f2b2 in if_clone_destroyif (ifc=0xfffff80005ff5e00,
    ifp=0xfffff80003aab800) at /usr/src/sys/net/if_clone.c:325
#15 0xffffffff80c9f0d5 in if_clone_destroy (name=0xfffffe008b4458d0 "virbr0")
    at /usr/src/sys/net/if_clone.c:288
#16 0xffffffff80c9a2c3 in ifioctl (so=0xfffff80007edca38, cmd=2149607801,
    data=<value optimized out>, td=<value optimized out>)
    at /usr/src/sys/net/if.c:3053
#17 0xffffffff80c04259 in kern_ioctl (td=0xfffff80007c1a580,
    fd=<value optimized out>, com=<value optimized out>,
    data=<value optimized out>) at file.h:330
#18 0xffffffff80c03f2e in sys_ioctl (td=0xfffff80007c1a580,
    uap=0xfffff80007c1a940) at /usr/src/sys/kern/sys_generic.c:712
#19 0xffffffff81077401 in amd64_syscall (td=0xfffff80007c1a580, traced=0)
    at subr_syscall.c:135
#20 0xffffffff8105218d in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:500
#21 0x00000008028f4c0a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb)
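
Two numbers in the backtrace can be decoded a bit further. The sketch below
is only an aid for reading the trace, not part of the original analysis. It
assumes the usual BSD ioctl encoding from sys/ioccom.h (direction in the top
three bits, a 13-bit parameter length, then a group byte and a number byte),
and it assumes that 0xdeadc0de is the junk pattern the debug kernel writes
into freed memory. The helper names are made up for illustration. If those
assumptions hold, cmd=2149607801 in frame #16 is 0x80206979, which appears to
correspond to SIOCIFDESTROY, and eva=3735929599 in frame #6 is 0xdeadc2ff, a
small offset from the junk pattern, which would hint at a use-after-free.

```python
# Decode the ioctl command (frame #16) and the fault address (frame #6).
# Assumption: BSD ioctl layout as in sys/ioccom.h.
IOC_VOID = 0x20000000
IOC_OUT = 0x40000000
IOC_IN = 0x80000000
IOCPARM_MASK = (1 << 13) - 1  # parameter length is a 13-bit field

DIRECTIONS = ((IOC_VOID, "void"), (IOC_OUT, "out"), (IOC_IN, "in"))


def decode_ioctl(cmd):
    """Split an ioctl command word into its encoded fields."""
    names = [name for flag, name in DIRECTIONS if cmd & flag]
    return {
        "hex": hex(cmd),
        "direction": "+".join(names),
        "length": (cmd >> 16) & IOCPARM_MASK,
        "group": chr((cmd >> 8) & 0xFF),
        "number": cmd & 0xFF,
    }


def offset_from_pattern(addr, pattern=0xDEADC0DE):
    """Distance of addr from a junk pattern (assumed: 0xdeadc0de)."""
    return addr - pattern


if __name__ == "__main__":
    print(decode_ioctl(2149607801))
    print(hex(3735929599), hex(offset_from_pattern(3735929599)))
```

A read at the junk pattern plus a small offset would be consistent with a
structure field being dereferenced through a pointer loaded from freed
memory, but the trace alone does not prove that.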

It looks like the panic happens during network-interface-related
operations. The last few dmesg lines before the panic:

Aug  5 19:02:42 romashka rtsold[585]: <rtsock_input_ifannounce> interface bridge0 removed
Aug  5 19:02:42 romashka kernel: bridge0: Ethernet address: 02:af:41:48:c7:00
Aug  5 19:02:42 romashka kernel: bridge0: changing name to 'virbr-ab'
Aug  5 19:02:42 romashka kernel: tap0: Ethernet address: 00:bd:8d:11:f7:00
Aug  5 19:02:42 romashka kernel: tap0: link state changed to UP
Aug  5 19:02:42 romashka kernel: tap0: changing name to 'virbr-ab-nic'
Aug  5 19:02:42 romashka kernel: virbr-ab-nic: promiscuous mode enabled
Aug  5 19:02:42 romashka kernel: virbr-ab: link state changed to UP
Aug  5 19:02:42 romashka rtsold[585]: <rtsock_input_ifannounce> interface tap0 removed
Aug  5 19:02:43 romashka dnsmasq[1047]: setting --bind-interfaces option because of OS limitations
Aug  5 19:02:43 romashka dnsmasq[1047]: warning: no upstream servers configured
Aug  5 19:02:43 romashka kernel: virbr-ab-nic: link state changed to DOWN
Aug  5 19:02:43 romashka kernel: virbr-ab: link state changed to DOWN
Aug  5 19:02:43 romashka kernel: bridge1: Ethernet address: 02:af:41:48:c7:01
Aug  5 19:02:43 romashka kernel: bridge1: changing name to 'virbr0'
Aug  5 19:02:43 romashka rtsold[585]: <rtsock_input_ifannounce> interface bridge1 removed
Aug  5 19:02:43 romashka kernel: tap1: Ethernet address: 00:bd:53:14:f7:01
Aug  5 19:02:43 romashka kernel: tap1: link state changed to UP
Aug  5 19:02:43 romashka kernel: tap1: changing name to 'virbr0-nic'
Aug  5 19:02:43 romashka kernel: virbr0: link state changed to UP
Aug  5 19:02:43 romashka kernel: virbr0-nic: promiscuous mode enabled
Aug  5 19:02:43 romashka rtsold[585]: <rtsock_input_ifannounce> interface tap1 removed
Aug  5 19:05:03 romashka syslogd: kernel boot file is /boot/kernel/kernel
Aug  5 19:05:03 romashka kernel:
Aug  5 19:05:03 romashka syslogd: last message repeated 1 times
Aug  5 19:05:03 romashka kernel: Fatal trap 12: page fault while in kernel mode

If I disable the libvirt service, the system finishes booting fine. On
start, libvirt creates a couple of bridge(4) and tap(4) devices, adds the
tap devices to the bridges it created, and possibly destroys these
interfaces in case of errors. It also starts dnsmasq on some of these
interfaces.
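
The startup sequence described above can be approximated by hand with
ifconfig(8). The following is a rough sketch, not libvirt's actual code
(libvirt issues the ioctls itself), and it is not known whether it triggers
the panic. The unit number and the function name are hypothetical; the
interface names are taken from the log above. By default the script only
prints the commands. Pass --run as root to execute them.

```python
# Approximate the create/rename/add/destroy sequence seen in the log.
# Assumption: plain ifconfig(8) invocations stand in for libvirt's ioctls.
import subprocess
import sys


def repro_commands(unit=1, bridge="virbr0", tap="virbr0-nic"):
    """Return the ifconfig command lines, in order, as argv lists."""
    return [
        ["ifconfig", f"bridge{unit}", "create"],
        ["ifconfig", f"bridge{unit}", "name", bridge],
        ["ifconfig", f"tap{unit}", "create"],
        ["ifconfig", f"tap{unit}", "name", tap],
        ["ifconfig", bridge, "addm", tap],   # add the tap to the bridge
        ["ifconfig", bridge, "up"],
        ["ifconfig", bridge, "destroy"],     # enters if_clone_destroy
        ["ifconfig", tap, "destroy"],
    ]


if __name__ == "__main__":
    run = "--run" in sys.argv[1:]
    for argv in repro_commands():
        print(" ".join(argv))
        if run:
            # Keep going on failure so the destroy steps still execute.
            subprocess.run(argv, check=False)
```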

This problem started to appear about 2-4 weeks ago.

Roman Bogorodskiy

Received on Sun Aug 05 2018 - 13:36:09 UTC
