Re: CFT: if_bridge performance improvements

From: Xin Li <delphij_at_delphij.net>
Date: Wed, 22 Apr 2020 01:20:57 -0700

Hi,

On 4/14/20 02:51, Kristof Provost wrote:
> Hi,
> 
> Thanks to support from The FreeBSD Foundation I’ve been able to work on
> improving the throughput of if_bridge.
> It changes the (data path) locking to use the NET_EPOCH infrastructure.
> Benchmarking shows substantial improvements (x5 in test setups).
> 
> This work is ready for wider testing now.
> 
> It’s under review here: https://reviews.freebsd.org/D24250
> 
> Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
> Patches for stable/12: https://people.freebsd.org/~kp/if_bridge/stable_12/
> 
> I’m not currently aware of any panics or issues resulting from these
> patches.

I have observed the following panic with the latest stable/12 after
applying the stable_12 patchset.  It looks like a NULL pointer
dereference caused by a race condition, but I haven't taken a deeper
look yet.

The box has 7 igb(4) NICs, with several bridges and VLANs configured,
and acts as a router.  Please let me know if you need additional
information.  I can try -CURRENT as well, but that would take some time
because the box is relatively slow.  It's a ZFS-based system, so I can
create a separate boot environment for -CURRENT if needed, but I might
have to upgrade the packages should there be any ABI breakage.

===

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x20
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c286d5
stack pointer           = 0x28:0xffffffff824cb840
frame pointer           = 0x28:0xffffffff824cb850
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1587541913
KDB: stack backtrace:
#0 0xffffffff80c117a5 at kdb_backtrace+0x65
#1 0xffffffff80bc588e at vpanic+0x17e
#2 0xffffffff80bc5703 at panic+0x43
#3 0xffffffff810d2310 at trap_pfault+0
#4 0xffffffff810d235f at trap_pfault+0x4f
#5 0xffffffff810d19b8 at trap+0x288
#6 0xffffffff810aae1c at calltrap+0x8
#7 0xffffffff80ba5c96 at __mtx_unlock_sleep+0xb6
#8 0xffffffff8248f4c7 at bridge_input+0x877
#9 0xffffffff80cd5c47 at ether_nh_input+0x207
#10 0xffffffff80cf1e4a at netisr_dispatch_src+0xca
#11 0xffffffff80cd4f0b at ether_input+0x4b
#12 0xffffffff80cdf1a3 at vlan_input+0x1f3
#13 0xffffffff80cd4ae1 at ether_demux+0x121
#14 0xffffffff80cd5d7b at ether_nh_input+0x33b
#15 0xffffffff80cf1e4a at netisr_dispatch_src+0xca
#16 0xffffffff80cd4f0b at ether_input+0x4b
#17 0xffffffff80cee41c at iflib_rxeof+0xadc
Uptime: 6m6s
Dumping 848 out of 16313 MB:..2%..12%..21%..31%..42%..51%..61%..72%..82%..91%


Backtrace:

(kgdb) #0  doadump () at src/sys/amd64/include/pcpu_aux.h:55
#1  0xffffffff80bc54a5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:451
#2  0xffffffff80bc58e6 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:880
#3  0xffffffff80bc5703 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:807
#4  0xffffffff810d2310 in trap_fatal (frame=<value optimized out>,
    eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:925
#5  0xffffffff810d235f in trap_pfault (frame=0xffffffff824cb780,
    usermode=<value optimized out>, signo=<value optimized out>,
    ucode=<value optimized out>) at src/sys/amd64/include/pcpu_aux.h:55
#6  0xffffffff810d19b8 in trap (frame=0xffffffff824cb780)
    at /usr/src/sys/amd64/amd64/trap.c:407
#7  0xffffffff810aae1c in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:289
#8  0xffffffff80c286d5 in turnstile_broadcast (ts=0x0, queue=0)
    at /usr/src/sys/kern/subr_turnstile.c:880
#9  0xffffffff80ba5c96 in __mtx_unlock_sleep (c=0xfffff80013351430, v=0)
    at /usr/src/sys/kern/kern_mutex.c:1041
#10 0xffffffff8248f4c7 in bridge_input (ifp=<value optimized out>,
    m=<value optimized out>) at src/sys/amd64/include/atomic.h:221
#11 0xffffffff80cd5c47 in ether_nh_input (m=<value optimized out>)
    at /usr/src/sys/net/if_ethersubr.c:631
#12 0xffffffff80cf1e4a in netisr_dispatch_src (proto=5,
    source=<value optimized out>, m=<value optimized out>)
    at /usr/src/sys/net/netisr.c:1124
#13 0xffffffff80cd4f0b in ether_input (ifp=0xfffff800060dc000, m=0x0)
    at /usr/src/sys/net/if_ethersubr.c:787
#14 0xffffffff80cdf1a3 in vlan_input (ifp=0xfffff800036d6800,
    m=0xfffff8001d65fc00) at /usr/src/sys/net/if_vlan.c:1291
#15 0xffffffff80cd4ae1 in ether_demux (ifp=0xfffff800036d6800,
    m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:832
#16 0xffffffff80cd5d7b in ether_nh_input (m=<value optimized out>)
    at /usr/src/sys/net/if_ethersubr.c:667
#17 0xffffffff80cf1e4a in netisr_dispatch_src (proto=5,
    source=<value optimized out>, m=<value optimized out>)
    at /usr/src/sys/net/netisr.c:1124
#18 0xffffffff80cd4f0b in ether_input (ifp=0xfffff800036d6800,
    m=0xfffff80013939c00) at /usr/src/sys/net/if_ethersubr.c:787
#19 0xffffffff80cee41c in iflib_rxeof (rxq=<value optimized out>,
    budget=<value optimized out>) at /usr/src/sys/net/iflib.c:2873
#20 0xffffffff80ce87b3 in _task_fn_rx (context=0xfffff800036d6000)
    at /usr/src/sys/net/iflib.c:3801
#21 0xffffffff80c100b1 in gtaskqueue_run_locked (queue=0xfffff8000306b900)
    at /usr/src/sys/kern/subr_gtaskqueue.c:363
#22 0xffffffff80c0fd53 in gtaskqueue_thread_loop (arg=<value optimized out>)
    at /usr/src/sys/kern/subr_gtaskqueue.c:538
#23 0xffffffff80b86b0e in fork_exit (
    callout=0xffffffff80c0fc80 <gtaskqueue_thread_loop>,
    arg=0xfffffe00003f4008, frame=0xffffffff824cbd40)
    at /usr/src/sys/kern/kern_fork.c:1079
#24 0xffffffff810abe6e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:1079
#25 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal


Received on Wed Apr 22 2020 - 06:21:10 UTC
