On Thu, Jan 12, 2017 at 1:54 AM, Matthew Macy <mmacy_at_nextbsd.org> wrote:

> > > A flame graph for the core cycle count and a flame graph with cache
> > > miss stats from pmc would be a great start.
> >
> > I didn't know the exact event name to use for cache miss stats, but
> > here are the flame graphs for CPU_CLK_UNHALTED_CORE:
> > http://dev.bsdrp.net/netgate.r311848.CPU_CLK_UNHALTED_CORE.svg
> > http://dev.bsdrp.net/netgate.r311849.CPU_CLK_UNHALTED_CORE.svg
>
> Thanks. Having twice as many txqs would definitely help. It's also clear
> that there may be some sort of performance issue in iflib_txq_drain.
> Although it could just be non-stop cache misses on the packet headers.

Any news about the performance issue in iflib_txq_drain?

On different hardware (PC Engines APU2), I see a 20% performance drop:

x head r311848: packets per second
+ head r311849: packets per second
[ministat distribution plot omitted]
    N           Min           Max        Median           Avg        Stddev
x   5        580021        588650        585676      585406.1     3550.8673
+   5        463865        467599        465428      465638.6     1437.9347
Difference at 95.0% confidence
        -119768 +/- 3950.78
        -20.4589% +/- 0.558328%
        (Student's t, pooled s = 2708.9)

Because it's an AMD processor, I couldn't find the pmc equivalent of
CPU_CLK_UNHALTED_CORE, so I used BU_CPU_CLK_UNHALTED, but I have no idea
whether it's the right one.

http://dev.bsdrp.net/apu2.r311848.BU_CPU_CLK_UNHALTED.svg
http://dev.bsdrp.net/apu2.r311849.BU_CPU_CLK_UNHALTED.svg

Thanks

Received on Mon Jan 23 2017 - 14:40:12 UTC
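For readers who want to check the ministat summary above by hand, here is a minimal Python sketch of the pooled two-sample Student's t comparison, using only the N, Avg and Stddev values printed in the post. The function name `pooled_difference` is hypothetical, and the critical value 2.306 (two-sided 95%, 8 degrees of freedom) is an assumption taken from a standard t table, not from the post.

```python
import math


def pooled_difference(n_ref, avg_ref, sd_ref, n_new, avg_new, sd_new, t_crit):
    """Return (pooled_s, difference, half_width, percent_change).

    Hypothetical helper: a pooled-variance two-sample comparison computed
    from summary statistics only, not from the raw samples.
    """
    dof = n_ref + n_new - 2
    # Pooled standard deviation of the two samples.
    pooled_var = ((n_ref - 1) * sd_ref ** 2 + (n_new - 1) * sd_new ** 2) / dof
    pooled_s = math.sqrt(pooled_var)
    # Difference of the means (new minus reference).
    difference = avg_new - avg_ref
    # Half-width of the confidence interval on that difference.
    half_width = t_crit * pooled_s * math.sqrt(1.0 / n_ref + 1.0 / n_new)
    # Difference expressed as a percentage of the reference mean.
    percent_change = 100.0 * difference / avg_ref
    return pooled_s, difference, half_width, percent_change


# Assumption: two-sided 95% critical value of Student's t for 8 degrees
# of freedom, from a standard table.
T_CRIT_95_DF8 = 2.306

# Summary values copied from the ministat output (x = r311848, + = r311849).
pooled_s, difference, half_width, percent_change = pooled_difference(
    5, 585406.1, 3550.8673,
    5, 465638.6, 1437.9347,
    T_CRIT_95_DF8,
)

print(f"pooled s   = {pooled_s:.1f}")
print(f"difference = {difference:.0f} +/- {half_width:.2f}")
print(f"change     = {percent_change:.4f}%")
```

The results should land close to the pooled s, absolute difference and percentage drop shown in the post. The percentage error bound that ministat prints alongside the percentage is not reproduced by this sketch.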