Re: New optimized soreceive_stream() for TCP sockets, proof of concept

From: Andrew Gallatin <gallatin_at_cs.duke.edu>
Date: Mon, 5 Mar 2007 13:22:54 -0500 (EST)

Andre Oppermann writes:
 > The patch is here:
 > 
 >   http://people.freebsd.org/~andre/soreceive_stream-20070302.diff
 > 
 > Any testing, especially on 10Gig cards, and feedback appreciated.

I just tested with my standard mxge setup (details and data below).
This is *awesome*.  Before your patch, the peak bandwidth was 7.5Gb/s
with jumbo frames, and 2.47Gb/s with standard frames.  After the
patch, it is 9.0Gb/s and 2.92Gb/s.  Before the patch, the bandwidth
"collapses" above a certain receive socket buffer size.  E.g., with
jumbo frames it is 7.5Gb/s with a 128KB socket buffer, but 6.2Gb/s
with 192KB or more.  Now the jumbo-frame peak is with a 192KB socket
buffer, and there is no sudden collapse above the peak!

With the patch, we finally seem to be performance competitive on the
receive side with Linux x86_64 and Solaris/amd64 on this same
hardware. Both of those OSes do much better (saturate the link with
jumbos) when CPU affinity is used to bind the interrupt handler and
netserver process to different cores on the same socket.  I imagine
FreeBSD may be able to do even better if it ever grows CPU affinity
support for both interrupt handlers and processes.  With the patch, it
performs at least as well as, if not better than, Solaris and Linux do
without CPU affinity.

Venice: AMD Athlon(tm) 64 Processor 3000+, 1.8GHz, 512KB L2
Venice: Linux 2.6.19, UP
Rome: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+, 2.0GHz, 512KB L2
Rome: -current, SMP
Motherboard (both machines): DFI Corp,  LP NF4 Series
Chipset (both machines):    Nvidia CK804
NIC (both machines): Myri 10G-PCIE-8A-C (CX4)

The tests were run from venice -> rome using:

  netperf242 -Hrome-my -tTCP_SENDFILE -F /boot/vmlinuz-2.6.9-11.EL -C -c -- -S $SOCKETBUFFER_SIZE

I took the median of 5 runs at each socket buffer size.
I really need to write some scripts..
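
For what it's worth, the median-of-5 procedure could be scripted
roughly as follows.  This is only a sketch, not what was actually run:
it assumes the result row is the last non-blank line of netperf's
output and that throughput is the fifth column, as in the tables
below.  The helper names (parse_throughput, run_once, median_run) are
made up for illustration.

```python
import subprocess

# Command line taken from the description above; the socket buffer size
# is appended per run.
NETPERF_CMD = ["netperf242", "-Hrome-my", "-tTCP_SENDFILE",
               "-F", "/boot/vmlinuz-2.6.9-11.EL", "-C", "-c", "--", "-S"]


def parse_throughput(result_line):
    """Return the throughput field (5th column, 10^6 bits/s) of a result row."""
    return float(result_line.split()[4])


def run_once(sockbuf):
    """Run netperf once and return its result row.

    Assumption: the result row is the last non-blank line of stdout.
    """
    out = subprocess.run(NETPERF_CMD + [str(sockbuf)], capture_output=True,
                         text=True, check=True).stdout
    lines = [line for line in out.splitlines() if line.strip()]
    return lines[-1]


def median_run(sockbuf, runs=5, runner=run_once):
    """Return the result row whose throughput is the median of `runs` runs.

    With an even number of runs this picks the upper of the two middle rows.
    """
    rows = sorted((runner(sockbuf) for _ in range(runs)), key=parse_throughput)
    return rows[len(rows) // 2]

# Example (needs a reachable netserver, so it is left commented out):
# for size in (65536, 98304, 131072, 196608, 262144, 524288, 1048576):
#     print(median_run(size))
```

Passing a different runner makes it possible to try the median logic
on canned output without touching the network.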

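Columns in each table below (netperf output, headers trimmed): receive
socket buffer (bytes), send socket buffer (bytes), send message size
(bytes), elapsed time (s), throughput (10^6 bits/s), sender CPU
utilization (%), receiver CPU utilization (%), sender service demand
(us/KB), receiver service demand (us/KB).

MTU 1500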
Pre:
 65536  65536  65536    10.00      2472.08   31.83    90.38    1.055   5.990 
 98304  65536  65536    10.00      2280.62   29.61    85.77    1.064   6.162 
131072  65536  65536    10.00      2190.13   28.23    84.67    1.056   6.334 
196608  65536  65536    10.01      2111.03   23.75    81.54    0.921   6.328 
262144  65536  65536    10.01      1843.48   15.87    73.46    0.705   6.529 
524288  65536  65536    10.01      1808.00   17.28    76.54    0.783   6.936 
1048576  65536  65536    10.01      1919.01   15.72    81.54    0.671   6.962 


Post:
 65536  65536  65536    10.00      2839.09   37.42    86.92    1.080   5.016 
 98304  65536  65536    10.00      2923.07   36.42    86.92    1.021   4.872 
131072  65536  65536    10.01      2694.91   30.74    86.92    0.934   5.285 
196608  65536  65536    10.00      2640.54   27.62    86.15    0.857   5.346 
262144  65536  65536    10.00      2422.32   22.23    82.69    0.752   5.593 
524288  65536  65536    10.01      2168.59   16.96    80.77    0.641   6.102 
1048576  65536  65536    10.01      2138.10   16.39    85.77    0.628   6.572 


MTU 9000
Pre:
 65536  65536  65536    10.00      4908.35   34.61    52.31    0.578   1.746 
 98304  65536  65536    10.00      6623.84   44.71    68.46    0.553   1.693 
131072  65536  65536    10.00      7504.46   41.83    77.69    0.457   1.696 
196608  65536  65536    10.00      6248.29   36.91    77.69    0.484   2.037 
262144  65536  65536    10.00      6230.94   35.51    76.92    0.467   2.023 
524288  65536  65536    10.00      6242.76   36.03    80.38    0.473   2.110 
1048576  65536  65536    10.00      6162.54   37.03    74.62    0.492   1.984 


Post:
 65536  65536  65536    10.00      4957.98   36.72    49.62    0.607   1.640 
 98304  65536  65536    10.00      6799.78   44.61    68.08    0.537   1.640 
131072  65536  65536    10.00      7998.88   46.92    73.08    0.481   1.497 
196608  65536  65536    10.00      9017.95   48.92    82.69    0.444   1.502 
262144  65536  65536    10.00      8874.62   49.52    81.15    0.457   1.498 
524288  65536  65536    10.00      8787.09   48.12    81.92    0.449   1.527 
1048576  65536  65536    10.00      8631.88   46.02    82.31    0.437   1.562 


Drew


PS: It didn't want to compile at first.  I needed to comment out
the offending 2 lines:

cc -c -O2 -frename-registers -pipe -fno-strict-aliasing  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -fformat-extensions -nostdinc -I-  -I. -I../../.. -I../../../contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000  -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone  -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow  -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror  ../../../kern/uipc_sockbuf.c
../../../kern/uipc_sockbuf.c: In function `sbpull_locked':
../../../kern/uipc_sockbuf.c:841: error: structure has no member named `sb_sndptroff'
../../../kern/uipc_sockbuf.c:842: error: structure has no member named `sb_sndptroff'
*** Error code 1

Stop in /usr/var/tmp/sys/amd64/compile/ROME.
Received on Mon Mar 05 2007 - 17:23:16 UTC
