Re: New optimized soreceive_stream() for TCP sockets, proof of concept

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Mon, 5 Mar 2007 18:29:56 +0000 (GMT)

On Mon, 5 Mar 2007, Andrew Gallatin wrote:

> With the patch, we finally seem to be performance competitive on the receive 
> side with Linux x86_64 and Solaris/amd64 on this same hardware. Both of 
> those OSes do much better (saturate the link with jumbos) when CPU affinity 
> is used to bind the interrupt handler and netserver process to different 
> cores on the same socket.  I imagine FreeBSD may be able to do even better 
> if it ever grows CPU affinity support for both interrupt handlers and 
> processes.  With the patch, it performs at least as well as, if not better 
> than, Solaris and Linux do without CPU affinity.

I don't have numbers in front of me, and am currently packing for a trip to 
Tokyo so won't find them before traveling, but my experience has been that 
binding the ithread to a specific CPU is very helpful in improving receive 
performance.  You can slap a sched_bind(0) into the interrupt handler the 
first time it runs and it should stick appropriately, and add a sysctl to 
sched_bind() for a user process as a hack to test it out.
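
As a rough user-space illustration of the same idea, and explicitly not the in-kernel sched_bind() hack described above, here is a minimal sketch that pins a process to one CPU. It assumes a host where Python exposes os.sched_setaffinity() (Linux does; other platforms may not). The helper name pin_to_cpu is made up for this example.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict pid (0 means the calling process) to a single CPU.

    Returns the resulting affinity set, or None when the platform does not
    offer sched_setaffinity() or the kernel refuses the request.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # this platform's Python has no affinity call
    try:
        os.sched_setaffinity(pid, {cpu})
        return os.sched_getaffinity(pid)
    except OSError:
        return None  # CPU not available to this process, or not permitted
```

Calling pin_to_cpu() at the start of a receiver process keeps that process on one core. The interrupt side still has to be bound separately, which is the part that needs kernel support.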

John has a patch that pins interrupt threads, etc.; I'm not sure what the 
status of that is.  CC'd.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Venice: AMD Athlon(tm) 64 Processor 3000+, 1.8GHz, 512KB L2
> Venice: Linux 2.6.19, UP
> Rome: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+, 2.0GHz, 512KB L2
> Rome: -current, SMP
> Motherboard (both machines): DFI Corp,  LP NF4 Series
> Chipset (both machines):    Nvidia CK804
> NIC (both machines): Myri 10G-PCIE-8A-C (CX4)
>
> The tests were run from venice -> rome using:
>
>   netperf242 -Hrome-my -tTCP_SENDFILE -F /boot/vmlinuz-2.6.9-11.EL -C -c -- -S $SOCKETBUFFER_SIZE
>
> I took the median of 5 runs at each socket buffer size.
> I really need to write some scripts..
>
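
The "median of 5 runs" step could be scripted along these lines. This is a sketch only: it assumes each netperf result row has the nine whitespace-separated columns seen in the tables in this message, and the field names are a guess at the usual netperf -c -C layout (receive socket size, send socket size, message size, elapsed time, throughput, local and remote CPU utilization, local and remote service demand). Strip the "> " quote prefix before feeding rows in.

```python
from statistics import median


def parse_result_line(line):
    """Split one netperf result row into named fields (layout is assumed)."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    return {
        "recv_socket_bytes": int(fields[0]),
        "send_socket_bytes": int(fields[1]),
        "send_message_bytes": int(fields[2]),
        "elapsed_s": float(fields[3]),
        "throughput": float(fields[4]),
        "local_cpu_pct": float(fields[5]),
        "remote_cpu_pct": float(fields[6]),
        "local_service_demand": float(fields[7]),
        "remote_service_demand": float(fields[8]),
    }


def median_run(result_lines):
    """Return the parsed run whose throughput is the median of the runs.

    Requires an odd number of runs so the median is one of the actual rows.
    """
    runs = [parse_result_line(line) for line in result_lines]
    if len(runs) % 2 == 0:
        raise ValueError("need an odd number of runs")
    target = median(run["throughput"] for run in runs)
    return next(run for run in runs if run["throughput"] == target)
```

Returning the whole row, rather than just the median throughput, keeps the CPU utilization and service demand figures from the same run together.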
> MTU 1500
> Pre:
> 65536  65536  65536    10.00      2472.08   31.83    90.38    1.055   5.990
> 98304  65536  65536    10.00      2280.62   29.61    85.77    1.064   6.162
> 131072  65536  65536    10.00      2190.13   28.23    84.67    1.056   6.334
> 196608  65536  65536    10.01      2111.03   23.75    81.54    0.921   6.328
> 262144  65536  65536    10.01      1843.48   15.87    73.46    0.705   6.529
> 524288  65536  65536    10.01      1808.00   17.28    76.54    0.783   6.936
> 1048576  65536  65536    10.01      1919.01   15.72    81.54    0.671   6.962
>
>
> Post:
> 65536  65536  65536    10.00      2839.09   37.42    86.92    1.080   5.016
> 98304  65536  65536    10.00      2923.07   36.42    86.92    1.021   4.872
> 131072  65536  65536    10.01      2694.91   30.74    86.92    0.934   5.285
> 196608  65536  65536    10.00      2640.54   27.62    86.15    0.857   5.346
> 262144  65536  65536    10.00      2422.32   22.23    82.69    0.752   5.593
> 524288  65536  65536    10.01      2168.59   16.96    80.77    0.641   6.102
> 1048576  65536  65536    10.01      2138.10   16.39    85.77    0.628   6.572
>
>
> MTU 9000
> Pre:
> 65536  65536  65536    10.00      4908.35   34.61    52.31    0.578   1.746
> 98304  65536  65536    10.00      6623.84   44.71    68.46    0.553   1.693
> 131072  65536  65536    10.00      7504.46   41.83    77.69    0.457   1.696
> 196608  65536  65536    10.00      6248.29   36.91    77.69    0.484   2.037
> 262144  65536  65536    10.00      6230.94   35.51    76.92    0.467   2.023
> 524288  65536  65536    10.00      6242.76   36.03    80.38    0.473   2.110
> 1048576  65536  65536    10.00      6162.54   37.03    74.62    0.492   1.984
>
>
> Post:
> 65536  65536  65536    10.00      4957.98   36.72    49.62    0.607   1.640
> 98304  65536  65536    10.00      6799.78   44.61    68.08    0.537   1.640
> 131072  65536  65536    10.00      7998.88   46.92    73.08    0.481   1.497
> 196608  65536  65536    10.00      9017.95   48.92    82.69    0.444   1.502
> 262144  65536  65536    10.00      8874.62   49.52    81.15    0.457   1.498
> 524288  65536  65536    10.00      8787.09   48.12    81.92    0.449   1.527
> 1048576  65536  65536    10.00      8631.88   46.02    82.31    0.437   1.562
>
>
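
To make the Pre/Post comparison easier to read, here is a small sketch that computes the relative change per socket buffer size for the MTU 9000 rows above. It assumes the fifth column is netperf's throughput figure in its default units of 10^6 bits/s; the function names are illustrative.

```python
# First column (socket buffer size) and fifth column (throughput) of the
# MTU 9000 tables above, copied as posted.
SIZES = [65536, 98304, 131072, 196608, 262144, 524288, 1048576]
PRE_9000 = [4908.35, 6623.84, 7504.46, 6248.29, 6230.94, 6242.76, 6162.54]
POST_9000 = [4957.98, 6799.78, 7998.88, 9017.95, 8874.62, 8787.09, 8631.88]


def gain_percent(pre, post):
    """Relative change from pre to post, in percent."""
    return (post - pre) / pre * 100.0


def gains(sizes, pre, post):
    """Map each socket buffer size to its Pre -> Post throughput gain."""
    return {s: gain_percent(a, b) for s, a, b in zip(sizes, pre, post)}


if __name__ == "__main__":
    for size, pct in gains(SIZES, PRE_9000, POST_9000).items():
        print(f"{size:>8}  {pct:+6.1f}%")
```

For the 196608-byte row this works out to a gain of roughly 44% (6248.29 to 9017.95), while the 65536-byte row barely moves.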
> Drew
>
>
> PS: It didn't want to compile at first.  I needed to comment out
> the offending 2 lines:
>
> cc -c -O2 -frename-registers -pipe -fno-strict-aliasing  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -fformat-extensions -nostdinc -I-  -I. -I../../.. -I../../../contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000  -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone  -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow  -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror  ../../../kern/uipc_sockbuf.c
> ../../../kern/uipc_sockbuf.c: In function `sbpull_locked':
> ../../../kern/uipc_sockbuf.c:841: error: structure has no member named `sb_sndptroff'
> ../../../kern/uipc_sockbuf.c:842: error: structure has no member named `sb_sndptroff'
> *** Error code 1
>
> Stop in /usr/var/tmp/sys/amd64/compile/ROME.
>
>
Received on Mon Mar 05 2007 - 17:29:57 UTC
