On Thu, Nov 19, 2009 at 2:11 AM, Robert Watson <rwatson@freebsd.org> wrote:
>
> On Wed, 18 Nov 2009, Elliot Finley wrote:
>
>> I have several boxes running 8.0-RC3 with pretty dismal network
>> performance. I also have some 7.2 boxes with great performance. Using
>> iperf I did some tests:
>>
>> server (8.0) <- client (8.0) == 420Mbps
>> server (7.2) <- client (7.2) == 950Mbps
>> server (7.2) <- client (8.0) == 920Mbps
>> server (8.0) <- client (7.2) == 420Mbps
>>
>> So when the server is 7.2, I have good performance regardless of whether
>> the client is 8.0 or 7.2. When the server is 8.0, I have poor performance
>> regardless of whether the client is 8.0 or 7.2.
>>
>> Has anyone else noticed this? Am I missing something simple?
>
> I've generally not measured regressions along these lines, but TCP
> performance can be quite sensitive to the specific driver version and
> hardware configuration. So far, I've generally measured significant TCP
> scalability improvements in 8, and moderate raw TCP performance
> improvements over real interfaces. On the other hand, I've seen decreased
> TCP performance on the loopback due to scheduling interactions with ULE
> on some systems (but not all -- disabling checksum generate/verify has
> improved loopback on other systems).
>
> The first thing to establish is whether other similar benchmarks give the
> same result, which might help us narrow the issue down a bit. Could you
> try using netperf+netserver with the TCP_STREAM test and see if that
> differs using the otherwise identical configuration?
>
> Could you compare the ifconfig link configuration of 7.2 and 8.0 to make
> sure there's not a problem with the driver negotiating, for example, half
> duplex instead of full duplex? Also confirm that the same blend of
> LRO/TSO/checksum offloading/etc. is present.
>
> Could you do "procstat -at | grep ifname" (where ifname is your interface
> name) and send that to me?
>
> Another thing to keep an eye on is interrupt rates and pin sharing, which
> are both sensitive to driver changes and ACPI changes. It wouldn't hurt
> to compare vmstat -i rates not just on your network interface, but also
> on other devices, to make sure there's not new aliasing. With a new USB
> stack and plenty of other changes, additional driver code running when
> your NIC interrupt fires would be highly measurable.
>
> Finally, two TCP tweaks to try:
>
> (1) Try disabling in-flight bandwidth estimation by setting
>     net.inet.tcp.inflight.enable to 0. This often hurts low-latency,
>     high-bandwidth local ethernet links, and is sensitive to many other
>     issues including time-keeping. It may not be the "cause", but it's a
>     useful thing to try.
>
> (2) Try setting net.inet.tcp.read_locking to 0, which disables the
>     read-write locking strategy on global TCP locks. This setting, when
>     enabled, significantly improves TCP scalability when dealing with
>     multiple NICs or input queues, but is one of the non-trivial
>     functional changes in TCP.

Thanks for the reply. Here is some more info.

netperf results:

storage-price-3 root:~#>netperf -H 10.20.10.20
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.20.10.20 (10.20.10.20) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

4194304 4194304 4194304    10.04     460.10

The interface on both boxes is em1. Both boxes (8.0-RC3) have two 4-port
PCIe NICs in them. Trying the two TCP tweaks didn't change anything.
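(For reference, the two tweaks were applied roughly like this, with iperf
re-run after each change -- a sketch of the steps rather than a verbatim
transcript:

    sysctl net.inet.tcp.inflight.enable=0
    sysctl net.inet.tcp.read_locking=0

The numbers came out essentially the same either way.)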
While running iperf I did the procstat and vmstat:

SERVER:

storage-price-2 root:~#>ifconfig em1
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:15:17:b2:31:3d
        inet 10.20.10.20 netmask 0xffffff00 broadcast 10.20.10.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

storage-price-2 root:~#>procstat -at | grep em1
    0 100040 kernel           em1 taskq        3   16 run     -

storage-price-2 root:~#>vmstat -i
interrupt                          total       rate
irq14: ata0                        22979          0
irq15: ata1                        23157          0
irq16: aac0 uhci0*                  1552          0
irq17: uhci2+                         37          0
irq18: ehci0 uhci+                    43          0
cpu0: timer                    108455076       2000
irq257: em1                      2039287         37
cpu2: timer                    108446955       1999
cpu1: timer                    108447018       1999
cpu3: timer                    108447039       1999
cpu7: timer                    108447061       1999
cpu5: timer                    108447061       1999
cpu6: timer                    108447054       1999
cpu4: timer                    108447061       1999
Total                          869671380      16037

CLIENT:

storage-price-3 root:~#>ifconfig em1
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:15:17:b2:31:49
        inet 10.20.10.30 netmask 0xffffff00 broadcast 10.20.10.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

storage-price-3 root:~#>procstat -at | grep em1
    0 100040 kernel           em1 taskq        3   16 run     -

storage-price-3 root:~#>vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq14: ata0                        22501          0
irq15: ata1                        22395          0
irq16: aac0 uhci0*                  5091          0
irq17: uhci2+                        125          0
irq18: ehci0 uhci+                    43          0
cpu0: timer                    108421132       1999
irq257: em1                      1100465         20
cpu3: timer                    108412973       1999
cpu1: timer                    108412987       1999
cpu2: timer                    108413010       1999
cpu7: timer                    108413048       1999
cpu6: timer                    108413048       1999
cpu5: timer                    108413031       1999
cpu4: timer                    108413045       1999
Total                          868462896      16020

7.2 BOX:

dns1 root:~#>ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:13:72:5a:ff:48
        inet X.Y.Z.7 netmask 0xffffffc0 broadcast X.Y.Z.63
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active

The 8.0-RC3 boxes are being used for testing right now (production starts
the 2nd week of December). If you want access to them, that wouldn't be a
problem.

TIA
Elliot
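P.S. One difference that stands out in the ifconfig output above: the 8.0
boxes have TSO4 enabled on em1, while the 7.2 box's em0 doesn't list it. If
it helps as a data point, disabling it for a test run on the 8.0 boxes
should just be something like the following (a sketch -- I haven't re-run
the numbers with it yet):

    ifconfig em1 -tso

and "ifconfig em1 tso" to turn it back on afterwards.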