On 07.12.11 22:23, Luigi Rizzo wrote:
>
> Sorry, forgot to mention that the above is with TSO DISABLED
> (which is not the default). TSO seems to have a very bad
> interaction with HWCSUM and non-zero mitigation.

I have this on both sender and receiver

# ifconfig ix1
ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LRO>
        ether 00:25:90:35:22:f1
        inet 10.2.101.11 netmask 0xffffff00 broadcast 10.2.101.255
        media: Ethernet autoselect (autoselect <full-duplex>)
        status: active

without LRO on either end

# nuttcp -t -T 5 -w 128 -v 10.2.101.11
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.11
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.11 with mss=1448, RTT=0.051 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 1802.4049 MB in 5.06 real seconds = 365077.76 KB/sec = 2990.7170 Mbps
nuttcp-t: host-retrans = 0
nuttcp-t: 28839 I/O calls, msec/call = 0.18, calls/sec = 5704.44
nuttcp-t: 0.0user 4.5sys 0:05real 90% 108i+1459d 630maxrss 0+2pf 87706+1csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.12
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 1802.4049 MB in 5.18 real seconds = 356247.49 KB/sec = 2918.3794 Mbps
nuttcp-r: 529295 I/O calls, msec/call = 0.01, calls/sec = 102163.86
nuttcp-r: 0.1user 3.7sys 0:05real 73% 116i+1567d 618maxrss 0+15pf 230404+0csw

with LRO on receiver

# nuttcp -t -T 5 -w 128 -v 10.2.101.11
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.11
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.11 with mss=1448, RTT=0.067 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 2420.5000 MB in 5.02 real seconds = 493701.04 KB/sec = 4044.3989 Mbps
nuttcp-t: host-retrans = 2
nuttcp-t: 38728 I/O calls, msec/call = 0.13, calls/sec = 7714.08
nuttcp-t: 0.0user 4.1sys 0:05real 83% 107i+1436d 630maxrss 0+2pf 4896+0csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.12
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 2420.5000 MB in 5.15 real seconds = 481679.37 KB/sec = 3945.9174 Mbps
nuttcp-r: 242266 I/O calls, msec/call = 0.02, calls/sec = 47080.98
nuttcp-r: 0.0user 2.4sys 0:05real 49% 112i+1502d 618maxrss 0+15pf 156333+0csw

About 1/4 improvement...

With LRO on both sender and receiver

# nuttcp -t -T 5 -w 128 -v 10.2.101.11
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.11
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.11 with mss=1448, RTT=0.049 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 2585.7500 MB in 5.02 real seconds = 527402.83 KB/sec = 4320.4840 Mbps
nuttcp-t: host-retrans = 1
nuttcp-t: 41372 I/O calls, msec/call = 0.12, calls/sec = 8240.67
nuttcp-t: 0.0user 4.6sys 0:05real 93% 106i+1421d 630maxrss 0+2pf 4286+0csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.12
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 2585.7500 MB in 5.15 real seconds = 514585.31 KB/sec = 4215.4829 Mbps
nuttcp-r: 282820 I/O calls, msec/call = 0.02, calls/sec = 54964.34
nuttcp-r: 0.0user 2.7sys 0:05real 55% 114i+1540d 618maxrss 0+15pf 188794+147csw

Even better...

With LRO on sender only:

# nuttcp -t -T 5 -w 128 -v 10.2.101.11
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.11
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.11 with mss=1448, RTT=0.054 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 2077.5437 MB in 5.02 real seconds = 423740.81 KB/sec = 3471.2847 Mbps
nuttcp-t: host-retrans = 0
nuttcp-t: 33241 I/O calls, msec/call = 0.15, calls/sec = 6621.01
nuttcp-t: 0.0user 4.5sys 0:05real 92% 109i+1468d 630maxrss 0+2pf 49532+25csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.12
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 2077.5437 MB in 5.15 real seconds = 413415.33 KB/sec = 3386.6984 Mbps
nuttcp-r: 531979 I/O calls, msec/call = 0.01, calls/sec = 103378.67
nuttcp-r: 0.0user 4.5sys 0:05real 88% 110i+1474d 618maxrss 0+15pf 117367+0csw

> also remember that hw.ixgbe.max_interrupt_rate has only
> effect at module load -- i.e. you set it with the bootloader,
> or with kenv before loading the module.

I have this in /boot/loader.conf

kern.ipc.nmbclusters=512000
hw.ixgbe.max_interrupt_rate=0

on both sender and receiver.

> Please retry the measurements disabling tso (on both sides, but
> it really matters only on the sender). Also, LRO requires HWCSUM.

How do I set HWCSUM? Is this different from RXCSUM/TXCSUM?

Still I get nowhere near what you get on my hardware...

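To put the four runs side by side, here is a small Python sketch that recomputes the sender-side gain over the no-LRO run and the average payload per receiver I/O call. The figures are copied from the nuttcp output above. The names (`runs`, `summarize`) and the table layout are made up for illustration, and the code assumes nuttcp's "MB" means 2^20 bytes, which is consistent with the reported KB/sec and Mbps values but is not stated in the output itself.

```python
# Figures copied from the nuttcp runs above: sender-side Mbps, MB moved,
# and receiver-side I/O calls. Names and layout are illustrative only.
runs = {
    "no LRO":          {"tx_mbps": 2990.7170, "mb": 1802.4049, "rx_calls": 529295},
    "LRO on receiver": {"tx_mbps": 4044.3989, "mb": 2420.5000, "rx_calls": 242266},
    "LRO on both":     {"tx_mbps": 4320.4840, "mb": 2585.7500, "rx_calls": 282820},
    "LRO on sender":   {"tx_mbps": 3471.2847, "mb": 2077.5437, "rx_calls": 531979},
}

MIB = 1024 * 1024  # assumption: nuttcp's "MB" is 2**20 bytes


def summarize(runs, baseline="no LRO"):
    """Return {name: (gain over baseline in %, mean bytes per receiver call)}."""
    base = runs[baseline]["tx_mbps"]
    out = {}
    for name, r in runs.items():
        gain_pct = (r["tx_mbps"] / base - 1.0) * 100.0
        bytes_per_call = r["mb"] * MIB / r["rx_calls"]
        out[name] = (gain_pct, bytes_per_call)
    return out


if __name__ == "__main__":
    for name, (gain, per_call) in summarize(runs).items():
        print(f"{name:16s} {gain:+6.1f}%  {per_call:8.0f} bytes/rx call")
```

On these figures, enabling LRO on the receiver alone gives roughly 35% more sender-side throughput than the no-LRO run (equivalently, the gain is about a quarter of the LRO figure), and the receiver reads about three times as much data per call. That is consistent with LRO handing up fewer, larger segments, though these are single 5-second runs.
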
Here is what pciconf -vlbc has to say

ix0@pci0:3:0:0: class=0x020000 card=0xffffffff chip=0x10fc8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xfbc00000, size 2097152, enabled
    bar   [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled
    bar   [20] = type Memory, range 64, base 0xfbbfc000, size 16384, enabled
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
    cap 03[e0] = VPD
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 002590ffff363f80
    ecap 000e[150] = unknown 1
    ecap 0010[160] = unknown 1
ix1@pci0:3:0:1: class=0x020000 card=0xffffffff chip=0x10fc8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xfb800000, size 2097152, enabled
    bar   [18] = type I/O Port, range 32, base 0xd880, size 32, enabled
    bar   [20] = type Memory, range 64, base 0xfbbf8000, size 16384, enabled
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
    cap 03[e0] = VPD
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 002590ffff363f80
    ecap 000e[150] = unknown 1
    ecap 0010[160] = unknown 1

I am using ix1, as the blade enclosure has only one 10G switch and it
happens to be on the 'second' position.

Daniel

Received on Thu Dec 08 2011 - 09:06:42 UTC