Re: Strange things on GBit / 1000->100 / net.inet.tcp.inflight.*

From: Andre Oppermann <andre_at_freebsd.org>
Date: Fri, 17 Sep 2004 14:48:21 +0200
"Raphael H. Becker" wrote:
> 
> Hi *,
> 
> one of our subnets has been on a GBit switch since last week.
> The nodes on the subnet are:
> 
> 2x Dell PE350,  RELENG_4_10, fxp{0,1}, 100baseTX <full-duplex>
> 3x Dell PE2650, RELENG_5 (BETA4), bge0, 1000baseTX <full-duplex>
> 1x Dell PE2650, RELENG_4_10, bge1, 1000baseTX <full-duplex>
> 
> The switch is a "NETGEAR Model GS516T Copper Gigabit Switch" [1]
> 
> To test transfer and throughput, every system runs an ftpd and has a
> 1GByte file in /pub/1GB (a 250MByte file on the small boxes).
> 
> Every system is able to send and receive data at full speed
> (>10.5MBytes/sec on 100MBit, 70-90MBytes/sec(!) on GBit).
> 
> I use wget for testing:
> wget -O - --proxy=off ftp://10.101.240.52/pub/1GB >/dev/null
> 
> The three 5.x boxes on GBit transfer up to ~93MBytes(!) per second to
> each other (serving the file from cache, 2 parallel sessions).
> 
> The two PE350 boxes transfer data to each other at >10MBytes/sec.
> 
> FTP from a 5.3 (PE2650, GBit) to a 4.10 (PE350, 100MBit) collapses:
> throughput is only around 200kBytes to 750kBytes/sec !!
> 
> Between the same two hosts, ftp in the other direction (100->1000) runs
> at 10.5MBytes/sec.
> 
> I tested with another PE2650 running 4.10-RELEASE: ftp 1000->100 works
> fine, >10MBytes/sec, stable!!
> 
> The difference must be the OS; the hardware is more or less the same.
> 
> the 4.10 box:
> bge1: <Broadcom BCM5701 Gigabit Ethernet, ASIC rev. 0x105> mem 0xfcd00000-0xfcd0ffff irq 17 at device 8.0 on pci3
> bge1: Ethernet address: 00:06:5b:f7:f9:00
> miibus1: <MII bus> on bge1
> 
> one of the 5.3 boxes:
> bge0: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xfcf10000-0xfcf1ffff irq 28 at device 6.0 on pci3
> miibus0: <MII bus> on bge0
> bge0: Ethernet address: 00:0d:56:bb:9c:25
> 
> My guess: the 5.3 boxes send bigger TCP windows than the switch has
> buffer space for on each port, resulting in massive packet loss or
> something like that. The sender is "too fast" for the switch, or the
> switch isn't able to convert from 1000MBit to 100MBit under heavy
> load (store&forward buffer).
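
That guess is not implausible.  As a rough sanity check, assuming a LAN
RTT on the order of 1ms (an assumption; your real RTT may differ):
100MBit/s is 12.5MBytes/sec, so the bandwidth*delay product of the
100MBit path is only about 12.5kBytes.  Anything much beyond that in
flight has to sit in the switch's per-port buffer, and that is in the
same ballpark as the ~11.5k ~.max threshold you describe further down.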

Could you send me the output of the following (after you have run the
1000->100 test):

 # sysctl net.inet.tcp
 # sysctl net.inet.tcp.hostcache.list
 # netstat -s -p tcp
 # netstat -s -p ip
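
If it's not too much trouble, a short packet trace taken on the 5.3
sender during a slow 1000->100 transfer would also show whether segments
are actually being lost.  Something along these lines should do (bge0
and the peer address are only taken from your examples; adjust them to
the pair you are testing):

 # tcpdump -n -i bge0 -c 2000 host 10.101.240.52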

> I fiddled around with net.inet.tcp.inflight.max. A rebooted system
> has a value of "net.inet.tcp.inflight.max: 1073725440"; I trimmed that
> down in steps, testing and watching for effects.
> 
> A value < ~75000 for ~.max limits the 1000->1000MBit throughput.
> The 1000->100MBit transfer works for values <11583 (around 7MByte/sec);
> at >=11584 the throughput collapses to about 200kByte/sec.

Fiddling with the inflight.max values doesn't help in this case; those
don't need any tuning.  What could make a difference is disabling
inflight entirely.  However, I'd like to see the output of the commands
above first.
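
If you want to experiment in the meantime, turning it off should just be
a matter of the following (this assumes the 5.x sysctl naming; on 4.x
the knob is spelled net.inet.tcp.inflight_enable):

 # sysctl net.inet.tcp.inflight.enable=0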

> The maximum 1000->100MBit throughput comes with a ~.max value around
> 7800-8200. With that value the GBit-to-GBit transfer is between
> 18.5MBytes/sec and 20MBytes/sec.
> 
> Using the "edge" of ~.max=11583 the GBit-to-GBit transfer is at 31MBytes/sec.
> 
> I have no idea what is wrong or broken. Maybe the switch (too small
> buffers), the "inflight bandwidth delay" algorithm, or something else.
> I guess there's no physical problem with cables, connectors, or ports
> on the switch (1000MBit-to-1000MBit works great).
> 
> I'm willing to test patches or other cases as long as I don't need to
> change hardware.
> 
> Need more detailed info on anything?
> Any idea? Tuning? Patches? Pointers?

One step at a time.  I'm sure we will find the problem.

-- 
Andre