"Raphael H. Becker" wrote: > > Hi *, > > one of our subnets is on a GBit-Switch since last week. > The nodes on the subnet are: > > 2x Dell PE350, RELENG_4_10, fxp{0,1}, 100baseTX <full-duplex> > 3x Dell PE2650, RELENG_5 (BETA4), bge0, 1000baseTX <full-duplex> > 1x Dell PE2650, RELENG_4_10, bge1, 1000baseTX <full-duplex> > > The switch is a "NETGEAR Model GS516T Copper Gigabit Switch" [1] > > To test transfer und throughput every system has a running ftpd and a > 1GByte-file in /pub/1GB or a 250M file for the small boxes. > > Every system is able to send and receive data with full speed > (>10.5MBytes/sec on 100MBit, >70-90MBytes/sec(!) on GBit) > > I use wget for testing: > wget -O - --proxy=off ftp://10.101.240.52/pub/1GB >/dev/null > > The 3 5.x-Boxes on GBit transfer up to ~93MBytes(!) per second to > each other (serving the file from cache, 2 parallel sessions). > > The two PE350 boxes transfer data with >10MBytes/sec to each other. > > FTP from a 5.3 (PE2650,GBit) to 4.10 (PE350,100MBit) fails, throughput > around 200kBytes to 750kBytes/sec !! > > Same two hosts, ftp in other direction (100->1000) is running > 10.5MBytes/sec. > > I tested with another PE2650, running 4.10-RELEASE, ftp 1000->100 works > fine, >10MBytes/sec, stable!! > > The difference must be the OS, the hardware is more or less the same > > the 4.10-BOX: > bge1: <Broadcom BCM5701 Gigabit Ethernet, ASIC rev. 0x105> mem 0xfcd00000-0xfcd0ffff irq 17 at device 8.0 on pci3 > bge1: Ethernet address: 00:06:5b:f7:f9:00 > miibus1: <MII bus> on bge1 > > one of the the 5.3-Boxes: > bge0: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xfcf10000-0xfcf1ffff irq 28 at device 6.0 on pci3 > miibus0: <MII bus> on bge0 > bge0: Ethernet address: 00:0d:56:bb:9c:25 > > My guess: The 5.3-Boxes send bigger TCP-Windows than our switch has > buffer for each port resulting in massive packetloss or something like > that. The sender is "too fast" for the switch or the switch isn't able > to convert from 1000MBit to 100MBit under heavy load > (store&forward-buffer) Could you send me the output of (after you have run the 1000->100 test): # sysctl net.inet.tcp # sysctl net.inet.tcp.hostcache.list # netstat -s -p tcp # netstat -s -p ip > I fiddled around with net.inet.tcp.inflight.max. A rebooted system > has a value of "net.inet.tcp.inflight.max: 1073725440", i trimmed that > down in steps, testing and searching for effects. > > A value < ~75000 for ~.max limits the throughput 1000->1000 MBit > The transfer 1000->100MBit works for values <11583 (around 7MByte/sec), > >=11584 the throughput cuts, about 200kByte/sec. Fiddling with the inflight.max values doesn't help in this case. Those don't need any tuning. What could make a difference is to disable inflight entirely. However I'd like to get the output of the stuff above first. > A max throughput 1000->100MBit is for a value ~.max around 7800-8200. > With this value the GBit-to-GBit transfer is around 18.5MBytes/sec and > 20MBytes/sec. > > Using the "edge" of ~.max=11583 the GBit-to-GBit transfer is at 31MBytes/sec. > > I have no idea what is wrong or broken. Maybe the switch (too small buffer) > or the "inflight bandwith delay"-algorithm or something else. I guess ther's > no physical problem with cables or connectors or ports on the switch > (1000MBit works great for 1000MBit only). > > I'm willing to test patches or other cases as long as I don't need to > change hardware. > > Need more detailed info on a subject? > Any idea? Tuning? Patches? Pointers? One step after the other. 
I'm sure we will find the problem.

-- 
Andre