Re: slow USB 3.0 on -current

From: John-Mark Gurney <jmg_at_funkthat.com>
Date: Mon, 13 Jul 2020 12:03:49 -0700
Mark Millard wrote this message on Mon, Jul 13, 2020 at 00:44 -0700:
> On 2020-Jul-12, at 21:51, John-Mark Gurney <jmg at funkthat.com> wrote:
> 
> > Mark Millard wrote this message on Sun, Jul 12, 2020 at 18:26 -0700:
> >> John-Mark Gurney jmg at funkthat.com wrote on
> >> Sat Jul 11 22:44:36 UTC 2020 :
> >> 
> >>> I'm having issues getting good ethernet performance from a USB ethernet
> >>> adapter (ure) under FreeBSD on an HP EliteDesk 705 G2 Mini[1].  It's an
> >>> AMD PRO A10-8700B based system using the AMD A78 FCH chipset.
> >>> 
> >>> Under FreeBSD -current (r362596), 12.1-R and 11.4-R, the RealTek USB
> >>> adapter only gets around 10MB/sec performance.  During the transfer,
> >>> the CPU usage is only around 3-5%, so it's definitely not CPU bound.
> >>> 
> >>> I have tested Windows 10 and NetBSD 9.0 performance, and both provide
> >>> 100MB/sec+ w/o troubles.
> >>> 
> >>> I have attached dmesg from both FreeBSD -current and NetBSD 9.0.
> >>> 
> >>> Any hints on how to fix this?
> >>> 
> >>> This may be related, but I'm also having issues w/ booting when I have
> >>> both a SD USB 2.0 card reader AND the ure plugged into USB 3.0 ports.
> >>> 
> >>> If I move the SD card reader to USB 2.0, the umass device will attach
> >>> and work.  I have also attached a clip of the dmesg from that
> >>> happening.
> >>> 
> >>> Has anyone else seen this issue?  Ideas or thoughts on how to resolve
> >>> the performance issues?
> >> 
> >> It might prove useful to use iperf3 with
> >> 
> >> # iperf3 -s
> >> 
> >> on one machine and doing
> >> 
> >> # iperf3 -c ADDR
> >> . . .
> >> # iperf3 -R -c ADDR
> >> . . .
> >> 
> >> on the other. (That last swaps the
> >> sender/receiver status.)
> >> 
> >> All 3 commands will have output. The
> >> -s one will produce output for each of
> >> the -c ones.
> >> 
> >> The outputs for the sender(s) will include Cwnd
> >> (congestion window size) information that may
> >> be relevant. It will report bit rate and
> >> retry count sampling (and overall figures).
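
For anyone who wants to collect the per-interval Cwnd and Retr figures without copying them by hand, iperf3 can emit its report as JSON (`--json`). The following is only a minimal sketch: the field names (`intervals`, `streams`, `snd_cwnd`, `retransmits`, `end`) are what I recall from TCP sender reports and should be checked against real output, and the sample data is made up purely to exercise the function.

```python
import json


def cwnd_and_retransmits(report):
    """Return (interval_end_s, snd_cwnd_bytes, retransmits) for each interval.

    `report` is the dict parsed from iperf3 JSON output. The field names
    used here are assumptions based on typical TCP sender reports.
    """
    rows = []
    for interval in report.get("intervals", []):
        for stream in interval.get("streams", []):
            rows.append(
                (stream.get("end"), stream.get("snd_cwnd"), stream.get("retransmits"))
            )
    return rows


# Synthetic two-interval report (hypothetical values, not measured data).
sample = json.loads("""
{
  "intervals": [
    {"streams": [{"end": 1.0, "snd_cwnd": 35225, "retransmits": 3}]},
    {"streams": [{"end": 2.0, "snd_cwnd": 30513, "retransmits": 0}]}
  ]
}
""")

for end, cwnd, retr in cwnd_and_retransmits(sample):
    print(f"t={end:5.2f}s  cwnd={cwnd} bytes  retr={retr}")
```

With a real run, the report would come from something like `iperf3 -c ADDR --json > run.json` followed by `json.load(open("run.json"))` in place of the synthetic sample.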

[...]

> The "iperf3 -s" should have had output with the Cwnd
> figures for the "Reverse mode" case above (and the
> distribution for the 1381 Retr total). They might
> not match the earlier figures that you did report
> for the non-Reverse mode.

If you can tell how the Cwnd figures would help you figure out how to
make the USB3 bus run faster, I'll spend the time to retest and give
you the numbers, but I don't see how they can...

[...]

> > As you can see, both approximately match what I measured with other
> > methods, so it's definitely not the way I measured performance.
> > 
> >> My observation would be that neither type
> >> of USB3 Ethernet adapter that I've tried
> > 
> > What is the chipset that you tried?  One of the earlier ones that I
> > tried was an axe iirc, and was limited to around 500Mbps or so...
> 
> Hmm. I only seem to be able to find one type. It's been a
> while since I've used the other and I do not know where
> it is at. For what I found:
> 
> ugen0.2: <ASIX Elec. AX88179> at usbus0
> axge0 on uhub0
> axge0: <NetworkInterface> on usbus0
> miibus1: <MII bus> on axge0
> rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 3 on miibus1
> rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> 
> (I have access to more than one instance of the above.)

Yeah, these are the ones that are known to not be able to get close
to gige speeds, unlike the RealTek one that I am using now...

I forgot that axge is the gigabit version of axe...

> The iperf3 output that I reported was for using
> one of the above. Note that when the USB3 EtherNet
> was receiving, Cwnd was reported as 29.8 KBytes
> or smaller for the example run, much like your
> output reporting 34.4 KBytes or less for the
> example run.

They grew to match the speeds that the link could do..

> This may suggest some common constraint across various
> USB3 EtherNet devices. The Cwnd figures are probably
> too small to get near 900 Mbit/s+.

As you can see from the NetBSD results, it was able to grow the
Cwnd large enough to obtain good performance...

The stats that I provided were from the non-USB3 machine, and were for
tx'ing, to match the issue I raised in the original post... I could
provide the Cwnd, but I don't see how that will debug a USB3 speed
issue...

The stats show that the Cwnd can grow on other OS's (NetBSD), and
on wired (bge) fine.

> But, even with a (smaller but) similar Cwnd figure
> my example was getting faster transfers than your
> example. I got smaller Retr figures as well. It
> leaves me wondering if there are packets being
> rejected in your context that are not in my
> context.

Ping times to the machine via USB3 are higher than via native gige, but
that isn't too surprising given the extra latency introduced by USB.
They are:
round-trip min/avg/max/stddev = 0.743/0.826/0.963/0.074 ms

Whereas to a slower machine (PINE A64-LTS, arm64) with a
couple extra switches in between:
round-trip min/avg/max/stddev = 0.199/0.230/0.254/0.019 ms
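
As a rough cross-check on the Cwnd discussion, the window-limited ceiling for TCP is about Cwnd / RTT. A minimal sketch of that arithmetic follows. It assumes the idle ping average of 0.826 ms stands in for the RTT during a transfer (it may well be higher under load) and that iperf3's "KBytes" means 1024 bytes; the function names are mine, purely for illustration.

```python
def window_limited_mbps(cwnd_bytes, rtt_seconds):
    """Upper bound on throughput (Mbit/s) when one window is sent per RTT."""
    return cwnd_bytes * 8 / rtt_seconds / 1e6


def window_needed_bytes(rate_bps, rtt_seconds):
    """Bandwidth-delay product: bytes in flight needed to sustain rate_bps."""
    return rate_bps * rtt_seconds / 8


RTT_S = 0.826e-3          # average ping RTT over the USB3 adapter (assumed RTT)
CWND_BYTES = 34.4 * 1024  # largest Cwnd quoted in the thread, assuming KiB

ceiling = window_limited_mbps(CWND_BYTES, RTT_S)
needed = window_needed_bytes(1e9, RTT_S)

print(f"ceiling with 34.4 KBytes Cwnd: {ceiling:.0f} Mbit/s")
print(f"window needed for 1 Gbit/s:    {needed / 1024:.1f} KBytes")
```

Under these assumptions the ceiling comes out at roughly 341 Mbit/s, and filling a gigabit link would need a window of about 101 KBytes. That is well above the ~91 Mbit/s actually seen, so with this RTT the small Cwnd by itself would not account for the limit.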

> If what I reported would still be too slow, there
> may be two (or more) points to be addressed to
> get things going fast enough for you. You
> might be able to avoid one of the points by
> using a type of device that already does somewhat
> better. Maybe ask for the fastest examples
> folks have observed?
> 
> >> (different chipsets) get anywhere near
> >> 100 MByte/s when ifconfig reports
> >> 1000baseT <full-duplex>. The Cwnd figures
> >> are smaller than for the built-in Ethernets
> >> that manage much faster overall transfer
> >> rates.
> > 
> > [...]
> > 
> >> I'll note that between machines with built-in EtherNet
> >> that can sustain fast transfers overall, the Cwnd figures
> >> tend to vary but can reach 1 MBytes+. The Retr counts
> >> tend to still exist.
> >> 
> >> By contrast, when the USB3 EtherNet is receiving above,
> >> the maximum Cwnd reported above for the sender at the
> >> time was: 29.8 KBytes.
> >> 
> >> I have not tried NetBSD, Windows 10, or Linux comparisons.
> > 
> > As you can see above, NetBSD easily achieves around 8-10x the
> > speed using the exact same USB3 device as FreeBSD does, so the
> > hardware CAN push the speeds, just FreeBSD cannot.
> 
> The small Cwnd figures like 34.4 KBytes suggest that a
> small receiver window (Rwnd) might be specified in the
> TCP header that the destination provided for that
> transfer direction. Cwnd can increase up to the
> Rwnd unless duplicate ACKs or timeouts occur, as I
> understand. (I'm no expert.)
> 
> This is part of the reason I thought that posting the
> output (with Cwnd) for both transfer directions could
> be of use: it gives a hint about what is controlling
> the Cwnd if the two directions have widely different
> figures vs. if both directions get similarly small
> figures. Comparisons with the other OS's figures in
> both directions could possibly also be suggestive,
> or so I hoped.
> 
> Maybe the comparison with the figures that I reported
> gives someone some hints about what might be going on
> in the two contexts.
> 
> > Hence, my original post, what can I do to possibly get FreeBSD's
> > performance up to what the hardware can achieve?
> > 
> 
> Hopefully my notes prove of some use --but I'm not likely
> to directly solve the problem. It would be handy for
> me if USB3 EtherNet performed significantly better than
> I have observed.

Just for giggles, I used iperf3 to test UDP performance to eliminate
TCP, and yep, the physical interface is limited to about 91.1Mbps...

When the USB3 interface transmits, I get:
gold,pts,/home/jmg,506$iperf3 -R -u -b 1000M -c 192.168.0.80
Connecting to host 192.168.0.80, port 5201
Reverse mode, remote host 192.168.0.80 is sending
[  5] local 192.168.0.2 port 42932 connected to 192.168.0.80 port 5201
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  11.0 MBytes  92.1 Mbits/sec  0.271 ms  773186/781071 (99%)
[  5]   1.00-2.00   sec  10.9 MBytes  91.1 Mbits/sec  0.303 ms  235306/243103 (97%)
[  5]   2.00-3.00   sec  10.9 MBytes  91.1 Mbits/sec  0.345 ms  236328/244125 (97%)
[  5]   3.00-4.00   sec  10.9 MBytes  91.1 Mbits/sec  0.354 ms  235799/243598 (97%)
[  5]   4.00-5.00   sec  10.9 MBytes  91.1 Mbits/sec  0.404 ms  233614/241411 (97%)
[  5]   5.00-6.00   sec  10.9 MBytes  91.1 Mbits/sec  0.464 ms  235559/243356 (97%)
[  5]   6.00-7.00   sec  10.9 MBytes  91.1 Mbits/sec  0.515 ms  236272/244070 (97%)
[  5]   7.00-8.00   sec  10.9 MBytes  91.1 Mbits/sec  0.139 ms  234354/242152 (97%)
[  5]   8.00-9.00   sec  10.9 MBytes  91.1 Mbits/sec  0.150 ms  236887/244684 (97%)
[  5]   9.00-10.00  sec  10.9 MBytes  91.1 Mbits/sec  0.149 ms  236459/244257 (97%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-12.22  sec   133 MBytes  91.1 Mbits/sec  0.000 ms  0/2974757 (0%)  sender
[  5]   0.00-10.00  sec   109 MBytes  91.2 Mbits/sec  0.149 ms  2893764/2971827 (97%)  receiver

iperf Done.

and when the USB3 ethernet receives:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.0.2, port 20603
[  5] local 192.168.0.80 port 5201 connected to 192.168.0.2 port 25350
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  9.42 MBytes  79.0 Mbits/sec  0.186 ms  111653/118418 (94%)  
[  5]   1.00-2.00   sec  10.9 MBytes  91.1 Mbits/sec  0.165 ms  229690/237489 (97%)  
[  5]   2.00-3.00   sec  10.9 MBytes  91.1 Mbits/sec  0.459 ms  236066/243864 (97%)  
[  5]   3.00-4.00   sec  10.9 MBytes  91.1 Mbits/sec  0.219 ms  228274/236072 (97%)  
[  5]   4.00-5.00   sec  10.9 MBytes  91.1 Mbits/sec  0.168 ms  224079/231877 (97%)  
[  5]   5.00-6.00   sec  10.9 MBytes  91.1 Mbits/sec  0.157 ms  233127/240924 (97%)  
[  5]   6.00-7.00   sec  10.9 MBytes  91.1 Mbits/sec  0.157 ms  232302/240100 (97%)  
[  5]   7.00-8.00   sec  10.9 MBytes  91.1 Mbits/sec  0.376 ms  232491/240289 (97%)  
[  5]   8.00-9.00   sec  10.9 MBytes  91.1 Mbits/sec  0.197 ms  230407/238205 (97%)  
[  5]   9.00-10.00  sec  10.9 MBytes  91.1 Mbits/sec  0.158 ms  243574/251372 (97%)  
[  5]  10.00-10.38  sec  1.42 MBytes  31.6 Mbits/sec  0.136 ms  30894/31914 (97%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.38  sec   109 MBytes  87.8 Mbits/sec  0.136 ms  2232557/2310524 (97%)  receiver

The transmitter reports:
[  5]   0.00-10.00  sec  1.12 GBytes   959 Mbits/sec  0.000 ms  0/2311155 (0%)  sender

so, it is able to transmit at close to gigabit speed.
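
To put exact numbers on the rounded "(97%)" figures, the Lost/Total column of an iperf3 UDP summary line can be pulled apart with a few lines of Python. This is only a sketch: the regular expression is my own guess at a pattern that fits the lines shown above, not something taken from iperf3 itself, and `udp_loss` is a hypothetical helper name.

```python
import re

# Matches the "lost/total (pct%)" column, e.g. "2232557/2310524 (97%)".
LOSS_RE = re.compile(r"(\d+)/(\d+)\s+\((\d+(?:\.\d+)?)%\)")


def udp_loss(line):
    """Return lost, total, delivered and exact loss percentage for one line."""
    match = LOSS_RE.search(line)
    if match is None:
        return None
    lost, total = int(match.group(1)), int(match.group(2))
    loss_pct = 100.0 * lost / total if total else 0.0
    return {
        "lost": lost,
        "total": total,
        "delivered": total - lost,
        "loss_pct": loss_pct,
    }


# Receiver summary line copied from the run where the USB3 ethernet receives.
receiver_line = (
    "[  5]   0.00-10.38  sec   109 MBytes  87.8 Mbits/sec  0.136 ms  "
    "2232557/2310524 (97%)  receiver"
)

stats = udp_loss(receiver_line)
print(f"delivered {stats['delivered']} of {stats['total']} datagrams "
      f"({stats['loss_pct']:.2f}% lost)")
```

For the receiver line above this gives 77967 datagrams delivered out of 2310524.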

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
Received on Mon Jul 13 2020 - 17:03:54 UTC
