Ok, some new testing... nc w/ 8MB sockbuf and {send,recv}buf_max can get over 50MB/sec using nc and a 100ms RTT.. Guess it's a good day... The hickups that I see look to be a buffer overrun on the tap or bridge interface, as I did see some errors, and on the sender side, the TCP protocol stats showed some retransmits.. A quick recap of the info below at 50ms RTT: server client HEAD w/ HPN stock 7.1p1 HEAD w/ HPN 1MB/sec 6.5MB/sec stock 7.1p1 2MB/sec 7MB/sec TL;DR: HPN patches hurt no matter how you look at things... client HEAD w/ HPN, the tcpwinsz is always 66608 the server HEAD w/ HPN, the rcvd adjust is always 49152 the server stock 7.1p1, the rcvd adjust is always 131072 client HPN: OpenSSH_6.6.1p1, OpenSSL 1.0.1m-freebsd 19 Mar 2015 client stock: OpenSSH_7.1p1, OpenSSL 1.0.1m-freebsd 19 Mar 2015 server HPN: OpenSSH_6.6.1p1, OpenSSL 1.0.1o-freebsd 12 Jun 2015 server stock: OpenSSH_7.1p1, OpenSSL 1.0.1o-freebsd 12 Jun 2015 All tests below were done w/ a 25ms pipe delay, resulting in a 50ms RTT... Baseline nc results in ~55MB/sec, but this is cpu limited by bhyve... The dual stock 7.1p is only using about a quarter of the cpu, so not at all cpu limited... Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime tcp4 0 2851352 192.168.0.21.44972 192.168.0.50.3845 0 1052 0 1043 66608 2851352 1 2048 0 4262912 532864 8388608 0.26 0.00 7199.99 0.00 0.00 0.00 I forgot about -vvv... So, w/ HEAD ssh (w/ HPN) on both sides, the transmitter, I see lots of: debug2: channel 0: rcvd adjust 49152 debug2: tcpwinsz: 66608 for connection: 3 debug2: tcpwinsz: 66608 for connection: 3 debug2: tcpwinsz: 66608 for connection: 3 debug2: tcpwinsz: 66608 for connection: 3 debug2: tcpwinsz: 66608 for connection: 3 And the send-q is 49312: Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime tcp4 0 49312 192.168.0.21.51135 192.168.0.50.22 0 15 0 13 66608 98840 1 2048 0 55040 532864 790720 0.21 0.00 7199.95 0.00 0.00 0.04 And adjusting the delay to 0 shows ssh getting 50MB/sec, so it should be possible... adjusting the delay shows a linear change in bandwidth (200ms -> 500KB/sec, 100ms -> 1MB/sec, 50ms -> 2MB/sec), so looks like ssh is artifically limitting the window... If I run w/ stock, compiled from source, OpenSSH 7.1p1 on the server side (receiver side) and HEAD ssh on client side, I get ~6.5MB/sec.. w/ -vvv I see: debug2: channel 0: rcvd adjust 131072 debug2: tcpwinsz: 66608 for connection: 3 debug2: tcpwinsz: 66608 for connection: 3 debug2: tcpwinsz: 66608 for connection: 3 [this line repeated tons of times] In this case, I do see the send-q stay at around 229912: Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime tcp4 0 229912 192.168.0.21.55975 192.168.0.50.22 0 77 0 77 66608 229912 1 2048 0 316672 532864 1839296 0.24 0.00 7199.99 0.00 0.00 0.00 With compiled from source, OpenSSH 7.1.p1 on both sides, I get around 7MB/sec and -vvv gives: debug2: channel 0: rcvd adjust 131072 And the send-q stays around 229912: Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime tcp4 0 229912 192.168.0.21.10699 192.168.0.50.22 0 80 0 80 66608 229912 1 2048 0 339968 532864 1839296 0.25 0.00 7199.99 0.00 0.00 0.00 With compiled from source on the client, and HEAD OpenSSH w/ HPN on the server side, 2MB/sec, -vvv shows: debug2: channel 0: rcvd adjust 49152 and the send-q stays around 49296: Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime tcp4 0 49296 192.168.0.21.43918 192.168.0.50.22 0 15 0 13 66608 98840 1 2048 0 55040 532864 790720 0.20 0.00 7199.95 0.00 0.00 0.04 /etc/ssh/ssh{,d}_config are all stock (except PermitRootLogin is yes on the VM), and the client side's .ssh/config only contains a single commented out line... P.S. I've had two panics on the dummynet box (running nc) the last being: Fatal trap 9: general protection fault while in kernel mode cpuid = 0; apic id = 00 instruction pointer = 0x20:0xffffffff80963615 stack pointer = 0x28:0xfffffe001de65580 frame pointer = 0x28:0xfffffe001de655d0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 0 (dummynet) [ thread pid 0 tid 100084 ] Stopped at __mtx_lock_flags+0x55: movq (%r13),%rax db> bt Tracing pid 0 tid 100084 td 0xfffff8000505c000 __mtx_lock_flags() at __mtx_lock_flags+0x55/frame 0xfffffe001de655d0 doselwakeup() at doselwakeup+0xb5/frame 0xfffffe001de65610 sowakeup() at sowakeup+0x3b/frame 0xfffffe001de65640 tcp_do_segment() at tcp_do_segment+0x26b9/frame 0xfffffe001de65730 tcp_input() at tcp_input+0xfb7/frame 0xfffffe001de65880 ip_input() at ip_input+0x175/frame 0xfffffe001de658e0 netisr_dispatch_src() at netisr_dispatch_src+0x86/frame 0xfffffe001de65950 dummynet_send() at dummynet_send+0x153/frame 0xfffffe001de65980 dummynet_task() at dummynet_task+0x2e3/frame 0xfffffe001de659e0 taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xfffffe001de65a40 taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe001de65a70 fork_exit() at fork_exit+0x84/frame 0xfffffe001de65ab0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001de65ab0 --- trap 0, rip = 0, rsp = 0xfffffe001de65b70, rbp = 0 --- db> I wasn't able to get a dump because my swap was misconfigured... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not."Received on Sat Nov 14 2015 - 18:34:28 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:00 UTC