Re: weird network problems on current since 10/28/2012

From: Manfred Antar <null_at_pozo.com>
Date: Sun, 04 Nov 2012 17:39:10 -0800
At 01:57 PM 11/4/2012, you wrote:
>On 04.11.2012 21:15, Andreas Tobler wrote:
>>On 04.11.12 14:57, Andre Oppermann wrote:
>>>On 04.11.2012 13:11, Kim Culhan wrote:
>>>>On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
>>>>>On 2012-11-04 02:13, Manfred Antar wrote:
>>>>>>At 03:29 PM 11/3/2012, Adrian Chadd wrote:
>>>>>After the commit, there was a small discussion thread on svn-src-head@
>>>>>about the possible problems with the approach.  Maybe you are
>>>>>experiencing those?
>>>>>
>>>>>As the commit message says, you should be able to turn the feature off
>>>>>using:
>>>>>
>>>>>     sysctl net.inet.tcp.experimental.initcwnd10=0
>>>>>
>>>>>Can you please try that, and see if the problems go away?
>>>>
>>>>FWIW this did not make the problem go away on 2 machines.
>>>
>>>Yes, this very much looks like the same problem as in PR/173309.
>>>
>>>Please try the attached patch.  It fixes the connection hang issue.
>>>There may be a second issue, which I am currently debugging based on the
>>>feedback from Fabian Keil.
>>
>>I jump into this thread since I have a similar network issue.
>>
>>My scenario:
>>
>>'make installkernel DESTDIR=/netboot/test' to a nfs mounted drive.
>>The nfs drive on the server is an ufs fs. No zfs.
>>
>>Up to r242261 I can install the kernel (or world) in a fluent way to the
>>nfs destination.
>>
>>From r242262 it doesn't work smoothly. I have stalls, sometimes my
>>patience is not enough and I kill the process.
>>
>>I tried 242266 with the above mentioned patch. No real success.
>>
>>How can I help/test?
>
>Please try the attached patch instead of the above mentioned one.
>
>-- 
>Andre
>
>Index: netinet/tcp_output.c
>===================================================================
>--- netinet/tcp_output.c        (revision 242577)
>+++ netinet/tcp_output.c        (working copy)
>@@ -228,7 +228,7 @@
>        tso = 0;
>        mtu = 0;
>        off = tp->snd_nxt - tp->snd_una;
>-       sendwin = min(tp->snd_wnd, tp->snd_cwnd);
>+       sendwin = ulmax(ulmin(tp->snd_wnd - off, tp->snd_cwnd), 0);
>
>        flags = tcp_outflags[tp->t_state];
>        /*
>@@ -249,7 +249,7 @@
>            (p = tcp_sack_output(tp, &sack_bytes_rxmt))) {
>                long cwin;
>
>-               cwin = min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt;
>+               cwin = ulmin(tp->snd_wnd - off, tp->snd_cwnd) - sack_bytes_rxmt;
>                if (cwin < 0)
>                        cwin = 0;
>                /* Do not retransmit SACK segments beyond snd_recover */
>@@ -355,7 +355,7 @@
>                         * sending new data, having retransmitted all the
>                         * data possible in the scoreboard.
>                         */
>-                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
>+                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd - off)
>                               - off);
>                        /*
>                         * Don't remove this (len > 0) check !
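
The window arithmetic in the first hunk can be tried out in userspace. The following is only a rough sketch under stated assumptions, not the kernel code: it assumes snd_wnd, snd_cwnd and off all behave as unsigned long in the expression, it uses local stand-ins for the kernel's ulmin()/ulmax() helpers, and sendwin_clamped() is a hypothetical variant added purely for comparison. Under those assumptions, when off is larger than snd_wnd the subtraction wraps around, so the outer ulmax(..., 0) cannot clamp the result.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's ulmin()/ulmax() (assumed semantics). */
static unsigned long ulmin(unsigned long a, unsigned long b) { return a < b ? a : b; }
static unsigned long ulmax(unsigned long a, unsigned long b) { return a > b ? a : b; }

/* Expression on the "-" line of the first hunk, modeled with unsigned long. */
static long sendwin_old(unsigned long snd_wnd, unsigned long snd_cwnd)
{
	return (long)ulmin(snd_wnd, snd_cwnd);
}

/* Expression on the "+" line of the first hunk. */
static long sendwin_patched(unsigned long snd_wnd, unsigned long snd_cwnd,
    unsigned long off)
{
	/* If off > snd_wnd, snd_wnd - off wraps to a huge unsigned value. */
	return (long)ulmax(ulmin(snd_wnd - off, snd_cwnd), 0);
}

/* Hypothetical comparison only: clamp before subtracting. */
static long sendwin_clamped(unsigned long snd_wnd, unsigned long snd_cwnd,
    unsigned long off)
{
	unsigned long avail = snd_wnd > off ? snd_wnd - off : 0;

	return (long)ulmin(avail, snd_cwnd);
}

/* A few sample values; the numbers are made up for illustration. */
static void window_examples(void)
{
	/* Plenty of peer window: both forms are limited by snd_cwnd. */
	assert(sendwin_old(65535UL, 14480UL) == 14480L);
	assert(sendwin_patched(65535UL, 14480UL, 1000UL) == 14480L);

	/* Peer window partly used up by data in flight. */
	assert(sendwin_patched(5000UL, 14480UL, 3000UL) == 2000L);

	/* off exceeds snd_wnd: the unsigned subtraction wraps. */
	assert(sendwin_patched(1000UL, 14480UL, 3000UL) == 14480L);
	assert(sendwin_clamped(1000UL, 14480UL, 3000UL) == 0L);
}
```

Calling window_examples() from a small main() runs the checks. Whether the wrapped case can actually occur in tcp_output() depends on the real field types and on the invariants between snd_wnd and off, which this sketch does not model.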

This doesn't seem to make a difference.
I have an ssh window that's been trying to connect for the past 5 minutes.
This is on a local network 192.168.0.4  >===========SSH==============> 192.168.0.5 
POP from the same machine is also endlessly trying to connect.
Hopefully this mail will get through; otherwise I will need to reboot to the old kernel.
Manfred


========================
||      null_at_pozo.com           ||
||      Ph. (415) 681-6235      ||
======================== 


Received on Mon Nov 05 2012 - 00:49:05 UTC
