Re: weird network problems on current since 10/28/2012

From: Andre Oppermann <andre_at_freebsd.org>
Date: Mon, 05 Nov 2012 10:19:22 +0100
On 05.11.2012 02:39, Manfred Antar wrote:
> At 01:57 PM 11/4/2012, you wrote:
>> On 04.11.2012 21:15, Andreas Tobler wrote:
>>> On 04.11.12 14:57, Andre Oppermann wrote:
>>>> On 04.11.2012 13:11, Kim Culhan wrote:
>>>>> On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
>>>>>> On 2012-11-04 02:13, Manfred Antar wrote:
>>>>>>> At 03:29 PM 11/3/2012, Adrian Chadd wrote:
>>>>>> After the commit, there was a small discussion thread on svn-src-head@
>>>>>> about the possible problems with the approach.  Maybe you are
>>>>>> experiencing those?
>>>>>>
>>>>>> As the commit message says, you should be able to turn the feature off
>>>>>> using:
>>>>>>
>>>>>>      sysctl net.inet.tcp.experimental.initcwnd10=0
>>>>>>
>>>>>> Can you please try that, and see if the problems go away?
>>>>>
>>>>> FWIW this did not make the problem go away on 2 machines.
>>>>
>>>> Yes, this very much looks like the same problem as in PR/173309.
>>>>
>>>> Please try the attached patch.  It fixes the connection hang issue.
>>>> There may be a second issue, which I am currently debugging based on
>>>> the feedback from Fabian Keil.
>>>
>>> I jump into this thread since I have a similar network issue.
>>>
>>> My scenario:
>>>
>>> 'make installkernel DESTDIR=/netboot/test' to an NFS-mounted drive.
>>> The NFS drive on the server is a UFS file system. No ZFS.
>>>
>>> Up to r242261 I can install the kernel (or world) in a fluent way to the
>>> nfs destination.
>>>
>>> From r242262 on, it doesn't work smoothly. I get stalls, and sometimes my
>>> patience runs out and I kill the process.
>>>
>>> I tried 242266 with the above mentioned patch. No real success.
>>>
>>> How can I help/test?
>>
>> Please try the attached patch instead of the above-mentioned one.
>>
>> --
>> Andre
>>
>> Index: netinet/tcp_output.c
>> ===================================================================
>> --- netinet/tcp_output.c        (revision 242577)
>> +++ netinet/tcp_output.c        (working copy)
>> @@ -228,7 +228,7 @@
>>         tso = 0;
>>         mtu = 0;
>>         off = tp->snd_nxt - tp->snd_una;
>> -       sendwin = min(tp->snd_wnd, tp->snd_cwnd);
>> +       sendwin = ulmax(ulmin(tp->snd_wnd - off, tp->snd_cwnd), 0);
>>
>>         flags = tcp_outflags[tp->t_state];
>>         /*
>> @@ -249,7 +249,7 @@
>>             (p = tcp_sack_output(tp, &sack_bytes_rxmt))) {
>>                 long cwin;
>>
>> -               cwin = min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt;
>> +               cwin = ulmin(tp->snd_wnd - off, tp->snd_cwnd) - sack_bytes_rxmt;
>>                 if (cwin < 0)
>>                         cwin = 0;
>>                 /* Do not retransmit SACK segments beyond snd_recover */
>> @@ -355,7 +355,7 @@
>>                          * sending new data, having retransmitted all the
>>                          * data possible in the scoreboard.
>>                          */
>> -                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
>> +                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd - off)
>>                                - off);
>>                         /*
>>                          * Don't remove this (len > 0) check !
>
> This doesn't seem to make a difference.
> I have an ssh window that's been trying to connect for the past 5 minutes.
> This is on a local network 192.168.0.4  >===========SSH==============> 192.168.0.5
> Also, POP from the same machines is endlessly trying to connect.
> Hopefully this mail will get through; otherwise I will need to reboot to the old kernel.

I've backed out the change with r242601 as it still exhibits too
many problems.  I'll fix these problems in the next few days, but in
the meantime HEAD should be in a working state.

I'm sorry for the trouble.

-- 
Andre
Received on Mon Nov 05 2012 - 08:19:26 UTC
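
For readers following the first hunk of the patch quoted above, here is a
minimal standalone sketch of the two send-window expressions. It is not
kernel code: ulmin() and ulmax() are local stand-ins for the kernel
helpers, every operand is assumed to be unsigned long (the thread does not
show the real field types), and sendwin_guarded() is a hypothetical
variant added only for comparison, not the fix that was later committed.
It makes no claim about what caused the hangs reported in this thread; it
only shows how the arithmetic behaves under those assumptions.

```c
#include <assert.h>

/* Local stand-ins for the kernel's ulmin()/ulmax(); assumed semantics. */
static unsigned long ulmin(unsigned long a, unsigned long b) { return a < b ? a : b; }
static unsigned long ulmax(unsigned long a, unsigned long b) { return a > b ? a : b; }

/* Pre-patch shape: the smaller of the peer window and congestion window. */
unsigned long sendwin_before(unsigned long snd_wnd, unsigned long snd_cwnd)
{
    return ulmin(snd_wnd, snd_cwnd);
}

/* Patched shape, mirroring the '+' line of the first hunk. */
unsigned long sendwin_patched(unsigned long snd_wnd, unsigned long snd_cwnd,
                              unsigned long off)
{
    /* With unsigned operands, snd_wnd - off wraps around when off > snd_wnd,
     * and ulmax(x, 0) always returns x, so it cannot clamp a wrapped value. */
    return ulmax(ulmin(snd_wnd - off, snd_cwnd), 0);
}

/* Hypothetical variant: check before subtracting so the result cannot wrap. */
unsigned long sendwin_guarded(unsigned long snd_wnd, unsigned long snd_cwnd,
                              unsigned long off)
{
    unsigned long avail = snd_wnd > off ? snd_wnd - off : 0;
    return ulmin(avail, snd_cwnd);
}

/* Illustrative values only: a 14600-byte congestion window throughout. */
void window_examples(void)
{
    /* off below the peer window: both forms agree on 4000 - 1000. */
    assert(sendwin_patched(4000UL, 14600UL, 1000UL) == 3000UL);
    assert(sendwin_guarded(4000UL, 14600UL, 1000UL) == 3000UL);

    /* off above the peer window: the subtraction wraps, so the patched
     * form falls through to snd_cwnd while the guarded form yields 0. */
    assert(sendwin_patched(1000UL, 14600UL, 1500UL) == 14600UL);
    assert(sendwin_guarded(1000UL, 14600UL, 1500UL) == 0UL);
}
```

Calling window_examples() runs the checks; the values in it are made up
for illustration and do not come from any trace in this thread.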
