Re: weird network problems on current since 10/28/2012

From: Andre Oppermann <andre_at_freebsd.org>
Date: Sun, 04 Nov 2012 14:57:39 +0100
On 04.11.2012 13:11, Kim Culhan wrote:
> On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
>> On 2012-11-04 02:13, Manfred Antar wrote:
>>> At 03:29 PM 11/3/2012, Adrian Chadd wrote:
>>>> On 3 November 2012 10:40, Manfred Antar <null_at_pozo.com> wrote:
>>>>> i have problem connecting to freebsd box on local network since last sunday.
>>>>> the last kernel that works:
>>>>>    FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
>>>>> anything after that, sometimes i can connect, other times just hangs.
>>>>> any network connection hangs ===== pop httpd ssh etc etc.
>>>>> anyone have any ideas ?
>>>>> i can checkout different sources and see if i can locate the changes that cause
>>>>> this.
>>>>
>>>> Please do!
>> ...
>>> Here is what I found doing :
>>> setenv CVSROOT /usr/home/ncvs
>>>
>>> cvs co -D"October 28, 2012 12:14:38 PDT" sys
>>>
>>> A kernel from that time works fine.
>>>
>>> doing:
>>>
>>> cvs up -D"October 28, 2012 13:14:38 PDT" sys                    1 hour later
>>> the following files were changed:
>>> sys/netinet/tcp_input.c
>>> sys/netinet/tcp_timer.c
>>> sys/netinet/tcp_var.h
>>>
>>> Building a kernel from these new files is when the problem starts.
>>
>> So, your problems seem to have been introduced by this commit by Andre:
>>
>>     http://svn.freebsd.org/changeset/base/242266
>>
>>     Increase the initial CWND to 10 segments as defined in IETF TCPM
>>     draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
>>     window improves the overall performance of many web services without
>>     risking congestion collapse.
>>
>>     As long as it remains a draft it is placed under a sysctl marking it
>>     as experimental:
>>      net.inet.tcp.experimental.initcwnd10 = 1
>>     When it becomes an official RFC soon the sysctl will be changed to
>>     the RFC number and moved to net.inet.tcp.
>>
>>     This implementation differs from the RFC draft in that it is a bit
>>     more conservative in the case of packet loss on SYN or SYN|ACK because
>>     we haven't reduced the default RTO to 1 second yet.  Also the restart
>>     window isn't yet increased as allowed.  Both will be adjusted with
>>     upcoming changes.
>>
>>     It is enabled by default.  In Linux it is enabled since kernel 3.0.
>>
>> After the commit, there was a small discussion thread on svn-src-head_at_
>> about the possible problems with the approach.  Maybe you are
>> experiencing those?
>>
>> As the commit message says, you should be able to turn the feature off
>> using:
>>
>>     sysctl net.inet.tcp.experimental.initcwnd10=0
>>
>> Can you please try that, and see if the problems go away?
>
> FWIW this did not make the problem go away on 2 machines.

Yes, this very much looks like the same problem as in PR/173309.

Please try the attached patch.  It fixes the connection hang issue.
There may be a second issue, which I am currently debugging based on the
feedback from Fabian Keil.

-- 
Andre

Index: tcp_input.c
===================================================================
--- tcp_input.c (revision 242494)
+++ tcp_input.c (working copy)
@@ -2650,10 +2652,12 @@

                 SOCKBUF_LOCK(&so->so_snd);
                 if (acked > so->so_snd.sb_cc) {
+                       tp->snd_wnd -= so->so_snd.sb_cc;
                         sbdrop_locked(&so->so_snd, (int)so->so_snd.sb_cc);
                         ourfinisacked = 1;
                 } else {
                         sbdrop_locked(&so->so_snd, acked);
+                       tp->snd_wnd -= acked;
                         ourfinisacked = 0;
                 }
                 /* NB: sowwakeup_locked() does an implicit unlock. */
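
For readers who want to follow the accounting outside the kernel, the patched
branch can be modelled in userland. This is only a minimal sketch: the struct
and function names (tcpcb_model, sockbuf_model, ack_drop) are hypothetical
stand-ins, only the two fields the patch touches are modelled, and locking and
the real sbdrop_locked() call are left out. It shows one thing, namely that
after the patch snd_wnd shrinks by exactly the number of bytes removed from
the send buffer.

```c
/*
 * Simplified userland model of the patched ACK branch in tcp_input.c.
 * All names here are illustrative assumptions, not the kernel's types.
 */
struct sockbuf_model {
	unsigned long sb_cc;	/* bytes currently queued in the send buffer */
};

struct tcpcb_model {
	unsigned long snd_wnd;	/* send window */
};

/*
 * Drop acknowledged bytes from the send buffer and reduce snd_wnd by the
 * same amount, as the two added lines in the patch do.
 * Returns the value the kernel code assigns to ourfinisacked.
 */
static int
ack_drop(struct tcpcb_model *tp, struct sockbuf_model *sb, unsigned long acked)
{
	if (acked > sb->sb_cc) {
		/* More was acked than is queued: everything queued goes. */
		tp->snd_wnd -= sb->sb_cc;
		sb->sb_cc = 0;
		return (1);
	}
	/* Otherwise only the acked bytes are dropped. */
	sb->sb_cc -= acked;
	tp->snd_wnd -= acked;
	return (0);
}
```

Like the patch, the sketch does not guard against snd_wnd being smaller than
the amount subtracted; it assumes the caller only passes consistent values.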
Received on Sun Nov 04 2012 - 12:57:42 UTC