Re: CURRENT: ipfw: problems with timeouts and worse network performance

From: O. Hartmann <ohartman_at_zedat.fu-berlin.de>
Date: Fri, 20 May 2016 16:23:01 +0200
On Fri, 20 May 2016 16:01:17 +0200,
Jan Bramkamp <crest_at_rlwinm.de> wrote:

> On 20/05/16 15:51, Vladimir Zakharov wrote:
> > On Fri, May 20, 2016, Jan Bramkamp wrote:  
> >> On 20/05/16 14:54, Vladimir Zakharov wrote:  
> >>> Hello
> >>>
> >>> On Fri, May 20, 2016, O. Hartmann wrote:  
> >>>> I reported earlier about broken pipes in ssh sessions to remote hosts,
> >>>> which occur on an erratic basis. I'm investigating this problem now and
> >>>> it seems that it is also ipfw-related, but I'm not sure. This problem
> >>>> has been present for a couple of weeks now.  
> >>>
> >>> Maybe this could help...
> >>>
> >>> I've also experienced problems with broken pipes in ssh sessions some
> >>> time ago. Setting in sysctl.conf
> >>>
> >>> net.inet.ip.fw.dyn_ack_lifetime=3600
> >>>
> >>> fixed the problem for me. I didn't experiment with the value though. So,
> >>> possibly, changing the default value (300s) to 1 hour is overkill :).  
> >>
> >> By default the OpenSSH SSH client is configured to use TCP keepalives.
> >> Those should produce enough packets at a short enough interval to keep
> >> the dynamic IPFW state established.
> >>
> >> Does your traffic pass through libalias?  
> > I guess not. How can I be sure?  
> 
> Libalias is used by ipfw and the old userland natd to implement IPv4 
> NAT. It requires unmodified access to all packets including their 
> headers. LRO and TSO coalesce packets to save CPU time, but the 
> process loses some of the information required by libalias. Unless 
> your ruleset uses ipfw in-kernel NAT or diverts traffic to natd you 
> don't have to worry about libalias.
> 
> Use `kldstat -v | grep libalias` to check for libalias in the running 
> kernel and `pgrep natd` to search for running natd instances.


As I replied earlier, my case is manifold: I use in-kernel NAT as well as
straightforward filtering.
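
To check whether a given ruleset hands packets to libalias at all, something along
these lines might do. It is only a rough sketch: the function name
`ruleset_uses_libalias` is made up for illustration, and it assumes the usual
`ipfw list` output in which the rule number is followed directly by the action.

```shell
#!/bin/sh
# Hypothetical helper (not part of ipfw): reads an ipfw ruleset listing on
# stdin and prints "yes" if any rule uses in-kernel NAT ("nat") or diverts
# packets to a userland daemon such as natd ("divert"), otherwise "no".
# Assumption: each line looks like "<rule number> <action> ...".
ruleset_uses_libalias() {
    if grep -Eq '^[0-9]+[[:space:]]+(nat|divert)[[:space:]]'; then
        echo "yes"
    else
        echo "no"
    fi
}

# On a FreeBSD host with ipfw active (needs root):
#   ipfw list | ruleset_uses_libalias
```

Rules with extra modifiers between the number and the action would slip past this
simple pattern, so treat a "no" as a hint rather than proof.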

The broken pipes in ssh occurred simultaneously on ALL CURRENT boxes (a bunch of
different ages, CPU types, NICs and configs, but all running the most recent CURRENT).

The massive dropouts and timeouts I am witnessing now started after we updated from
~CURRENT 300005 to CURRENT 300158. On some experimental servers the config, especially
that of ipfw, has not changed for over half a year, yet they suffer from the problem
I described as well, and the problem can be solved by disabling IPFW.

The problem is simple to trigger: configure firewall type "WORKSTATION" and have IPFW
active (I have IPFW statically compiled into the kernel, no modules so far). Have
/usr/src as a svn+https repository and try a svn update of the source tree.
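
A minimal sketch of such a setup in /etc/rc.conf, assuming the stock /etc/rc.firewall
script is used; the actual configuration on the affected boxes may differ in detail.

```shell
# /etc/rc.conf fragment (sketch, not the actual config of the affected hosts).
# Assumption: IPFW is compiled into the kernel, so no module needs loading.
firewall_enable="YES"          # start ipfw at boot
firewall_type="workstation"    # predefined ruleset from /etc/rc.firewall

# With the firewall active, the failure should then show up during:
#   svn update /usr/src
```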

I haven't checked so far whether the problem also occurs with non-SSL connections, since
all the connections I see suffering are encrypted in some way.

Regards,
Oliver

Received on Fri May 20 2016 - 12:21:03 UTC
