Re: excessive TCP duplicate acks?

From: Scott Long <scottl_at_samsco.org>
Date: Sat, 03 Mar 2007 09:28:38 -0700

Andre Oppermann wrote:
> Peter Jeremy wrote:
>> On 2007-Jan-26 11:59:06 -0500, Andrew Gallatin <gallatin_at_cs.duke.edu> wrote:
>>
>>> When running some benchmarks, I noticed tons of duplicate acks showing
>>> up in systat -tcp (thousands, or tens of thousands per second).
>>
>>
>> Whilst investigating other problems, I've just seen the same on 6.2.
>> The following trace was taken on 192.168.234.1, which is running
>> 6.2-RELEASE/i386 (with ipfilter enabled) with fxp (Intel 82559) NICs.
>> 192.168.234.64 is running 6.2-STABLE/amd64 from late January (no
>> firewall active) with a bge (Broadcom BCM5705 A3, ASIC rev. 0x3003)
>> NIC and checksum offloading enabled.
>>
>> The multiple SYN packets are due to a bug in the IPfilter state
>> management, though it eventually allows a SYN through.  (And it is not
>> totally unrealistic for multiple SYNs to be required before a SYN-ACK
>> is received so this does not excuse the ACK flood).  Note that the
>> duplicate ACKs are being sent from the host without a firewall so this
>> does not appear to be related to ipfilter (or kern/102653).
> 
> This thing is really strange and difficult to debug.  A look at the CVS
> history of tcp_input/output doesn't show any smoking gun.  ACKs like these
> are totally pointless.  There are three places able to cause ACKs:
> 1) tcp_input decides to call tcp_output [not the case here as there are no
> corresponding input packets to cause this]; 2) tcp_output has an
> unterminated loop somewhere causing it to spew the ACKs in rapid
> succession [unlikely as it holds the tcpcb lock and that would block
> inbound packets]; 3) tcp timers are misfiring or not properly disarmed
> [here the logic in tcp_output may/should just ignore it and return w/o
> sending any packet].
> 
> I haven't experienced this bug myself which makes it even harder to debug.
> 

Just for fun, I wonder what would happen if HZ was set back to 100. 
It's not a fix, but it might point to some misconfigured timers.
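
For anyone who wants to try this, a minimal sketch follows. It assumes the
stock kern.hz loader tunable is honoured by the 6.x kernel in question
(untested against that exact build). It only lowers the clock rate for the
experiment and changes nothing in the TCP code:

```
# /boot/loader.conf
# Assumption: kern.hz is read at boot on this kernel. The alternative is
# rebuilding the kernel with "options HZ=100" in the kernel config file.
kern.hz="100"
```

After a reboot, "sysctl kern.clockrate" should report hz = 100. A large
change in the duplicate ACK rate shown by systat -tcp would then be a hint
that a timer is involved rather than tcp_input.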

Scott
Received on Sat Mar 03 2007 - 15:29:03 UTC
