Re: excessive TCP duplicate acks revisited

From: Gregory Wright <gwright_at_antiope.com>
Date: Mon, 19 Nov 2007 13:36:37 -0500
> Gregory Wright wrote:
>> On Nov 11, 2007, at 5:23 PM, Andre Oppermann wrote:
>>> Gregory Wright wrote:
>>>> On Nov 10, 2007, at 10:28 AM, Andre Oppermann wrote:
>>>>>
>>>> Hi Andre,
>>>> I also took a look at the bge(4) driver in 7.0-BETA2.  As far as I
>>>> can tell, it does not support TSO (there is no ioctl supporting TSO
>>>> enable/disable as there is for the em(4) driver).
>>>
>>>> Might the chip --- a BCM5704_B0 --- not be completely initialized?
>>>> This might explain why the machine with the BCM5714_B3 chips works,
>>>> while the other machine shows the duplicate ACK bug.
>>>
>>> Perhaps.  Do you see the duplicate ACKs in a tcpdump on both the
>>> sender and the receiver?  If you see it on the sender too, then it
>>> must be a bug in our network stack or the driver (by requeuing the
>>> same packet over and over again).
>>>
>>> --Andre
>> The logs show that the duplicate ACKs are generated only by the
>> receiver.  I suspect a bug in the driver, perhaps the ACK packet
>> is not being removed from the TX buffer ring.  Examining the
>> transmitted packets should be enough to rule out a network stack
>> problem.  Is there any debugging infrastructure I can use or do I
>> just have to hack in on my own?
>
> We don't have an infrastructure to deal with this kind of driver
> problems.  You have to instrument the driver code to report stuck
> mbufs.
>

Hi Andre,

I have some additional information that indicates this is a driver bug.
There was a report to one of the Gentoo Linux mailing lists of the same
problem with BCM5704s, in which everything worked at 1 Gb/s, but
duplicate ACKs were seen at 100 Mb/s.  Link to the message:

http://forums.gentoo.org/viewtopic-t-530707-highlight-bcm5704.html

The report said that the problem was solved by upgrading the Linux
kernel from 2.6.17 to 2.6.18.  I've compared the tg3 drivers in the two
releases, and there were quite a few changes, so it will take a while to
track down what the key fix was.

So the bug in the bge driver for these chips can likely be fixed.
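
For anyone who wants to compare captures from the two machines, here is a
minimal sketch of how duplicate ACKs could be counted per direction.  It
assumes the capture has already been parsed into records; the Segment
layout and the count_duplicate_acks name are hypothetical, not part of
tcpdump or any driver.  The test is also simplified: a segment counts as a
duplicate when it carries no data and repeats the ack number and window of
the previous segment in the same direction.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    """One TCP segment parsed from a capture (hypothetical record layout)."""
    src: str          # "host:port" that sent this segment
    dst: str          # "host:port" that received this segment
    ack: int          # acknowledgment number
    window: int       # advertised receive window
    payload_len: int  # TCP payload bytes carried


def count_duplicate_acks(segments: Iterable[Segment]) -> Counter:
    """Count duplicate ACKs per (src, dst) direction.

    Simplified rule (an assumption, not the full RFC definition): a segment
    is a duplicate if it has no payload and repeats the (ack, window) pair
    of the previous segment seen in the same direction.
    """
    last = {}         # (src, dst) -> (ack, window) of the previous segment
    dups = Counter()
    for seg in segments:
        key = (seg.src, seg.dst)
        state = (seg.ack, seg.window)
        if seg.payload_len == 0 and last.get(key) == state:
            dups[key] += 1
        last[key] = state
    return dups


if __name__ == "__main__":
    # Made-up sample: the receiver repeats ack=1000 twice before advancing.
    sample = [
        Segment("rx:5001", "tx:40000", 1000, 65535, 0),
        Segment("rx:5001", "tx:40000", 1000, 65535, 0),
        Segment("rx:5001", "tx:40000", 1000, 65535, 0),
        Segment("rx:5001", "tx:40000", 2448, 65535, 0),
    ]
    print(count_duplicate_acks(sample))
```

Running this over the parsed captures from each machine gives per-direction
counts that can be compared side by side.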

Thanks for your help.

Best Wishes,
Greg
Received on Mon Nov 19 2007 - 17:54:55 UTC
