Re: ale(4): Problems with tso, rxcsum and/or txcsum

From: Ulrich Spörlein <uqs_at_spoerlein.net>
Date: Sat, 27 Jun 2009 19:11:11 +0200
Sorry for the long delay, I only now got around to testing this more
thoroughly.

On Tue, 16.06.2009 at 19:17:40 +0900, Pyun YongHyeon wrote:
> On Tue, Jun 16, 2009 at 11:33:34AM +0200, Ulrich Spörlein wrote:
> > On Mon, 15.06.2009 at 21:51:54 +0900, Pyun YongHyeon wrote:
> > > On Mon, Jun 15, 2009 at 02:16:23PM +0200, Ulrich Spörlein wrote:
> > > > Hello Pyun,
> > > > 
> > > > I have connection problems with the onboard GigE of an Asus P5Q board, using a recent 8-CURRENT
> > > > 
> > > > ale0: <Atheros AR8121/AR8113/AR8114 PCIe Ethernet> port 0xdc00-0xdc7f mem 0xfe9c0000-0xfe9fffff irq 17 at device 0.0 on pci2
> > > > ale0: 960 Tx FIFO, 1024 Rx FIFO
> > > > ale0: Using 1 MSI messages.
> > > > ale0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
> > > > miibus0: <MII bus> on ale0
> > > > ale0: Ethernet address: 00:24:8c:36:3e:10
> > > > ale0: [FILTER]
> > > > ale0: link state changed to UP
> > > > 
> > > > ale0@pci0:2:0:0:        class=0x020000 card=0x82261043 chip=0x10261969 rev=0xb0 hdr=0x00
> > > >     vendor     = 'Attansic (Now owned by Atheros)'
> > > >     device     = 'PCI-E ETHERNET CONTROLLER  (AR8121/AR8113 )'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > > 
> > > > ale0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > > >         options=311b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,WOL_MCAST,WOL_MAGIC>
> > > >         ether 00:24:8c:36:3e:10
> > > >         inet 192.168.0.146 netmask 0xffffff00 broadcast 192.168.0.255
> > > >         media: Ethernet autoselect (100baseTX <full-duplex>)
> > > >         status: active
> > > > 
> > > > When transferring data to the machine at ~10MB/s (100Mbit network only) the ssh
> > > > connection will die after a couple of minutes with
> > > > 
> > > > Disconnecting: Bad packet length 1592360521.
> > > > 
> > > > After disabling tso, txcsum and rxcsum the connection seems to be
> > > > stable, though. I fail to figure out a pattern, though. Do I need to
> > > 
> > > Hmm, I think this is the second report that could be related with
> > > Rx checksum offloading. If disabling Rx checksum fix the issue, I
> > > have to disable it by default until I understand what's going on.
> > 
> > I really need to disable tso, rxcsum *and* txcsum to make this card work
> > stable. :/
> 
> Hmm, let's see which offload was broken. Disabling all offloads
> make it hard to find broken one.

Ok, disabling rxcsum (ifconfig ale0 -rxcsum) makes the connection stable.
But when I enable rxcsum again, it is also stable! It looks like it is not
actually turned on again. To sum it up:

1. doing nothing: ssh connection drops after a couple of minutes
2. ifconfig ale0 -rxcsum: ssh runs stable for dozens of minutes
3. ifconfig ale0 rxcsum: ssh runs stable for dozens of minutes (wtf?)
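
One way to check whether case 3 really re-enabled the offload is to look
at the options line that ifconfig prints. Below is a minimal sketch,
assuming that RXCSUM shows up in the options=...<...> list only while it
is enabled (as in the ifconfig output quoted earlier in this thread). The
function name rxcsum_state is a made-up helper, not part of any tool.

```shell
# Hypothetical helper: reads `ifconfig <interface>` output on stdin and
# reports whether RXCSUM is listed in the options=...<...> line.
# Assumption: the options line lists only currently enabled capabilities.
rxcsum_state() {
    if grep -E 'options=[0-9a-f]+<[^>]*>' | grep -Eq '[<,]RXCSUM[,>]'; then
        echo enabled
    else
        echo disabled
    fi
}

# Usage on the affected machine (interface name taken from this thread):
#   ifconfig ale0 | rxcsum_state
```

If this still prints "enabled" after step 3 while the connection stays
stable, then the flag is set but the behaviour differs from a fresh boot,
which would point at the driver rather than at ifconfig.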

> > There is one other weirdness, though, regarding tso. I have been using a
> > netcat-blast test, where I "upload" /dev/zero to another machine, and
> > "download" it from the same machine.

Scrap all my previous findings regarding this issue. I re-ran the test
with three machines. So ale0 would download from machine A and upload to
machine B. No matter how hard I try, I can always saturate the 100MBit
Ethernet in full duplex. Don't know how the previous numbers came about.

Thanks for your patience, but it looks like the rxcsum is indeed fishy
on this chip revision.

Cheers,
Ulrich Spörlein
-- 
http://www.dubistterrorist.de/
Received on Sat Jun 27 2009 - 15:11:13 UTC
