Re: re(4) driver dropping packets when reading NFS files

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Mon, 1 Nov 2010 18:18:13 -0400 (EDT)
> On Sun, Oct 31, 2010 at 05:46:57PM -0400, Rick Macklem wrote:
> > I recently purchased a laptop that has a re(4) Realtek 8101E/8102E/8103E
> > net chip in it and I find that it is dropping packets like crazy when
> > reading files over an NFS mount. (It seems that bursts of receive
> > traffic cause it, since when I look over wireshark, typically the 2nd
> > packet in a read reply is not received, although it was sent at the
> > other end.)
> >
> 
> Are you using NFS over UDP?

The test I referred to was over TCP, which works fine until a file is
read. Then roughly the second TCP data segment sent by the server isn't
received by the client with the re(4). (I had tcpdump capturing on both
machines and then compared the captures using wireshark.) The read does
progress slowly, after TCP retransmits the segment that was dropped.
The result is a rate of about 10 reads/sec.
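
The comparison between the two captures can be sketched in a few lines of
Python. This is only an illustration, not the procedure actually used: it
assumes the TCP sequence numbers of the data segments have already been
exported from each capture as plain lists, and the function name and the
sample numbers are hypothetical.

```python
from collections import Counter


def lost_copies(sent_seqs, received_seqs):
    """Count, per TCP sequence number, how many copies the server sent
    that never showed up in the client's capture.

    A retransmitted segment appears twice on the sending side, so a
    segment that was dropped once and then retransmitted successfully
    ends up with a count of 1.
    """
    # Counter subtraction keeps only the positive differences.
    lost = Counter(sent_seqs) - Counter(received_seqs)
    return dict(lost)


# Hypothetical sample: the second segment (seq 1449) is sent twice by the
# server but only arrives once at the client.
server_side = [1, 1449, 2897, 1449]
client_side = [1, 2897, 1449]
dropped = lost_copies(server_side, client_side)
```

With the sample lists above, `dropped` contains a single entry for the
second segment, which matches the pattern seen in the captures.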

A test over UDP gets nowhere. You just get lots of
  "IP fragments timed out" when you "netstat -s", so it seems to
  consistently drop a fragment in the UDP read reply.
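
To see why a single dropped frame is fatal over UDP, here is a rough sketch
of how many IP fragments one read reply turns into. The read sizes used
(8192 and 32768 bytes) and the 1500 byte MTU are assumptions for
illustration, not values taken from this test, and the RPC/NFS header
overhead is ignored. The function name is hypothetical.

```python
IP_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8   # bytes


def fragment_count(udp_payload, mtu=1500):
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    total = udp_payload + UDP_HEADER
    # Integer ceiling division.
    return (total + per_fragment - 1) // per_fragment


frags_8k = fragment_count(8192)     # 6 fragments
frags_32k = fragment_count(32768)   # 23 fragments
```

Since the receiver discards the whole datagram when reassembly times out,
losing any one of those fragments loses the entire read reply, and unlike
TCP nothing retransmits just the missing piece.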
> 
> > Adding "options DEVICE_POLLING" helps a lot. (i.e. order of magnitude
> > faster reading) Does this hint that interrupts are being lost or
> > delayed too much?
> >
> 
> Actually I'm not a fan of polling(4), but re(4) controllers might be
> an exception due to controller limitations. Still, an order of magnitude
> faster indicates something is wrong in the driver.
> 

Yep, I'd agree. I can print out the exact chip device info, but if you
don't have data sheets, it may not help. It seems to be a low end chip,
since it doesn't support 1Gbps --> closer to an 8139. It might be
called an 8036, since that # shows up in the device resources under
windoze.

> 
> AFAIK re(4) controllers lack interrupt moderation, so re(4) used
> to rely on a taskqueue to reduce the number of interrupts. It was written
> a long time ago by Bill and I'm not sure whether it's still valid for
> recent PCIe RealTek controllers. One problem is finding
> stand-alone PCIe controllers on the market, and I was not able to buy
> recent controllers. This is one reason why re(4) still lacks TSO,
> jumbo frame and 64bit DMA support for newer controllers. Another
> problem is that RealTek no longer releases data sheets, so it's hard to
> write new features that may be present on recent controllers.
> 
> Recent re(4) controllers started to support a small set of hardware
> MAC statistics counters and that may help to understand how many
> frames were lost under heavy load. I'll let you know when I have a
> patch for that. Flow control may also improve performance a little,
> but it has not been implemented yet, as in most other consumer grade
> ethernet drivers. But this may change in the near future; marius_at_ is
> actively working on this so we'll get a generic flow-control
> framework in the tree.

It drops a frame as soon as the read starts and there is a burst
of more than one. (I can email you the tcpdump captures if you're
interested and you won't have to look far into it to see it happen.)

It seems to do it consistently and then recovers when the TCP
segment is resent, but repeats the fun on the next one.
(I'm wondering if it can't support a 64 entry receive ring. I'll
 try making it smaller and see what happens. Probably won't help,
 but can't hurt to try:-)

> 
> I'll see what can be done in the interrupt handler and I'll let you
> know when a patch is ready.
> 
> > Thanks, rick
> > ps: This laptop is running a low end AMD cpu and I did install amd64
> >     on it, instead of i386, in case that might be relevant?
> 
> I don't think so.
> 
Ok. I didn't think so, but someone recently mentioned that some drivers
for wifi chips don't work for amd64.

It actually works fairly well (and quite well with DEVICE_POLLING), except
for this issue where it drops received packets when it gets bursts of them.
(It almost looks like it only handles the first received packet, although
 it appears to be using a receive ring of 64 buffers.)

Anyhow, I'll keep poking at it and will appreciate any patches/suggestions
that you might have.

Thanks, rick
Received on Mon Nov 01 2010 - 21:18:15 UTC
