Re: nve related LOR triggered by lots of small packets, and a hard hang

From: Mark Atkinson <atkin901_at_yahoo.com>
Date: Mon, 12 Feb 2007 07:47:04 -0800

Pyun YongHyeon wrote:

> On Fri, Feb 09, 2007 at 09:23:41AM -0800, Mark Atkinson wrote:
>  > Mark Atkinson wrote:
>  > 
>  > > Pyun YongHyeon wrote:
>  > > 
>  > >> On Wed, Jan 10, 2007 at 06:53:31PM +0300, Sergey Zaharchenko wrote:
>  > >>  > Hello John!
>  > >>  > 
>  > >>  > Wed, Jan 10, 2007 at 09:10:12AM -0500 you wrote:
>  > >>  > [snip]
>  > >>  > > Have you tried using nfe(4)? :)
>  > >>  > 
>  > >>  > Now I have, and it works just fine, thanks (I somehow thought nfe
>  > >>  > was specific to some platform). Why isn't it the default? Smaller
>  > >>  > range of hardware supported?
>  > >>  > 
>  > >> 
>  > >> AFAIK, nfe(4) supports more hardware than nve(4).
>  > >> Try the overhauled nfe(4) at the following URLs.
>  > >> 
>  > >> http://people.freebsd.org/~yongari/nfe/if_nfe.c
>  > >> http://people.freebsd.org/~yongari/nfe/if_nfereg.h
>  > >> http://people.freebsd.org/~yongari/nfe/if_nfevar.h
>  > >> 
>  > >> The patch fixed several bugs in nfe(4) and it should perform better
>  > >> than nve(4). The following hardware features are supported.
>  > >>  o TSO
>  > >>  o Tx/Rx IP/TCP/UDP checksum offload
>  > >>  o VLAN hardware tag insertion/stripping
>  > >>  o Jumbo frames (up to 9100 bytes)
>  > >> 
>  > >> It seems that the hardware supports MSI/MSI-X too, but I don't have
>  > >> nForce hardware that supports MSI/MSI-X, so it's hard to implement or
>  > >> experiment with it. According to Shigeaki Tagashira, the author of
>  > >> FreeBSD nfe(4), his hardware claims to support 8 messages. I've
>  > >> checked the Linux forcedeth driver to get hardware information for
>  > >> MSI/MSI-X, but I couldn't understand the details. :-(
>  > >> 
>  > > 
>  > > I've been running into this hard-lock LOR a lot recently on a TYAN 2895
>  > > (K8WE) based box, so I tried your patch to nfe on today's -current.
>  > > I tried a couple of small-packet ping floods to a LAN neighbor under
>  > > nfe and it survived. It did fine with some large NFS-over-TCP transfers
>  > > as well. However, I'll leave it up and running to see if it keels over
>  > > in the future.
>  > > 
>  > > pci128: <ACPI PCI bus> on pcib6
>  > > pci128: physical bus=128
>  > > found-> vendor=0x10de, dev=0x005e, revid=0xa3
>  > >         bus=128, slot=0, func=0
>  > >         class=05-80-00, hdrtype=0x00, mfdev=0
>  > >         cmdreg=0x0006, statreg=0x00b0, cachelnsz=0 (dwords)
>  > >         lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
>  > > found-> vendor=0x10de, dev=0x00d3, revid=0xa3
>  > >         bus=128, slot=1, func=0
>  > >         class=05-80-00, hdrtype=0x00, mfdev=1
>  > >         cmdreg=0x000f, statreg=0x00a0, cachelnsz=0 (dwords)
>  > >         lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
>  > >         map[14]: type 1, range 32, base 0xd8400000, size 12, enabled
>  > > found-> vendor=0x10de, dev=0x0057, revid=0xa3
>  > >         bus=128, slot=10, func=0
>  > >         class=06-80-00, hdrtype=0x00, mfdev=0
>  > >         cmdreg=0x0007, statreg=0x00b0, cachelnsz=0 (dwords)
>  > >         lattimer=0x00 (0 ns), mingnt=0x01 (250 ns), maxlat=0x14 (5000 ns)
>  > >         intpin=a, irq=5
>  > >         powerspec 2  supports D0 D1 D2 D3  current D0
>  > >         map[10]: type 1, range 32, base 0xd8401000, size 12, enabled
>  > >         map[14]: type 4, range 32, base 0x3000, size  3, enabled
>  > > pcib6: matched entry for 128.10.INTA (src \\_SB_.PCI1.LMAC:0)
>  > > pci_link22: Picked IRQ 52 with weight 0
>  > > ioapic3: Changing polarity for pin 20 to high
>  > > pcib6: slot 10 INTA routed to irq 52 via \\_SB_.PCI1.LMAC
>  > > found-> vendor=0x10de, dev=0x005d, revid=0xa3
>  > >         bus=128, slot=14, func=0
>  > >         class=06-04-00, hdrtype=0x01, mfdev=0
>  > >         cmdreg=0x0107, statreg=0x0010, cachelnsz=16 (dwords)
>  > >         lattimer=0x00 (0 ns), mingnt=0x04 (1000 ns), maxlat=0x00 (0 ns)
>  > >         powerspec 2  supports D0 D3  current D0
>  > >         MSI supports 2 messages, 64 bit
>  > > pci128: <memory> at device 0.0 (no driver attached)
>  > > pci128: <memory> at device 1.0 (no driver attached)
>  > > nfe1: <NVIDIA nForce4 CK804 MCP9 Networking Adapter> port 0x3000-0x3007 mem 0xd8401000-0xd8401fff irq 52 at device 10.0 on pci128
>  > > nfe1: Reserved 0x1000 bytes for rid 0x10 type 3 at 0xd8401000
>  > > nfe1: bpf attached
>  > > nfe1: Ethernet address: 00:e0:81:57:d9:af
>  > > miibus1: <MII bus> on nfe1
>  > > e1000phy1: <Marvell 88E1111 Gigabit PHY> PHY 1 on miibus1
>  > > e1000phy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
>  > > ioapic3: routing intpin 20 (PCI IRQ 52) to vector 57
>  > > nfe1: [MPSAFE]
>  > > nfe1: [FAST]
>  > 
>  > After a day of running this, it became obvious that the nfe driver patch
>  > has some sort of issue, at least with -current and this board. Although
>  > NFS speeds seemed reasonable, transfers over TCP from a webserver suffered
>  > from a very noticeable pause/send/pause/send... pattern that reduced
>  > transfers to about 6 Kbyte/s. This problem went away when I put nve
>  > back into the kernel and retried the same scenario.
>  > 
> 
> Would you explain the scenario to reproduce it on my box?
> How about disabling checksum offload?

After a few tests, it's all related to TSO (TCP segmentation offload).

Turning that off, but leaving rxcsum and txcsum enabled, works and performs
speedily.  Thanks for the suggestion!
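
For anyone wanting to try the same workaround, here is a minimal sketch of
the setting described above: TSO off, Rx/Tx checksum offload left on. The
-tso, rxcsum and txcsum options are standard ifconfig(8) capability flags;
the helper names run and disable_tso and the DRY_RUN variable are
hypothetical and exist only for illustration. nfe1 is assumed as the
default because it is the interface name in the dmesg output quoted above.

```shell
#!/bin/sh
# Sketch: turn off TSO on one interface, keep checksum offload on.
# run, disable_tso and DRY_RUN are illustrative names, not part of any tool.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = interface name (assumed default: nfe1)
disable_tso() {
    iface="${1:-nfe1}"
    run ifconfig "$iface" -tso            # disable TCP segmentation offload
    run ifconfig "$iface" rxcsum txcsum   # keep Rx/Tx checksum offload enabled
}
```

With DRY_RUN=1 set, calling "disable_tso nfe1" only prints the two ifconfig
commands; without it (and as root) it applies them. The change does not
persist across reboots unless the same flags are also added to the
interface's rc.conf entry.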

Note that, for some reason, NFS-over-TCP speeds didn't seem to be affected
much; only userland TCP seemed to be negatively affected.

-- 
Mark Atkinson
atkin901_at_yahoo.com
(!wired)?(coffee++):(wired);
Received on Mon Feb 12 2007 - 14:47:36 UTC
