Re: yongari nfe problems

From: Rainer Hurling <rhurlin_at_gwdg.de>
Date: Tue, 03 Apr 2007 18:49:42 +0200

Pyun YongHyeon wrote:
> On Mon, Apr 02, 2007 at 07:27:14PM +0200, Rainer Hurling wrote:
>  > Pyun YongHyeon wrote:
>  > >On Sat, Mar 31, 2007 at 05:01:18PM +0200, Rainer Hurling wrote:
>  > > > Thank you Pyun YongHyeon for the newest patch. I am running it with
>  > > > if_nfe.c and if_nfereg.h from 03/21/2007 and if_nfevar.h from
>  > > > 03/19/2007 on FreeBSD 7.0-CURRENT (i386) from today.
>  > > > 
>  > > > boot -v gives me:
>  > > > nfe0: <NVIDIA nForce MCP55 Networking Adapter> port 0xb000-0xb007 mem
>  > > > 0xfbef3000-0xfbef3fff,0xfbefa800-0xfbefa8ff,0xfbefa400-0xfbefa40f
>  > > > irq 22 at device 8.0 on pci0
>  > > > nfe0: Reserved 0x1000 bytes for rid 0x10 type 3 at 0xfbef3000
>  > > > miibus0: <MII bus> on nfe0
>  > > > ciphy0: <VSC8601 10/100/1000TX PHY> PHY 1 on miibus0
>  > > > ciphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
>  > > > 1000baseT-FDX, auto
>  > > > nfe0: bpf attached
>  > > > nfe0: Ethernet address: 00:16:17:95:d9:7c
>  > > > nfe0: [MPSAFE]
>  > > > nfe0: [FILTER]
>  > > > 
>  > > > 
>  > > > Now there are no more warnings from miibus0 :-)
>  > > > 
>  > >
>  > >Thanks for testing.
>  > >
>  > > > Unfortunately, during larger network transfers I still observe the
>  > > > previously described watchdog timeouts:
>  > > > 
>  > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
>  > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
>  > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
>  > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
>  > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
>  > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
>  > > > ...
>  > > > 
>  > > > During these timeouts I am not able to use my network ;-(
>  > > > 
>  > > > I would be happy to help solve this problem. Let me know if I
>  > > > can test anything.
>  > > > 
>  > >
>  > >Does nfe(4) share an interrupt with other devices?
>  > >(Check 'vmstat -i' output.)
>  > 
>  > 
>  > #vmstat -i
>  > interrupt                          total       rate
>  > irq1: atkbd0                       10848          1
>  > irq12: psm0                        79500          7
>  > irq14: ata0                       102455         10
>  > irq16: sym0                           14          0
>  > irq17: nvidia0                    632579         61
>  > irq21: pcm0 ohci0                  30994          3
>  > irq22: nfe0 ehci0                  36673          3
>           ^^^^^^^^^^
> 
> You are using a shared interrupt. :-(

Yes, that's it. Both units are on the mainboard.

In the ehci(4) man page I found:

-------
BUGS
      The driver is not finished and is quite buggy.
      There is currently no support for isochronous transfers.
-------

Could this cause the observed "dropouts" of nfe0, which last from a few
seconds to several minutes?
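
The shared-interrupt check discussed here (reading 'vmstat -i' for IRQ lines that name more than one device) can be scripted. The following is only a minimal Python sketch: it assumes the output format matches the 'vmstat -i' listing quoted in this message, and the name shared_irqs is made up for illustration.

```python
import re


def shared_irqs(vmstat_output):
    """Map each IRQ that names more than one device to its device list."""
    shared = {}
    for line in vmstat_output.splitlines():
        # Expected shape: "irq22: nfe0 ehci0      36673      3"
        match = re.match(r"(irq\d+):\s+(.*?)\s+\d+\s+\d+\s*$", line.strip())
        if not match:
            continue  # header, cpu timer and Total lines are skipped
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[match.group(1)] = devices
    return shared


# Sample lines copied from the 'vmstat -i' output quoted in this thread.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                       10848          1
irq21: pcm0 ohci0                  30994          3
irq22: nfe0 ehci0                  36673          3
cpu0: timer                     20480047       1999
"""

print(shared_irqs(SAMPLE))
# prints {'irq21': ['pcm0', 'ohci0'], 'irq22': ['nfe0', 'ehci0']}
```

On a live system the input could come from running 'vmstat -i' and passing its text to the function; that step is left out so the sketch stays self-contained.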

> 
>  > irq23: atapci1                    143425         14
>  > cpu0: timer                     20480047       1999
>  > cpu1: timer                     20466044       1998
>  > Total                           41982579       4099
>  > 
>  > 
>  > >Since the watchdog timeout error indicates missed Tx completion
>  > >interrupts, I guess you have lost Tx completion interrupts under
>  > >high system load. One of the major changes in the new nfe(4) was
>  > >switching to so-called adaptive polling, which is known to give better
>  > >performance. However, it can lose interrupts under high system load
>  > >(e.g. buildworld), and I guess there are two ways to fix the issue.
>  > >
>  > >1. Add MSI/MSI-X support.
>  > > I think this is the cleanest solution to the issue. But old
>  > > hardware that has no MSI/MSI-X support, and buggy PCI bridges, may
>  > > have issues dealing with MSI/MSI-X. In addition, there is no public
>  > > documentation available for NVIDIA NICs, and the lack of MSI/MSI-X
>  > > capable hardware makes it hard for me to add MSI/MSI-X support.
>  > > AFAIK, Shigeaki Tagashira is working on supporting MSI/MSI-X. (CCed)
>  > 
>  > dmesg shows on my MCP55 system:
>  > pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
>  > pci0: <ACPI PCI bus> on pcib0
>  > pcib0: HT Bridge at 0:5:0 has non-default MSI window 0xc02000a
>  > pcib0: HT Bridge at 0:5:1 has non-default MSI window 0x602000a
>  > pcib0: HT Bridge at 0:6:1 has non-default MSI window 0x0
>  > pcib0: HT Bridge at 0:8:0 has non-default MSI window 0x75011
> 
> I'm not sure what influence a non-default MSI window has on the MSI
> support code. Maybe jhb has a better idea. (CCed)

It was just a guess.

>  > pci0: <memory, RAM> at device 0.0 (no driver attached)
>  > 
>  > A more comprehensive 'boot -v' output is attached. I snipped a few
>  > lines because they are not relevant in this context (cpu, pcm0, ad,
>  > acd, ...).
>  > 
>  > >2. polling(4)
>  > > Because polling(4) does not rely on timely delivery of Tx interrupts,
>  > > it would help in your case.
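
The reasoning in the quoted text (a lost Tx completion interrupt lets the watchdog expire, while polling reclaims completed descriptors without waiting for an interrupt) can be illustrated with a toy model. This is a sketch under stated assumptions and not nfe(4) driver code: the tick granularity, the WATCHDOG_TICKS value and the simulate function are all invented for the illustration.

```python
WATCHDOG_TICKS = 5  # assumed timeout in ticks, chosen only for this toy model


def simulate(ticks, lost_interrupts, polling):
    """Return how many watchdog timeouts occur over `ticks` ticks.

    One frame is queued per tick and the (imaginary) hardware finishes it
    within the same tick. The driver only reclaims descriptors when it
    sees a Tx completion interrupt or when it polls the ring itself.
    """
    pending = 0   # frames the driver still believes are in flight
    timer = 0     # watchdog countdown; 0 means disarmed
    timeouts = 0
    for tick in range(ticks):
        pending += 1
        if timer == 0:
            timer = WATCHDOG_TICKS      # arm watchdog while Tx is outstanding
        interrupt_seen = tick not in lost_interrupts
        if polling or interrupt_seen:
            pending = 0                 # descriptors reclaimed
            timer = 0                   # disarm watchdog
        else:
            timer -= 1
            if timer == 0:
                timeouts += 1           # "watchdog timeout ... -- recovering"
                pending = 0             # recovery path cleans up the ring
    return timeouts


lost = set(range(10, 20))  # pretend interrupts for ticks 10..19 are lost
print(simulate(30, lost, polling=False))
print(simulate(30, lost, polling=True))
```

In this model the interrupt-driven run reports timeouts during the window of lost interrupts, while the polled run reports none, because polling does not depend on the interrupt arriving at all.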
>  > 
>  > Is polling in the classical sense the right approach for this new
>  > driver with 'adaptive polling'?
>  > 
>  > I think you could be right when assuming inadequate MSI/MSI-X support 
>  > for the MCP55 chipset.
>  > 
> 
> Personally I don't like polling(4) due to latency issues, but it
> seems there is no easy workaround until nfe(4) gets working
> MSI/MSI-X support.
> Alternatively, if you don't use USB at all, you can disable USB
> completely and so avoid sharing the interrupt with USB
> devices.

Is there a knob or option in the nfe(4) driver that I can use to try
classical polling or any 'lower' mode of operation?

Rainer Hurling