Re: CURRENT: re(4) crashing system

From: YongHyeon PYUN <pyunyh_at_gmail.com>
Date: Mon, 24 Oct 2016 14:14:00 +0900
On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> I tried to report earlier here that CURRENT does have some serious
> problems right now and one of those problems seems to be triggered by
> the recent re(4) driver. The problem is also present in recen 11-STABLE!
> 
> Below, you'll find pciconf-output reagrding the device on a Lenovo E540
> Laptop I can test on and trigger the problem.
> 
> The phenomenon is that this NIC does not negotiate 1000baseTX, it is
> always falling back to 100baseTX although the device claims to be a 1
> GBit capable device.
> 
> When I try to put the device manually into 1000basTX mode via
> 
> ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)
> 
> it is possible to crash the system. The system also crashes when
> plugging/unplugging the LAN cord - I guess the renegotiation is
> triggering this crash immediately.
> 
> I tried with several switches and routers capable of 1 GBit and it
> seems to be independent from the network hardware in use.
> 
> I tried to capture a backtrace when the kernel crashes, but I do not
> know how to save the the kernel debugger output. Although I configured
> according the handbook debugging, there is no coredump at all.
> 
> Advice is appreciated - if anybody is interesetd in solving this. 
> 

There were several instability reports on re(4).  I vaguely guess
it would be related with some missing initializations for certain
controllers.  Unfortunately, there is no publicly available
datasheet for those controllers and it's not likely to get access
to it in near future.  It seems vendor's FreeBSD driver accesses
lots of magic registers as well as loading DSP fixups.  I have no
idea what it wants to do and re(4) used to heavily rely on power-on
default register values.  Engineering samples I have do not show
instabilities so it wouldn't be easy to identify the issue.

Probably the first step to address the issue would be identifying
those chips and narrowing down the scope of guessing.  Would you
show me the dmesg output(re(4) and regphy(4) only)?  pciconf(8)
output is useless here since RealTek uses the same PCI id for
PCIe variants.

BTW, I was told that the vendor's FreeBSD driver seems to work fine
for normal usage pattern.  The vendor's driver triggered an instant
panic and lacked H/W offloading features in the past.  It might
have changed though.
Received on Mon Oct 24 2016 - 03:14:05 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:08 UTC