Re: CURRENT: re(4) crashing system

From: YongHyeon PYUN <pyunyh_at_gmail.com>
Date: Mon, 7 Nov 2016 11:16:23 +0900
On Sun, Nov 06, 2016 at 01:20:36PM +0100, Hartmann, O. wrote:
> On Mon, 31 Oct 2016 11:12:22 +0900
> YongHyeon PYUN <pyunyh_at_gmail.com> wrote:
> 
> > On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote:
> > > On Thu, 27 Oct 2016 10:00:04 +0900
> > > YongHyeon PYUN <pyunyh_at_gmail.com> wrote:
> > >   
> > > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote:  
> > > > > On Tue, 25 Oct 2016 11:05:38 +0900
> > > > > YongHyeon PYUN <pyunyh_at_gmail.com> wrote:
> > > > >     
> > > > 
> > > > [...]
> > > >   
> > > > > > > I'm not sure, but the issue is likely related to EEE/Green
> > > > > > > Ethernet handling. EEE is a feature negotiated with the
> > > > > > > link partner. If you connect your laptop directly to a
> > > > > > > non-EEE-capable link partner, such as another re(4) box,
> > > > > > > without a switch in between, you may be able to tell
> > > > > > > whether the issue is related to EEE/Green Ethernet or not.
> > > > > 
> > > > > > Neither am I. When I first discovered a problem with CURRENT,
> > > > > > on the Friday before last week's Friday, there was an
> > > > > > unlucky coincidence: I got the new switch, FreeBSD
> > > > > > introduced a serious bug, and I changed the NICs.
> > > > > 
> > > > > > The laptop, the last in the row of re(4)-equipped systems on
> > > > > > which I use the Realtek NIC, now works well with Green IT
> > > > > > technology, but crashes on plugging/unplugging: not on every
> > > > > > event, but at least one time in ten.
> > > > 
> > > > > Hmm, it seems you know how to trigger the issue. When you
> > > > > unplugged the UTP cable, was there active network traffic on
> > > > > the re(4) device? It would be helpful to know which event
> > > > > triggers the crash (e.g. unplugging or plugging). And would
> > > > > you show me the backtrace of the panic?
> > > > > 
> > > > > > I guess the Green IT issue was more an unlucky guess of mine
> > > > > > that went hand in hand with the problem I face with CURRENT
> > > > > > right now on some older, non-UEFI machines.
> > > > >     
> > > > 
> > > > Ok.
> > > > 
> > > > [...]  
> > > > > 
> > > > > > As requested, here is the information about re0 and rgephy0
> > > > > > on the laptop (Lenovo E540):
> > > > > 
> > > > > [...]
> > > > > 
> > > > > rgephy0: <RTL8251 1000BASE-T media interface> PHY 1 on miibus0
> > > > > rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow,
> > > > > 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX,
> > > > > 1000baseT-FDX-master, 1000baseT-FDX-flow,
> > > > > 1000baseT-FDX-flow-master, auto, auto-flow
> > > > > 
> > > > > > re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet>
> > > > > > port 0x3000-0x30ff mem
> > > > > > 0xf0d04000-0xf0d04fff,0xf0d00000-0xf0d03fff at device 0.0 on
> > > > > > pci2
> > > > > > re0: Using 1 MSI-X message
> > > > > > re0: ASPM disabled
> > > > > > re0: Chip rev. 0x50800000
> > > > > > re0: MAC rev. 0x00100000
> > > > 
> > > > > This looks like an 8168GU controller.
> > > > 
> > > > [...]
> > > >   
> > > > > I use options netmap in kernel config, but the problem is also
> > > > > present without this option - just for the record.
> > > > >     
> > > > 
> > > > Yup, netmap(4) has nothing to do with the crash.
> > > > 
> > > > Thanks.  
> > > 
> > > Attached, you'll find the backtrace of the crash. This time it was
> > > really easy - just one pull of the LAN cabling - and we are
> > > happy :-/
> > > 
> > > > Please let me know if you need anything else. I will return to
> > > > normal operations (disabling debugging) because CURRENT is very
> > > > unstable at the moment on other hosts beyond r307157.
> > >   
> > 
> > It seems the attachment was stripped.
> 
> This time I hope I got it right!
> 
> Attached you'll find the latest CURRENT's backtrace on the provoked
> crash (plug and unplug).
> 
> I also saved the kernel and coredump, so if you need me to do further
> investigations, please let me know.
> 

Thanks a lot for the backtrace.  This backtrace is not the one I
expected, and I guess the issue is related to cached route removal
on interface down.  A quick look over the code didn't reveal the
cause of the crash (I'm not familiar with that part of the code).
Probably gnn_at_ has a better idea of what's going on here (CCed).

Thanks.
Received on Mon Nov 07 2016 - 01:25:51 UTC
