Re: Call for bfe(4) testers.

From: Pyun YongHyeon <pyunyh_at_gmail.com>
Date: Mon, 4 Aug 2008 10:02:05 +0900
On Sun, Aug 03, 2008 at 12:56:27PM +0200, Ulrich Spoerlein wrote:
 > On Sun, 03.08.2008 at 17:17:30 +0900, Pyun YongHyeon wrote:
 > > On Sat, Aug 02, 2008 at 11:28:30AM +0200, Ulrich Spoerlein wrote:
 > > > On Wed, 30.07.2008 at 20:34:49 +0900, Pyun YongHyeon wrote:
 > > > > If you have bfe(4) hardware, please give it a try and let me
 > > > > know how it goes. The latest bfe(4) can be found at the following
 > > > > URL.
 > > > > http://people.freebsd.org/~yongari/bfe/if_bfe.c
 > > > > http://people.freebsd.org/~yongari/bfe/if_bfereg.h
 > > > 
 > > > Hi Pyun,
 > > > 
 > > > I recompiled a fresh RELENG_7 kernel with the files above, and it led to
 > > > a panic, shortly after the system was up. I can't provide you with a
 > > > dump for now, but here's the handwritten backtrace:
 > > > 
 > > > last dmesg: no toe capability of 0xc421d800
 > > > trace:
 > > >         device_is_attached
 > > >         cf_set_method
 > > >         cpufreq_curr_sysctl
 > > >         sysctl_root
 > > >         userland_sysctl
 > > > 
 > > 
 > > I can't reproduce this and the backtrace does not seem to be
 > > related to bfe(4). Did bfe(4) spew any error messages?
 > 
 > Not directly, there are lots of
 > 
 > no toe capability on 0xc422b000
 > no toe capability on 0xc422b000
 > no toe capability on 0xc40abc00
 > no toe capability on 0xc422b000
 > no toe capability on 0xc40abc00
 > no toe capability on 0xc422b000
 > no toe capability on 0xc40abc00
 > no toe capability on 0xc422b000
 > no toe capability on 0xc40abc00
 > no toe capability on 0xc422b000
 > no toe capability on 0xc40abc00
 > no toe capability on 0xc40abc00
 > 
 > messages, but they don't seem the culprit. The stats sysctl also works

I think kmacy_at_ fixed this. Please update again.

 > fine, here's a sample output:
 > bfe0 statistics:
 > Transmit good octets : 202779
 > Transmit good frames : 1884
 > Transmit octets : 202779
 > Transmit frames : 1884
 > Transmit broadcast frames : 4
 > Transmit multicast frames : 4
 > Transmit frames 64 bytes : 28
 > Transmit frames 65 to 127 bytes : 1741
 > Transmit frames 128 to 255 bytes : 63
 > Transmit frames 256 to 511 bytes : 52
 > Transmit frames 512 to 1023 bytes : 0
 > Transmit frames 1024 to max bytes : 0
 > Transmit jabber errors : 0
 > Transmit oversized frames : 0
 > Transmit fragmented frames : 0
 > Transmit underruns : 0
 > Transmit total collisions : 0
 > Transmit single collisions : 0
 > Transmit multiple collisions : 0
 > Transmit excess collisions : 0
 > Transmit late collisions : 0
 > Transmit deferrals : 0
 > Transmit carrier losts : 0
 > Transmit pause frames : 0
 > Receive good octets : 210840
 > Receive good frames : 1465
 > Receive octets : 210840
 > Receive frames : 1465
 > Receive broadcast frames : 0
 > Receive multicast frames : 0
 > Receive frames 64 bytes : 1
 > Receive frames 65 to 127 bytes : 1289
 > Receive frames 128 to 255 bytes : 122
 > Receive frames 256 to 511 bytes : 17
 > Receive frames 512 to 1023 bytes : 0
 > Receive frames 1024 to max bytes : 36
 > Receive jabber errors : 0
 > Receive oversized frames : 0
 > Receive fragmented frames : 0
 > Receive missed frames : 0
 > Receive CRC align errors : 0
 > Receive undersized frames : 0
 > Receive CRC errors : 0
 > Receive align errors : 0
 > Receive symbol errors : 0
 > Receive pause frames : 0
 > Receive control frames : 0
 > 
 > I have this device:
 > dev.bfe.0.%desc: Broadcom BCM4401 Fast Ethernet
 > dev.bfe.0.%driver: bfe
 > dev.bfe.0.%location: slot=0 function=0
 > dev.bfe.0.%pnpinfo: vendor=0x14e4 device=0x4401 subvendor=0x1028 subdevice=0x8127 class=0x020000
 > dev.bfe.0.%parent: pci2
 > dev.miibus.0.%parent: bfe0
 > 
 > 
 > bfe(4) even works for some minutes, then the machine panics because of
 > powerd(8) (????)
 > 
 > Fatal trap 12: page fault while in kernel mode
 > cpuid = 0; apic id = 00
 > fault virtual address   = 0x38
 > fault code              = supervisor read, page not present
 > instruction pointer     = 0x20:0xc058ec16
 > stack pointer           = 0x28:0xfb7b6ac8
 > frame pointer           = 0x28:0xfb7b6ac8
 > code segment            = base 0x0, limit 0xfffff, type 0x1b
 >                         = DPL 0, pres 1, def32 1, gran 1
 > processor eflags        = interrupt enabled, resume, IOPL = 0
 > current process         = 1327 (powerd)
 > 

From this and the fault address 0x38 above, it seems cpufreq(4)
dereferenced a NULL pointer. It looks like powerd(8) tried to set the
CPU frequency and hit a page fault. A full backtrace would be a
great help.
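
To illustrate why a fault address as small as 0x38 points at a NULL
pointer: reading a structure member through a NULL base pointer makes
the CPU access address 0 plus the member's offset. The sketch below is
only an illustration. struct fake_device, its padding, and both helper
functions are hypothetical and are not taken from the FreeBSD sources;
the layout is padded so that the member lands at offset 0x38.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real kernel structure: the padding
 * stands in for earlier members so that 'state' sits at offset 0x38.
 */
struct fake_device {
	unsigned char	pad[0x38];	/* stand-in for earlier members */
	int		state;		/* member the faulting code reads */
};

/*
 * Address the CPU would access for base->state, computed without
 * dereferencing anything.
 */
static uintptr_t
member_address(uintptr_t base)
{
	return (base + offsetof(struct fake_device, state));
}

/*
 * Heuristic (an assumption, not a rule): a fault address below one
 * 4 KB page is what a NULL base plus a small member offset looks like.
 */
static int
looks_like_null_deref(uintptr_t fault_addr)
{
	return (fault_addr < 4096);
}
```

With a NULL base, member_address(0) gives 0x38, which matches the
"fault virtual address" in the trap message. The instruction pointer
0xc058ec16 is a normal kernel text address and would not pass the
heuristic.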

 > (the backtrace didn't make it into the text minidump)

Hmm, this looks like a different issue.

 > 
 > > > If you need more info, I'll crash the machine again, I'm also happy to
 > > > test further patches.
 > > 
 > > Would you enable DDB/KDB in the kernel and get a backtrace again?
 > 
 > I should have set this up correctly now, but I think the issue is
 > somewhere else. I'll recompile a clean kernel and test this one first.
 > 

I'm not familiar with cpufreq(4), but jhb_at_ may be able to help you.

-- 
Regards,
Pyun YongHyeon
Received on Sun Aug 03 2008 - 23:04:17 UTC
