On Thu, Sep 28, 2006 at 01:09:05PM +0200, Michiel Boland wrote:
> -CURRENT from 25 Sept. (if_em.c has rev 1.147)
>
> em, connected to cisco 2950 at 100 Mb full/duplex with TSO disabled.
>
> At high load, the card stopped passing network traffic. After I
> ifconfig-ed the interface down and up again, I got this panic.
>
> Obviously neither the network card malfunction or the panic are any good.
> I hope someone can figure out what's going on.
>
> Cheers
> Michiel
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x568
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0xc0464d9a
> stack pointer = 0x28:0xd3358c50
> frame pointer = 0x28:0xd3358c64
> code segment = base 0x0, limit 0xfffff, type 0x1b
>   = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 11 (swi4: clock sio)
> trap number = 12
> panic: page fault
> KDB: stack backtrace:
> kdb_backtrace(100,c20736c0,28,d3358c10,c,...) at kdb_backtrace+0x29
> panic(c063a952,c065b591,0,0,fffff,...) at panic+0xa8
> trap_fatal(d3358c10,568,c20736c0,c069d0a0,0,...) at trap_fatal+0x2b6
> trap_pfault(d3358c10,0,568) at trap_pfault+0x1cb
> trap(d3350008,c04f0028,c2150028,568,ad,...) at trap+0x38d
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc0464d9a, esp = 0xd3358c50, ebp = 0xd3358c64 ---
> em_txeof(c20f1000) at em_txeof+0x86
> em_watchdog(c2131000) at em_watchdog+0xa6
> if_slowtimo(0) at if_slowtimo+0x66
> softclock(0) at softclock+0x252
> ithread_execute_handlers(c2072b04,c2070500) at
> ithread_execute_handlers+0x125
> ithread_loop(c20426c0,d3358d38) at ithread_loop+0x54
> fork_exit(c04cea10,c20426c0,d3358d38) at fork_exit+0x7a
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xd3358d6c, ebp = 0 ---
> Uptime: 2d23h21m50s
> Physical memory: 505 MB
> Dumping 117 MB: 102 86 (CTRL-C to abort) 70 54 38 22 (CTRL-C to abort)
> (CTRL-C to abort) 6
>
> #0 doadump () at pcpu.h:166
> 166 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:166
> #1 0xc04e3ca4 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2 0xc04e3f6c in panic (fmt=0xc063a952 "%s") at
> /usr/src/sys/kern/kern_shutdown.c:565
> #3 0xc0616d0a in trap_fatal (frame=0xd3358c10, eva=1384) at
> /usr/src/sys/i386/i386/trap.c:867
> #4 0xc0616a2b in trap_pfault (frame=0xd3358c10, usermode=0, eva=1384) at
> /usr/src/sys/i386/i386/trap.c:776
> #5 0xc0616625 in trap (frame=
> {tf_fs = -751501304, tf_es = -1068564440, tf_ds = -1038811096, tf_edi
> = 1384, tf_esi = 173, tf_ebp = -751465372, tf_isp = -751465412,
> tf_ebx = -1038800176, tf_edx = -1039200256, tf_ecx = -865996036,
> tf_eax = 2768, tf_trapno = 12, tf_err = 0, tf_eip = -1069134438,
> tf_cs = 32, tf_eflags = 66054, tf_esp = -1038938112, tf_ss = 231}) at
> /usr/src/sys/i386/i386/trap.c:461
> #6 0xc060759a in calltrap () at /usr/src/sys/i386/i386/exception.s:138
> #7 0xc0464d9a in em_txeof (adapter=0xc20f1000) at
> /usr/src/sys/dev/em/if_em.c:2956
> #8 0xc0461ace in em_watchdog (ifp=0xc2131000) at
> /usr/src/sys/dev/em/if_em.c:963
> #9 0xc05576de in if_slowtimo (arg=0x0) at /usr/src/sys/net/if.c:1415
> #10 0xc04f1ac2 in softclock (dummy=0x0) at
> /usr/src/sys/kern/kern_timeout.c:271
> #11 0xc04ce955 in ithread_execute_handlers (p=0xc2072b04, ie=0xc2070500) at
> /usr/src/sys/kern/kern_intr.c:662
> #12 0xc04cea64 in ithread_loop (arg=0xc20426c0) at
> /usr/src/sys/kern/kern_intr.c:745
> #13 0xc04cd8b6 in fork_exit (callout=0xc04cea10 <ithread_loop>,
> arg=0xc20426c0, frame=0xd3358d38) at /usr/src/sys/kern/kern_fork.c:818
> #14 0xc06075fc in fork_trampoline () at
> /usr/src/sys/i386/i386/exception.s:199
> (kgdb) f 7
> #7 0xc0464d9a in em_txeof (adapter=0xc20f1000) at
> /usr/src/sys/dev/em/if_em.c:2956
> 2956 num_avail++;
> (kgdb) info locals
> i = 173
> num_avail = 231
> tx_buffer = (struct em_buffer *) 0x568
> tx_desc = (struct em_tx_desc *) 0xc2152ad0
> ifp = (struct ifnet *) 0xc2131000

As Jack said, I'm not sure how tx_buffer can have a bogus value.
Since the switch to adaptive polling in em(4), em_rxeof() runs without
locks held. If you force the interface down while em_rxeof() is in
progress, it would corrupt the softc/hardware state. That is only a
vague guess, since no other users have reported this kind of issue.

Removing the em_txeof() call from em_watchdog() may help in your case
(em_watchdog() will eventually reset the hardware), but I don't think
it's the correct fix for the root cause.

--
Regards,
Pyun YongHyeon

Received on Thu Sep 28 2006 - 22:06:34 UTC