Re: fxp0 and vlan panic

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Tue, 8 Feb 2005 14:23:11 -0500
On Tuesday 08 February 2005 02:06 pm, Doug White wrote:
> On Tue, 8 Feb 2005, Gavin Atkinson wrote:
> > Hey,
> >
> > There's an easily reproducible panic involving configuring vlans on fxp
> > cards.  I've recreated it in single user mode on a top-of-tree -CURRENT
> > machine as well as on a 5.3-STABLE machine.
>
> This is a WITNESS warning, not a panic. Unless you have witness panics
> enabled, which is strongly not recommended unless you're actively
> debugging the issue.

Umm, it's a panic, too.  This warning now gets triggered by witness when we 
get a page fault while holding a lock.  I've thought about just forcing all 
such page faults to be fatal via a patch like this to avoid the spurious LOR:

Index: trap.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/i386/trap.c,v
retrieving revision 1.270
diff -u -r1.270 trap.c
--- trap.c      16 Nov 2004 20:42:31 -0000      1.270
+++ trap.c      19 Jan 2005 22:44:18 -0000
@@ -238,7 +238,7 @@
                 * to the debugger.
                 */
                eva = rcr2();
-               if (td->td_critnest == 0)
+               if (td->td_critnest == 0 && td->td_sleeplocks == NULL)
                        enable_intr();
                else
                        trap_fatal(&frame, eva);
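
To make the effect of that one-line change easier to see, here is a minimal userland sketch of the predicate it introduces. It is only an illustration: struct thread_model and fault_may_proceed() are hypothetical names, not kernel code, and the two fields just mirror the td_critnest and td_sleeplocks members the patch tests.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the two struct thread members the patch
 * looks at.  This is not the kernel's struct thread.
 */
struct thread_model {
	int	 td_critnest;	/* critical section nesting depth */
	void	*td_sleeplocks;	/* non-NULL while held sleep locks are recorded */
};

/*
 * Models the patched condition: returns true when the fault handler
 * would re-enable interrupts and carry on, false when it would go
 * straight to trap_fatal().
 */
static bool
fault_may_proceed(const struct thread_model *td)
{
	return (td->td_critnest == 0 && td->td_sleeplocks == NULL);
}
```

With the original condition only the critical section case is fatal; with the patched one, a fault taken while any sleep lock is recorded as held is fatal as well, so the fault path is never entered with locks held.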

> > Enter full pathname of shell or RETURN for /bin/sh:
> > # ifconfig vlan0 create
> > # ifconfig vlan0 vlan 123 vlandev fxp0
> > # ifconfig vlan0 inet 1.2.3.4
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address   = 0x0
> > fault code              = supervisor write, page not present
> > instruction pointer     = 0x8:0xc051e966
> > stack pointer           = 0x10:0xcbdf390c
> > frame pointer           = 0x10:0xcbdf3918
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 56 (ifconfig)
> > [thread pid 56 tid 100043 ]
> > Stopped at      fxp_mc_setup+0x62:      movw    $0,0(%eax)
> > db>
> > db> tr
> > Tracing pid 56 tid 100043 td 0xc1594450
> > fxp_mc_setup(c15f6000) at fxp_mc_setup+0x62
> > fxp_ioctl(c15f6000,80206931,0) at fxp_ioctl+0x112
> > if_addmulti(c15f6000,cbdf3980,cbdf397c,c1667d48,cbdf3988) at if_addmulti+0x223
> > vlan_setmulti(c1667c40,cbdf39fc,c060a5d5,c088cd80,40) at vlan_setmulti+0x139
> > vlan_ioctl(c1733800,80206931,0) at vlan_ioctl+0x3e
> > if_addmulti(c1733800,cbdf3a4c,cbdf3a48,cbdf3a4c,1c) at if_addmulti+0x223
> > in6_addmulti(cbdf3a9c,c1733800,cbdf3a94) at in6_addmulti+0x4c
> > in6_update_ifa(c1733800,cbdf3b9c,0) at in6_update_ifa+0x4ce
> > in6_ifattach_linklocal(c1733800,0) at in6_ifattach_linklocal+0xe5
> > in6_ifattach(c1733800,0,8040691a,8040691a,0) at in6_ifattach+0xa9
> > in6_if_up(c1733800) at in6_if_up+0x13
> > ifioctl(c173da60,8040691a,c1667dc0,c1594450,0) at ifioctl+0x1f8
> > soo_ioctl(c1724708,8040691a,c1667dc0,c14b9780,c1594450) at soo_ioctl+0x2db
> > ioctl(c1594450,cbdf3d14,3,2,282) at ioctl+0x370
> > syscall(2f,2f,2f,80543a0,1) at syscall+0x213
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x280c44f3, esp = 0xbfbfe5cc, ebp = 0xbfbfee18 ---
> >
> > fxp_mc_setup+0x62 seems to correspond to the following code in
> > sys/dev/fxp/if_fxp.c: (line 2554)
> >
> >
> >                 /*
> >                  * Add a NOP command with interrupt so that we are
> >                  * notified when all TX commands have been processed.
> >                  */
> >                 txp = sc->fxp_desc.tx_last->tx_next;
> >                 txp->tx_mbuf = NULL;
> > -->             txp->tx_cb->cb_status = 0;
> >                 txp->tx_cb->cb_command = htole16(FXP_CB_COMMAND_NOP |
> >                     FXP_CB_COMMAND_S | FXP_CB_COMMAND_I);
> >
> > txp->tx_cb is NULL at this point.  This seems to be because fxp_init()
> > has never been called. (both validated by instrumenting the code in
> > question)
> >
> > Note also that the panic does not seem to occur if you do anything with
> > fxp0 before doing something with the vlans.  For example, assigning it
> > an address, or even just bringing it up seems to prevent the panic.
> >
> > In this situation, where should fxp_init be called from?  Presumably
> > it's not the responsibility of the vlan code - as when it gets called we
> > could already be using the interface and reinitialising it wouldn't be a
> > nice thing to do.  But then, what should be initialising it?
> >
> > And as an aside, is the detour via inet6 correct for what is entirely
> > inet4?

Good analysis, but I don't have any answers for you. :(  Try bugging rwatson_at_ 
maybe? :-)
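
For anyone following along, the failure described above (tx_cb still NULL because the ring was never set up) can be modelled outside the kernel. The sketch below is not a proposed fix and is not the driver's code: struct cb_model, struct tx_model, MODEL_CB_NOP and queue_nop_checked() are hypothetical names that only borrow the field names from the quoted snippet. It shows the store that faults and what a defensive check in front of it would look like.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder value; not the real FXP_CB_COMMAND_* bits. */
#define	MODEL_CB_NOP	0x0001

/* Hypothetical stand-ins that reuse the field names from the snippet. */
struct cb_model {
	uint16_t	 cb_status;	/* at offset 0, matching the fault address */
	uint16_t	 cb_command;
};

struct tx_model {
	struct tx_model	*tx_next;
	void		*tx_mbuf;
	struct cb_model	*tx_cb;		/* NULL until the ring is initialised */
};

/*
 * Queue a NOP on the descriptor after tx_last.  Returns 0 on success
 * and -1 when the ring has not been initialised, which is the case
 * that was dereferenced unchecked in the trace.
 */
static int
queue_nop_checked(struct tx_model *tx_last)
{
	struct tx_model *txp;

	if (tx_last == NULL || tx_last->tx_next == NULL)
		return (-1);
	txp = tx_last->tx_next;
	if (txp->tx_cb == NULL)
		return (-1);
	txp->tx_mbuf = NULL;
	txp->tx_cb->cb_status = 0;	/* the store that faulted at 0x0 */
	txp->tx_cb->cb_command = MODEL_CB_NOP;
	return (0);
}
```

A check like this would only hide the symptom. The open question of who is supposed to initialise the interface before the multicast setup runs still stands.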

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Tue Feb 08 2005 - 18:22:18 UTC
