Re: -CURRENT Panic at boot at Revision: 237264 "mutex gif softc not owned at /usr/src/sys/netinet/in_gif.c:105"

From: John Baldwin <jhb_at_freebsd.org>
Date: Thu, 21 Jun 2012 14:34:36 -0400
On Thursday, June 21, 2012 12:41:59 pm Vincent Hoffman wrote:
> Hi again,
> The 2nd patch (to if.h and if_gif.c) also fixes the panic on boot.
> Thanks for the quick response as ever.

Great, thanks for testing!  Randall, do you have any thoughts on these 
patches?

> Vince
> 
> 
> On 20/06/2012 13:12, John Baldwin wrote:
> > On Tuesday, June 19, 2012 8:05:36 pm Vincent Hoffman wrote:
> >> Full dump info at http://unsane.co.uk/crash
> >> It seems to have popped up between r236905 (working kernel) and r237264
> >> (this panic)
> >>
> >> The gif config I have in rc.conf is for an HE IPv6 tunnel.
> > Looks like this was broken in r236951 by Randall (cc'd).
> >
> > I think this would fix it:
> >
> > Index: if_gif.c
> > ===================================================================
> > --- if_gif.c	(revision 237227)
> > +++ if_gif.c	(working copy)
> > @@ -366,11 +366,12 @@ gif_start(struct ifnet *ifp)
> >  		return;
> >  	}
> >  	ifp->if_drv_flags |= IFF_DRV_OACTIVE;
> > -	GIF_UNLOCK(sc);
> >  keep_going:
> >  	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
> >  
> > +		GIF_UNLOCK(sc);
> >  		IFQ_DRV_DEQUEUE(&ifp->if_snd, m);
> > +		GIF_LOCK(sc);
> >  		if (m == 0)
> >  			break;
> >  
> > @@ -424,14 +425,12 @@ keep_going:
> >  			ifp->if_oerrors++;
> >  
> >  	}
> > -	GIF_LOCK(sc);
> >  	if (ifp->if_drv_flags & IFF_GIF_WANTED) {
> >  		/* Someone did a start while
> >  		 * we were unlocked and processing
> >  		 * lets clear the flag and try again.
> >  		 */
> >  		ifp->if_drv_flags &= ~IFF_GIF_WANTED;
> > -		GIF_UNLOCK(sc);
> >  		goto keep_going;
> >  	}
> >  	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
> >
> > However, unless there is a known LOR, I would be inclined to
> > just hold the lock across IFQ_DRV_DEQUEUE() and dispense with
> > all the 'keep_going', etc. logic.  Other NIC drivers tend to
> > just hold their transmit lock for the entire loop in their
> > start routines.
> >
> > That would look like this:
> >
> > Index: if.h
> > ===================================================================
> > --- if.h	(revision 237227)
> > +++ if.h	(working copy)
> > @@ -153,7 +153,6 @@
> >  #define	IFF_STATICARP	0x80000		/* (n) static ARP */
> >  #define	IFF_DYING	0x200000	/* (n) interface is winding down */
> >  #define	IFF_RENAMING	0x400000	/* (n) interface is being renamed */
> > -#define IFF_GIF_WANTED	0x1000000	/* (n) The gif tunnel is wanted */
> >  /*
> >   * Old names for driver flags so that user space tools can continue to use
> >   * the old (portable) names.
> > Index: if_gif.c
> > ===================================================================
> > --- if_gif.c	(revision 237227)
> > +++ if_gif.c	(working copy)
> > @@ -359,15 +359,7 @@
> >  
> >  	sc = ifp->if_softc;
> >  	GIF_LOCK(sc);
> > -	if (ifp->if_drv_flags & IFF_DRV_OACTIVE) {
> > -		/* Already active */
> > -		ifp->if_drv_flags |= IFF_GIF_WANTED;
> > -		GIF_UNLOCK(sc);
> > -		return;
> > -	}
> >  	ifp->if_drv_flags |= IFF_DRV_OACTIVE;
> > -	GIF_UNLOCK(sc);
> > -keep_going:
> >  	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
> >  
> >  		IFQ_DRV_DEQUEUE(&ifp->if_snd, m);
> > @@ -424,16 +416,6 @@
> >  			ifp->if_oerrors++;
> >  
> >  	}
> > -	GIF_LOCK(sc);
> > -	if (ifp->if_drv_flags & IFF_GIF_WANTED) {
> > -		/* Someone did a start while
> > -		 * we were unlocked and processing
> > -		 * lets clear the flag and try again.
> > -		 */
> > -		ifp->if_drv_flags &= ~IFF_GIF_WANTED;
> > -		GIF_UNLOCK(sc);
> > -		goto keep_going;
> > -	}
> >  	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
> >  	GIF_UNLOCK(sc);
> >  	return;
> >
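For illustration only, here is a minimal userland sketch of the pattern the second patch relies on: take the lock once, drain the queue, and release it at the end, so that an output routine asserting lock ownership never fires. The names toy_softc, toy_enqueue, toy_output and toy_start are hypothetical stand-ins, a pthread mutex replaces the kernel mutex, and the `locked` field only imitates what mtx_assert() checks. None of this is the actual if_gif.c code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define	QLEN	8

/* Hypothetical stand-in for a driver softc; not the real struct gif_softc. */
struct toy_softc {
	pthread_mutex_t	lock;
	int		locked;		/* nonzero while toy_start() holds lock */
	int		queue[QLEN];	/* toy send queue of packet ids */
	size_t		head, tail;
	int		sent;		/* packets passed to toy_output() */
};

static void
toy_init(struct toy_softc *sc)
{
	pthread_mutex_init(&sc->lock, NULL);
	sc->locked = 0;
	sc->head = sc->tail = 0;
	sc->sent = 0;
}

/* Append a packet id; returns 1 on success, 0 if the toy queue is full. */
static int
toy_enqueue(struct toy_softc *sc, int pkt)
{
	int ok = 0;

	pthread_mutex_lock(&sc->lock);
	if (sc->tail < QLEN) {
		sc->queue[sc->tail++] = pkt;
		ok = 1;
	}
	pthread_mutex_unlock(&sc->lock);
	return (ok);
}

/* Plays the role of an output routine that asserts the lock is held. */
static void
toy_output(struct toy_softc *sc, int pkt)
{
	(void)pkt;
	assert(sc->locked);
	sc->sent++;
}

/* Hold the lock across the whole dequeue loop; no restart logic needed. */
static void
toy_start(struct toy_softc *sc)
{
	pthread_mutex_lock(&sc->lock);
	sc->locked = 1;
	while (sc->head < sc->tail)
		toy_output(sc, sc->queue[sc->head++]);
	sc->locked = 0;
	pthread_mutex_unlock(&sc->lock);
}
```

Calling toy_enqueue() twice and then toy_start() once drains the queue with the assertion in toy_output() holding for every packet.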
> > I would prefer this latter patch if it is ok as it makes the code simpler.
> > Also, IFF_GIF_WANTED as a new iff flag seems really hackish.  IFF_* flags
> > are supposed to be interface independent.  A flag like that should be in a
> > private field in the gif softc, not something exposed to the entire system.
> >
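To make the point about a private field concrete, a driver-private flag could live in the softc along these lines. This is a hedged sketch, not the real struct gif_softc: toy_ifnet, toy_gif_softc, the sc_flags field and TOY_GIF_WANTED are all hypothetical names, chosen only to show that the bit never touches the shared interface flag word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared, interface-independent flag word. */
struct toy_ifnet {
	uint32_t	if_drv_flags;
};

/* Driver-private flag: its value only has meaning inside this driver. */
#define	TOY_GIF_WANTED	0x1u

/* Hypothetical softc; the real struct gif_softc has different members. */
struct toy_gif_softc {
	struct toy_ifnet	*sc_ifp;
	uint32_t		sc_flags;	/* private driver state */
};

/* Record that another start was requested while one was in progress. */
static void
toy_mark_wanted(struct toy_gif_softc *sc)
{
	assert(sc != NULL);
	sc->sc_flags |= TOY_GIF_WANTED;
}

/* Return 1 and clear the flag if a restart was requested, else return 0. */
static int
toy_test_and_clear_wanted(struct toy_gif_softc *sc)
{
	assert(sc != NULL);
	if ((sc->sc_flags & TOY_GIF_WANTED) == 0)
		return (0);
	sc->sc_flags &= ~TOY_GIF_WANTED;
	return (1);
}
```

Because the bit lives in sc_flags, it consumes no value in the global IFF_* namespace and the interface flag word stays untouched.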
> >> cloned_interfaces="gif0"
> >> ifconfig_gif0="tunnel 85.233.185.162 216.66.80.26"
> >> ifconfig_gif0_ipv6="inet6 2001:470:1f08:110::2 2001:470:1f08:110::1
> >> prefixlen 128 -accept_rtadv"
> >>
> >> src.conf only has
> >> WITHOUT_IPFILTER=true
> >> WITHOUT_KERBEROS=true
> >> WITHOUT_PROFILE=yes
> >>
> >> Happy to provide any more info as needed. any suggestions welcome, I'll
> >> see if I can track it further with a binary search tomorrow.
> >>
> >>
> >> From dump info file (at above URL)
> >> #0  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266
> >> 266             if (textdump && textdump_pending) {
> >> (kgdb) #0  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266
> >> #1  0xffffffff80314740 in db_dump (dummy=Variable "dummy" is not available.
> >> )
> >>     at /usr/src/sys/ddb/db_command.c:538
> >> #2  0xffffffff80313d31 in db_command (last_cmdp=0xffffffff80c52b40,
> >> cmd_table=Variable "cmd_table" is not available.
> >>
> >> ) at /usr/src/sys/ddb/db_command.c:449
> >> #3  0xffffffff80313f80 in db_command_loop ()
> >>     at /usr/src/sys/ddb/db_command.c:502
> >> #4  0xffffffff803160d9 in db_trap (type=Variable "type" is not available.
> >> ) at /usr/src/sys/ddb/db_main.c:231
> >> #5  0xffffffff80590918 in kdb_trap (type=3, code=0, tf=0xffffff80ea22ee20)
> >>     at /usr/src/sys/kern/subr_kdb.c:654
> >> #6  0xffffffff80815c9d in trap (frame=0xffffff80ea22ee20)
> >>     at /usr/src/sys/amd64/amd64/trap.c:573
> >> #7  0xffffffff807ffe63 in calltrap ()
> >>     at /usr/src/sys/amd64/amd64/exception.S:228
> >> #8  0xffffffff8059039b in kdb_enter (why=0xffffffff808fac8a "panic",
> >>     msg=0x80 <Address 0x80 out of bounds>) at cpufunc.h:63
> >> #9  0xffffffff805581f1 in panic (fmt=Variable "fmt" is not available.
> >> )
> >>     at /usr/src/sys/kern/kern_shutdown.c:628
> >> #10 0xffffffff805454ec in _mtx_assert (m=Variable "m" is not available.
> >> )
> >>     at /usr/src/sys/kern/kern_mutex.c:747
> >> #11 0xffffffff8067bcf6 in in_gif_output (ifp=0xfffffe0005e28000, family=28,
> >>     m=0xfffffe0005ff8300) at /usr/src/sys/netinet/in_gif.c:105
> >> #12 0xffffffff8061d6a2 in gif_start (ifp=0xfffffe0005e28000)
> >>     at /usr/src/sys/net/if_gif.c:411
> >> #13 0xffffffff8061cbd4 in gif_output (ifp=0xfffffe0005e28000, m=Variable
> >> "m" is not available.
> >> )
> >>     at /usr/src/sys/net/if_gif.c:540
> >> #14 0xffffffff807290c7 in nd6_output_lle (ifp=0xfffffe0005e28000,
> >>     origifp=0xfffffe0005e28000, m0=0xfffffe0005ff8300,
> >>     dst=0xffffff80ea22f56c, rt0=Variable "rt0" is not available.
> >> ) at /usr/src/sys/netinet6/nd6.c:2079
> >> #15 0xffffffff807292f8 in nd6_output (ifp=Variable "ifp" is not available.
> >> )
> >>     at /usr/src/sys/netinet6/nd6.c:1824
> >> #16 0xffffffff80723171 in ip6_output (m0=Variable "m0" is not available.
> >> )
> >>     at /usr/src/sys/netinet6/ip6_output.c:1021
> >> #17 0xffffffff8072cf9f in nd6_ns_output (ifp=0xfffffe0005e28000,
> >> daddr6=0x0,
> >>     taddr6=0xfffffe0005300318, ln=Variable "ln" is not available.
> >> ) at /usr/src/sys/netinet6/nd6_nbr.c:593
> >> #18 0xffffffff8072d801 in nd6_dad_start (ifa=0xfffffe0005300200, delay=0)
> >>     at /usr/src/sys/netinet6/nd6_nbr.c:1298
> >> #19 0xffffffff80710448 in in6_update_ifa (ifp=0xfffffe0005e28000,
> >>     ifra=0xfffffe00812c8b00, ia=0xfffffe0005300200, flags=Variable
> >> "flags" is not available.
> >> )
> >>     at /usr/src/sys/netinet6/in6.c:1298
> >> #20 0xffffffff80711658 in in6_control (so=0xfffffe00810c5aa0,
> >> cmd=2156423451,
> >>     data=0xfffffe00812c8b00 "gif0", ifp=0xfffffe0005e28000,
> >>     td=0xfffffe0005009000) at /usr/src/sys/netinet6/in6.c:654
> >> #21 0xffffffff806181f6 in ifioctl (so=0xfffffe00810c5aa0, cmd=2156423451,
> >>     data=0xfffffe00812c8b00 "gif0", td=0xfffffe0005009000)
> >>     at /usr/src/sys/net/if.c:2540
> >> #22 0xffffffff805aa0dd in kern_ioctl (td=Variable "td" is not available.
> >> ) at file.h:287
> >> #23 0xffffffff805aa37d in sys_ioctl (td=0xfffffe0005009000,
> >>     uap=0xffffff80ea22fb70) at /usr/src/sys/kern/sys_generic.c:691
> >> #24 0xffffffff80814a34 in amd64_syscall (td=0xfffffe0005009000, traced=0)
> >>     at subr_syscall.c:135
> >> #25 0xffffffff80800147 in Xfast_syscall ()
> >>     at /usr/src/sys/amd64/amd64/exception.S:387
> >> #26 0x0000000801183d0c in ?? ()
> >> Previous frame inner to this frame (corrupt stack?)
> >> (kgdb)
> >>
> >>
> >>
> >>
> 
> 
> 

-- 
John Baldwin
Received on Thu Jun 21 2012 - 18:42:05 UTC
