Re: somewhat reproducible vimage panic

From: John-Mark Gurney <jmg_at_funkthat.com>
Date: Wed, 22 Jul 2020 15:15:09 -0700
Bjoern A. Zeeb wrote this message on Wed, Jul 22, 2020 at 20:43 +0000:
> On 22 Jul 2020, at 19:34, John-Mark Gurney wrote:
> 
> > John-Mark Gurney wrote this message on Tue, Jul 21, 2020 at 23:05 -0700:
> >> Peter Libassi wrote this message on Wed, Jul 22, 2020 at 06:54 +0200:
> >>> Is this related to
> >>>
> >>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234985 and
> >>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238326
> >>
> >> Definitely not 234985..  I'm using ue interfaces, and so they don't
> >> get destroyed while the jail is going away...
> >>
> >> I don't think it's 238326 either.  This is 100% reliable and it's in
> >> the IP multicast code..  It looks like in_multi isn't holding an
> >> interface or address lock waiting for things to free up...
> >
> > Did a little more poking, and it looks like the vnet is free'd before
> > the ifnet is free'd causing this problem:
> > (kgdb) print inm->inm_ifp[0].if_refcount
> > $5 = 1
> > (kgdb) print inm->inm_ifp[0].if_vnet[0]
> > $6 = {vnet_le = {le_next = 0xdeadc0dedeadc0de, le_prev = 0xdeadc0dedeadc0de},
> >   vnet_magic_n = 3735929054, vnet_ifcnt = 3735929054,
> >   vnet_sockcnt = 3735929054, vnet_state = 3735929054,
> >   vnet_data_mem = 0xdeadc0dedeadc0de, vnet_data_base = 16045693110842147038,
> >   vnet_shutdown = 222}
> >
> > So the multicast code is fine, it holds and releases a reference to
> > ifnet..
> >
> > The issue is that the reference to the ifnet doesn't involve a
> > reference to the vnet/prison.
> 
> Does it need to?  The ifnet cannot go away while something holds a 
> reference to it, right?

It's the other way around that's the problem.. the ifnet is holding an
invalid vnet pointer that got free'd.
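
For illustration only: the decimal fields in the kgdb output quoted above are
all the same fill pattern as the 0xdeadc0de pointers, which is what suggests
the vnet memory was already freed. A minimal sketch in C that checks this,
assuming 0xdeadc0de is the pattern the debug kernel writes over freed memory;
the helper names are hypothetical and not part of any kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fill pattern seen in the dump; assumed to mark freed memory. */
#define POISON32 UINT32_C(0xdeadc0de)

/* True if a 32-bit field holds exactly the fill pattern. */
static bool
is_poison32(uint32_t v)
{
	return (v == POISON32);
}

/* True if a 64-bit field is the fill pattern repeated twice. */
static bool
is_poison64(uint64_t v)
{
	return (v == (((uint64_t)POISON32 << 32) | POISON32));
}

/* True if a single byte matches any byte of the fill pattern. */
static bool
is_poison_byte(uint8_t b)
{
	for (int shift = 0; shift < 32; shift += 8) {
		if (b == (uint8_t)(POISON32 >> shift))
			return (true);
	}
	return (false);
}
```

With these helpers, 3735929054 (vnet_magic_n and friends) passes
is_poison32(), 16045693110842147038 (vnet_data_base) passes is_poison64(),
and 222 (vnet_shutdown) is 0xde, one byte of the pattern.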

Maybe the problem isn't the tear down, but that the vnet pointer isn't
changed/restored before the free?
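
A toy model of that idea, not the actual kernel code: an interface keeps its
own reference count plus an uncounted back pointer to its vnet, and the vnet
teardown repoints every interface it still has at a surviving "home" vnet
before freeing itself. All names here (toy_vnet, toy_ifnet, toy_vnet_attach,
toy_vnet_destroy) are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_IFS 4

struct toy_vnet;

/* Stand-in for an interface: refcounted, with a back pointer. */
struct toy_ifnet {
	int refcount;		/* references held on the interface */
	struct toy_vnet *vnet;	/* NOT a counted reference */
};

/* Stand-in for a vnet: tracks the interfaces attached to it. */
struct toy_vnet {
	struct toy_ifnet *ifs[TOY_MAX_IFS];
	int ifcnt;
};

/* Attach ifp to vn and update the back pointer; -1 if vn is full. */
static int
toy_vnet_attach(struct toy_vnet *vn, struct toy_ifnet *ifp)
{
	assert(vn != NULL && ifp != NULL);
	if (vn->ifcnt >= TOY_MAX_IFS)
		return (-1);
	vn->ifs[vn->ifcnt++] = ifp;
	ifp->vnet = vn;
	return (0);
}

/*
 * Tear down vn (which must be heap-allocated).  Interfaces that are
 * still attached get their back pointer moved to home first, so none
 * of them is left pointing at freed memory.  If home is full, the
 * back pointer is cleared rather than left dangling.
 */
static void
toy_vnet_destroy(struct toy_vnet *vn, struct toy_vnet *home)
{
	for (int i = 0; i < vn->ifcnt; i++) {
		struct toy_ifnet *ifp = vn->ifs[i];

		if (toy_vnet_attach(home, ifp) != 0)
			ifp->vnet = NULL;
	}
	free(vn);
}
```

The point of the sketch is only the ordering: the back pointer is fixed up
while the old vnet is still valid, so a reference held on the interface never
outlives the memory its vnet pointer refers to.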

> Sounds more like the teardown order is wrong (again)?
> 
> There should be no more multicast when IP etc. is gone.  That means MC 
> doesn't properly cleanup itself.

Don't know, just know that it's easy to trigger right now...  I haven't
tested on prior releases, but if you'd like me to, it isn't too hard for
me to test...

> I guess I should go back now and re-read your original problem statement 
> on how you trigger this..

So, it's pretty easy to trigger, just attach a couple USB ethernet
adapters, in my case, they were ure, but likely any two spare ethernet
interfaces will work, and wire them back to back..

Run the script attached earlier in the thread, providing it the name
of the two interfaces as arguments, and run it a few times.  You might
get failures or not.  It shouldn't matter.  After a few runs, it'll
panic...

I just tested this (to make sure my ure changes weren't causing additional
problems) using
FreeBSD-13.0-CURRENT-amd64-20200625-r362596-memstick.img.xz, so it's
reproducible on a stock image.

Thanks for looking into this!

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
Received on Wed Jul 22 2020 - 20:15:17 UTC
