You guys should really look at Samy Bahra's epoch based reclamation. I solved a similar problem in drm/linuxkpi using it. -M ---- On Tue, 28 Jun 2016 02:58:56 -0700 Julien Charbon <jch_at_freebsd.org> wrote ---- > > Hi Randall, > > On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote: > > Ok > > > > Lets try this again with my source changed to my _at_freebsd.net :-) > > > > Now I am also attaching a patch for you Gleb, this will take some poking to > > get in to your NF-head since it incorporates some changes we made earlier. > > > > I think this will fix the problem.. i.e. dealing with two locks in the callout system (which it was > > never meant to have done).. > > > > Note we probably can move the code to use the callout lock init now.. but lets see if this works > > on your setup on c096 and if so we can think about doing that. > > Thanks for proposing a patch. I believe your patch will work with > callout lock init, but not without: You still have a use-after-free > issue on the tcpcb without callout lock init. > > The case being subtle as usual, let me try to describe that could happen: > > With your patch we have: > > void > tcp_timer_keep(void *xtp) > { > struct tcpcb *tp = xtp; > struct tcptemp *t_template; > struct inpcb *inp; > CURVNET_SET(tp->t_vnet); > #ifdef TCPDEBUG > int ostate; > > ostate = tp->t_state; > #endif > inp = tp->t_inpcb; > KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, > tp)); > INP_WLOCK(inp); > if (callout_pending(&tp->t_timers->tt_keep) ### Use after free > of tp here > !callout_active(&tp->t_timers->tt_keep)) { > INP_WUNLOCK(inp); > CURVNET_RESTORE(); > return; > } > ... > > The use-after-free scenario: > > [CPU 1] the callout fires, tcp_timer_keep entered > [CPU 1] blocks on INP_WLOCK(inp); > [CPU 2] schedules tcp_timer_keep with callout_reset() > [CPU 2] tcp_discardcb called > [CPU 2] tcp_timer_keep callout successfully canceled > [CPU 2] tcpcb freed > [CPU 1] unblocks, the tcpcb is used > > Then the tcpcb will used just after being freed... Might also crash or > not depending in the case. > > Extra notes: > > o The invariant I see here is: The "callout successfully canceled" > step should never happen when "the callout is currently being executed". > > o Solutions I see to enforce this invariant: > > - First solution: Use callout lock init with inp lock, your patch > seems to permit that now. > > - Second solution: Change callout_async_drain() behavior: It can > return 0 (fail) when the callout is currently being executed (no matter > what). > > - Third solution: Don't trust callout_async_drain(callout) return > value of 1 (success) if the previous call of callout_reset(callout) > returned 0 (fail). That was the exact purpose of r284261 change, but > this solution is also step backward in modernization of TCP > timers/callout... > > https://svnweb.freebsd.org/base/stable/10/sys/netinet/tcp_timer.c?r1=284261&r2=284260&pathrev=284261 > > Hopefully my description is clear enough... > > -- > Julien > >Received on Tue Jun 28 2016 - 15:51:44 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:06 UTC