Re: 6.0-BETA2: taskqueue_drain for if_xl.c:2796

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Wed, 17 Aug 2005 15:09:47 -0400
On Wednesday 17 August 2005 01:38 pm, Scott Long wrote:
> John Baldwin wrote:
> > On Tuesday 16 August 2005 01:21 pm, John Baldwin wrote:
> >>On Monday 15 August 2005 02:22 am, Scott Long wrote:
> >>>John Baldwin wrote:
> >>>>On Thursday 11 August 2005 07:07 pm, Scott Long wrote:
> >>>
> >>>I still don't see what the point is of having a function that only
> >>>removes pending tasks from the queue if it doesn't also drain in
> >>>progress tasks.  Simple example:
> >>>
> >>>CPU1 calls xl_stop.  The XL_LOCK gets taken.  At the same time, CPU2
> >>>calls taskqueue_swi, which pops the queue and calls xl_rxeof_task.  It
> >>>tries to take the XL_LOCK but instead sleeps (or adaptively spins)
> >>>because CPU1 already has it.  CPU1 calls taskqueue_stop, which doesn't
> >>>do anything because the task was alrady popped off.
> >>>
> >>>Now, at some point, CPU1 **MUST** drop the XL_LOCK and let CPU2 continue
> >>>that task to completion.  There simply is no way around that, right?  So
> >>>why muddle the API with a function that really doesn't solve any
> >>>problems.  The argument that being able to call taskqueue_stop with
> >>>locks held is a convenience doesn't hold water.  You'll still need to
> >>>drop locks in order to be sure that a mess isn't created.
> >>
> >>Well, you need to think through this longer.  During a normal stop, we
> >>don't care if the task runs later if it is just going to stop anyway,
> >> which it will with the IFF_DRV_RUNNING checks I added (which it needs
> >> anyway even if you don't add a taskqueue_stop()).  See, when CPU2
> >> eventually runs the xl_rxeof_task, it will drop the lock and return
> >> instantly because the xl_stop() function clears IFF_DRV_RUNNING in
> >> addition to calling taskqueue_stop().
> >
> > Actually, I thought through this longer some myself on the way home and
> > because of the RUNNING checks, the taskqueue_stop() in xl_stop() isn't
> > actually needed for correct operation.  It's merely an optimization for
> > the case that the task is sitting in the queue, but the code will work
> > fine without it with the taskqueue_drain() moved to xl_detach() and the
> > IFF_DRV_RUNNING checks added.  Thus, if you just really hate
> > taskqueue_stop() or don't think the optimization is worth it, I can leave
> > it out.  I kind of would like to call it in xl_detach() at least so that
> > detaching xl0 won't be dependent on another driver's task deciding to
> > yield.  However, if I moved it there I could add blocking to it.  I could
> > also just use the existing taskqueue_drain() and not worry about having
> > to wait for other tasks to finish.  I can't call taskqueue_drain() in
> > xl_stop() though since xl_stop() is called indirectly from xl_intr().
>
> You can't force detach while data is in flight, at least not right now.
> That is a motivation behind the recent ifnet changes, but it still has a
> long way to go.  And regardless of the rest of the stack being able to
> handle forced detach, the driver needs to be written in a way so that
> it's own resources aren't orphaned, which again means making sure that
> blocked threads are allowed to run to completion, which again means not
> having the luxury of being fast and loose with locks when trying to
> drain those tasks.

I'm not running "fast and loose" with locks per se.  Mostly I'm just 
optimizing the case where we stop the interface (i.e. detach or ifconfig 
down) and the task hasn't even started running yet so that it can just 
discard the task from the queue without running it rather than waiting until 
it gets a chance to run so it can immediately abort.

> I do think that taskqueue_stop() is a muddying of the API.  If anyone
> else has a comment on this (BDE?) I would be very interested to hear it.

*shrug*  It's already confusing in that callout_drain() does both a stop and 
then wait kind of like pthread_cancel() whereas taskqueue_drain() just waits 
until the task has run and finished without trying to cancel it like 
pthread_join().  I'd actually probably be inclined to rename callout_drain() 
to something else if we wanted to resolve that specific instance of confusion 
personally.

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Wed Aug 17 2005 - 17:11:23 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:41 UTC