Re: 6.0-BETA2: taskqueue_drain for if_xl.c:2796

From: Scott Long <scottl_at_samsco.org> Date: Wed, 17 Aug 2005 13:38:12 -0400 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:41 UTC

John Baldwin wrote:
> On Tuesday 16 August 2005 01:21 pm, John Baldwin wrote:
> 
>>On Monday 15 August 2005 02:22 am, Scott Long wrote:
>>
>>>John Baldwin wrote:
>>>
>>>>On Thursday 11 August 2005 07:07 pm, Scott Long wrote:
>>>
>>>I still don't see what the point is of having a function that only
>>>removes pending tasks from the queue if it doesn't also drain in
>>>progress tasks.  Simple example:
>>>
>>>CPU1 calls xl_stop.  The XL_LOCK gets taken.  At the same time, CPU2
>>>calls taskqueue_swi, which pops the queue and calls xl_rxeof_task.  It
>>>tries to take the XL_LOCK but instead sleeps (or adaptively spins)
>>>because CPU1 already has it.  CPU1 calls taskqueue_stop, which doesn't
>>>do anything because the task was alrady popped off.
>>>
>>>Now, at some point, CPU1 **MUST** drop the XL_LOCK and let CPU2 continue
>>>that task to completion.  There simply is no way around that, right?  So
>>>why muddle the API with a function that really doesn't solve any
>>>problems.  The argument that being able to call taskqueue_stop with
>>>locks held is a convenience doesn't hold water.  You'll still need to
>>>drop locks in order to be sure that a mess isn't created.
>>
>>Well, you need to think through this longer.  During a normal stop, we
>>don't care if the task runs later if it is just going to stop anyway, which
>>it will with the IFF_DRV_RUNNING checks I added (which it needs anyway even
>>if you don't add a taskqueue_stop()).  See, when CPU2 eventually runs the
>>xl_rxeof_task, it will drop the lock and return instantly because the
>>xl_stop() function clears IFF_DRV_RUNNING in addition to calling
>>taskqueue_stop().
> 
> 
> Actually, I thought through this longer some myself on the way home and 
> because of the RUNNING checks, the taskqueue_stop() in xl_stop() isn't 
> actually needed for correct operation.  It's merely an optimization for the 
> case that the task is sitting in the queue, but the code will work fine 
> without it with the taskqueue_drain() moved to xl_detach() and the 
> IFF_DRV_RUNNING checks added.  Thus, if you just really hate taskqueue_stop() 
> or don't think the optimization is worth it, I can leave it out.  I kind of 
> would like to call it in xl_detach() at least so that detaching xl0 won't be 
> dependent on another driver's task deciding to yield.  However, if I moved it 
> there I could add blocking to it.  I could also just use the existing 
> taskqueue_drain() and not worry about having to wait for other tasks to 
> finish.  I can't call taskqueue_drain() in xl_stop() though since xl_stop() 
> is called indirectly from xl_intr().
> 

You can't force detach while data is in flight, at least not right now.
That is a motivation behind the recent ifnet changes, but it still has a
long way to go.  And regardless of the rest of the stack being able to
handle forced detach, the driver needs to be written in a way so that
it's own resources aren't orphaned, which again means making sure that
blocked threads are allowed to run to completion, which again means not
having the luxury of being fast and loose with locks when trying to
drain those tasks.

I do think that taskqueue_stop() is a muddying of the API.  If anyone
else has a comment on this (BDE?) I would be very interested to hear it.

Scott