Re: [BUG] I think sleepqueue need to be protected in sleepq_broadcast

From: Attilio Rao <attilio_at_freebsd.org>
Date: Fri, 5 Sep 2008 01:38:37 +0200
2008/9/2, John Baldwin <jhb_at_freebsd.org>:
> On Sunday 31 August 2008 09:31:17 pm Tor Egge wrote:
>  >
>
> > sleepq_resume_thread() contains an ownership handover of sq if the resumed
>  > thread is the last one blocked on the wait channel.  After the handover, sq
>  is
>  > no longer protected by the sleep queue chain lock and should no longer be
>  > accessed by sleepq_broadcast().
>  >
>  > Normally, when sleepq_broadcast() incorrectly accesses sq after the
>  handover,
>  > it will find the sq->sq_blocked queue to be empty, and the code appears to
>  > work.
>  >
>  > If the last correctly woken thread manages to go to sleep again very quickly
>  on
>  > another wait channel, sleepq_broadcast() might incorrectly determine that
>  the
>  > sq->sq_blocked queue isn't empty, and start doing the wrong thing.
>
>
> So disregard my earlier e-mail.  Here is a simple fix for the sleepq case:
>
>  Index: subr_sleepqueue.c
>  ===================================================================
>  --- subr_sleepqueue.c   (revision 182679)
>  +++ subr_sleepqueue.c   (working copy)
>  _at__at_ -779,7 +779,7 _at__at_
>
>  sleepq_broadcast(void *wchan, int flags, int pri, int queue)
>   {
>         struct sleepqueue *sq;
>  -       struct thread *td;
>
> +       struct thread *td, *tdn;
>
>         int wakeup_swapper;
>
>         CTR2(KTR_PROC, "sleepq_broadcast(%p, %d)", wchan, flags);
>
> _at__at_ -793,8 +793,7 _at__at_
>
>
>         /* Resume all blocked threads on the sleep queue. */
>         wakeup_swapper = 0;
>  -       while (!TAILQ_EMPTY(&sq->sq_blocked[queue])) {
>  -               td = TAILQ_FIRST(&sq->sq_blocked[queue]);
>
> +       TAILQ_FOREACH_SAFE(td, &sq->sq_blocked[queue], td_slpq, tdn) {
>                 thread_lock(td);
>
>                 if (sleepq_resume_thread(sq, td, pri))
>                         wakeup_swapper = 1;
>
>
> This only uses 'sq' to fetch the head of the queue once up front.  It won't
>  use it again once it has started waking up threads.
>
>
>  > A similar (but probably much more difficult to trigger) issue is present
>  with
>  > regards to thread_lock() and turnstiles.
>  >
>  > The caller of thread_lock() might have performed sufficient locking to
>  ensure
>  > that the thread to be locked doesn't go away, but any turnstile spin lock
>  > pointed to by td->td_lock isn't protected.  Making turnstiles type stable
>  > (setting UMA_ZONE_NOFREE flag for turnstile_zone) should fix that issue.
>
>
> Note that unlike the sleepq case, turnstiles are not made runnable until all
>  of them are dequeued from the turnstile and assigned a new turnstile.  Only
>  after all that is settled are the threads made runnable in turnstile_unpend().
>
>  However, that doesn't fix this specific race (though it means the turnstile
>  code is not subject to the same exact race as the sleepq code above).  Making
>  turnstiles type-stable is indeed probably the only fix for this. :-/

What about setting a flag meaning "turnstile recycled but still used"
and allow turnstile_wait() to spin until it gets unset before to
access to td_turnstile?

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Thu Sep 04 2008 - 21:38:39 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:34 UTC