Re: mutex sleepq chain not owned at /usr/src/sys/kern/subr_sleepqueue.c

From: Attilio Rao <attilio_at_freebsd.org>
Date: Tue, 10 Jun 2008 21:44:37 +0200

2008/6/9, kevinxlinuz <kevinxlinuz_at_163.com>:
> Recently I ran into a problem on FreeBSD 8.0/amd64.
>  See PR/124200
>  http://www.freebsd.org/cgi/query-pr.cgi?pr=124200&cat=
>
>  I tried to find the cause.
>
>  cv_broadcastpri(...) calls sleepq_lock(cvp), and then calls sleepq_broadcast(cvp, SLEEPQ_CONDVAR, pri, 0).
>  In sleepq_broadcast(void *wchan, int flags, int pri, int queue), sq = sleepq_lookup(wchan) is called; wchan is checked there, and sq->sq_wchan == wchan == cvp (passed from cv_broadcastpri()).
>  I added an mtx_assert to sleepq_broadcast() in /usr/src/sys/kern/subr_sleepqueue.c:
>  sleepq_broadcast(void *wchan, int flags, int pri, int queue)
>  {
>         struct sleepqueue *sq;
>         struct thread *td;
>
>         struct sleepqueue_chain *sc;
>
>         CTR2(KTR_PROC, "sleepq_broadcast(%p, %d)", wchan, flags);
>         KASSERT(wchan != NULL, ("%s: invalid NULL wait channel", __func__));
>         MPASS((queue >= 0) && (queue < NR_SLEEPQS));
>         sq = sleepq_lookup(wchan);   /* wchan == cvp, passed from cv_broadcastpri() after sleepq_lock(cvp) */
>         /* here sq->sq_wchan == wchan == cvp */
>         if (sq == NULL)
>                 return;
>         KASSERT(sq->sq_type == (flags & SLEEPQ_TYPE),
>             ("%s: mismatch between sleep/wakeup and cv_*", __func__));
>
>         /* Resume all blocked threads on the sleep queue. */
>         while (!TAILQ_EMPTY(&sq->sq_blocked[queue])) {
>                 td = TAILQ_FIRST(&sq->sq_blocked[queue]);
>                 thread_lock(td);
>                 /* ------ test start ------ */
>                 sc = SC_LOOKUP(sq->sq_wchan);   /* sq->sq_wchan should be wchan */
>                 mtx_assert(&sc->sc_lock, MA_OWNED);   /* panics here: is sq->sq_wchan != wchan, or was sleepq_unlock(wchan) called by someone else? */
>                 /* ------ test end ------ */
>                 sleepq_resume_thread(sq, td, pri);
>                 thread_unlock(td);
>         }
>  }
>
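
For readers following the thread, the invariant that the added mtx_assert checks can be modeled in userspace roughly as below. This is only a sketch under stated assumptions: every model_* name is hypothetical, the table size and shift are assumed values, and a plain flag plus an owner id stands in for the kernel mutex. It is not the code in subr_sleepqueue.c; it only illustrates that a wait channel hashes to one chain, and that the lock on that chain must still be held by the caller while the queue is walked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; the kernel's real table size and shift may differ. */
#define MODEL_SC_TABLESIZE 128
#define MODEL_SC_MASK      (MODEL_SC_TABLESIZE - 1)
#define MODEL_SC_SHIFT     8

/* Hypothetical stand-in for a sleepqueue chain and its sc_lock. */
struct model_chain {
	bool locked;   /* true while the chain lock is held */
	int  owner;    /* id of the model thread holding the lock */
};

struct model_chain model_chains[MODEL_SC_TABLESIZE];

/* Map a wait channel address to its chain, in the spirit of SC_LOOKUP(). */
struct model_chain *
model_sc_lookup(const void *wchan)
{
	uintptr_t h = ((uintptr_t)wchan >> MODEL_SC_SHIFT) & MODEL_SC_MASK;

	return (&model_chains[h]);
}

/* Take the chain lock for wchan on behalf of model thread tid. */
void
model_sleepq_lock(const void *wchan, int tid)
{
	struct model_chain *sc = model_sc_lookup(wchan);

	/* The model is single-threaded, so a held lock is a usage error. */
	assert(!sc->locked);
	sc->locked = true;
	sc->owner = tid;
}

/* Drop the chain lock for wchan. */
void
model_sleepq_release(const void *wchan)
{
	struct model_chain *sc = model_sc_lookup(wchan);

	sc->locked = false;
	sc->owner = -1;
}

/* Model of the MA_OWNED check: is wchan's chain locked by tid? */
bool
model_chain_owned(const void *wchan, int tid)
{
	struct model_chain *sc = model_sc_lookup(wchan);

	return (sc->locked && sc->owner == tid);
}
```

In this model the check fails in the same two situations the quoted comment raises: if the lock taken for wchan was released before the check, or if the check is made against a different channel whose address hashes to another chain.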

Hello,
We are trying to track this down, but progress is slow because I
can't reproduce the bug.
I would need you to try some diagnostic patches; do you think you can
work on that with me? Can you reproduce the bug easily?

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Tue Jun 10 2008 - 17:44:39 UTC
