Re: Bug in kern_umtx.c -- read-write locks

From: Garrett Cooper <yanefbsd_at_gmail.com>
Date: Tue, 2 Feb 2010 21:16:29 -0800
On Tue, Feb 2, 2010 at 8:05 PM, David Xu <davidxu_at_freebsd.org> wrote:
> Justin Teller wrote:
>>
>> I was working on a highly threaded app (125+ threads) that was using
>> the pthread rw locks, and we were stalling at strange times.  After a
>> lot of debugging in our app, we found that a call to
>> pthread_rwlock_wrlock() would sometimes never return -- it seemed like
>> a wakeup was lost.  After we convinced ourselves the bug wasn't in the
>> app's locking code, I started digging into the kernel.  I found that
>> there is an issue where a wakeup can be "lost" when a thread goes to
>> sleep calling pthread_rwlock_wrlock.  The issue is in the file
>> kern_umtx.c in the function do_rw_wrlock(): the code busies the lock
>> before sleeping, but when it tries to set the waiters bit, it's
>> looking at an old value (from the "try-lock" just before the busy).
>> This allows a race where a thread can go to sleep w/o setting the
>> waiters bit.  Then the last thread to unlock won't wake up the sleeping
>> thread.  The patch below (based off of 8.0 release) fixes my problem
>> for the write lock and should fix the complementary issue in
>> do_rw_rdlock.
>>
>>  <snip>
>
> Committed, thanks!
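
For anyone following along, the lost wakeup Justin describes can be replayed
single-threaded. This is only a minimal sketch, not the kern_umtx.c code or
the committed patch: the flag values, cas32() and both function names are
made up for illustration, and the interleaving shown (a stale snapshot that
already has the waiters bit set) is one assumed way the old value could let
a thread sleep without the bit being set in the real lock word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define WRITE_OWNER   0x80000000u
#define WRITE_WAITERS 0x40000000u

/*
 * Single-threaded stand-in for a compare-and-swap on the lock word:
 * stores newval only if *word still equals expected, and returns the
 * value that was in *word before the call.
 */
static uint32_t cas32(uint32_t *word, uint32_t expected, uint32_t newval)
{
    uint32_t old = *word;

    if (old == expected)
        *word = newval;
    return old;
}

/*
 * Set the waiters bit before sleeping, starting from `state`, the
 * caller's view of the lock word. The CAS is attempted only if that
 * view says the lock is held and the waiters bit is not yet set.
 */
static void set_waiters_before_sleep(uint32_t *word, uint32_t state)
{
    while ((state & WRITE_OWNER) != 0 && (state & WRITE_WAITERS) == 0) {
        uint32_t old = cas32(word, state, state | WRITE_WAITERS);

        if (old == state)
            break;          /* bit is now set in the real word */
        state = old;        /* lost the race; retry with the fresh value */
    }
}

/*
 * Replays one interleaving and returns true if the waiters bit is set
 * in the lock word at the moment our thread would go to sleep.
 */
static bool waiters_bit_set_at_sleep(bool reread_after_busy)
{
    /* Lock is write-held and another writer is already waiting. */
    uint32_t word = WRITE_OWNER | WRITE_WAITERS;
    /* Our failed try-lock observed this value. */
    uint32_t snapshot = word;

    /*
     * Meanwhile the owner unlocks, clears the waiters bit and wakes the
     * other writer, which then takes the lock.
     */
    word = WRITE_OWNER;
    assert((word & WRITE_WAITERS) == 0);

    /* Either trust the stale snapshot or re-read the word after the busy. */
    uint32_t state = reread_after_busy ? word : snapshot;

    set_waiters_before_sleep(&word, state);
    return (word & WRITE_WAITERS) != 0;
}
```

With the stale snapshot, waiters_bit_set_at_sleep(false) returns false: the
thread would sleep with no waiters bit in the lock word, so the next unlock
has no reason to wake it. Re-reading the word first,
waiters_bit_set_at_sleep(true), returns true.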

    This might be the reason why the pthreaded application I was
working on was crashing when I had it spawn more than 100 threads (I
tried 2k and 20k simple, short-lived threads that used a basic mutex,
and it got into some deadlock state and bombed). I'll see whether or
not this fixes my issue as well (but FWIW Linux sucked when I ran the
pthreaded app too and was busting up all over the place).
Thanks!
-Garrett
Received on Wed Feb 03 2010 - 04:16:30 UTC
