I was working on a highly threaded app (125+ threads) that was using the pthread rw locks, and we were stalling at strange times. After a lot of debugging in our app, we found that a call to pthread_rwlock_wrlock() would sometimes never return -- it seemed like a wakeup was lost. After we convinced ourselves the bug wasn't in the app's locking code, I started digging into the kernel. I found that there is an issue where a wakeup can be "lost" when a thread goes to sleep calling pthread_rwlock_wrlock. The issue is in the file kern_umtx.c in the function do_rw_wrlock(): the code busies the lock before sleeping, but when it tries to set the waiters bit, it's looking at at old value (from the "try-lock" just before the busy). This allows a race where a thread can go to sleep w/o setting the waiters bit. Then the last thread to unlock won't wakeup the sleeping thread. The patch below (based off of 8.0 release) fixes my problem for the write lock and should fix the complimentary issue in do_rw_rdlock. --- kern_umtx.c 2010-02-01 13:03:24.000000000 -0800 +++ /kernel_src_8.0/usr/src/sys/kern/kern_umtx.c 2010-02-01 09:52:18.000000000 -0800 _at__at_ -2461,10 +2461,15 _at__at_ /* grab monitor lock */ umtxq_lock(&uq->uq_key); umtxq_busy(&uq->uq_key); umtxq_unlock(&uq->uq_key); + /* re-read the state, in case it changed between the try-lock above + * and the check below + */ + state = fuword32(__DEVOLATILE(int32_t *, &rwlock->rw_state)); + /* set read contention bit */ while ((state & wrflags) && !(state & URWLOCK_READ_WAITERS)) { oldstate = casuword32(&rwlock->rw_state, state, state | URWLOCK_READ_WAITERS); if (oldstate == state) goto sleep; _at__at_ -2592,10 +2597,15 _at__at_ /* grab monitor lock */ umtxq_lock(&uq->uq_key); umtxq_busy(&uq->uq_key); umtxq_unlock(&uq->uq_key); + + /* re-read the state, in case it changed between the try-lock above + * and the check below + */ + state = fuword32(__DEVOLATILE(int32_t *, &rwlock->rw_state)); while (((state & URWLOCK_WRITE_OWNER) || URWLOCK_READER_COUNT(state) != 0) && (state & URWLOCK_WRITE_WAITERS) == 0) { oldstate = casuword32(&rwlock->rw_state, state, state | URWLOCK_WRITE_WAITERS); if (oldstate == state) Looking thru the open PRs, it looks like it might be the cause of #135673, but I'm not completely sure. Also, I'm pretty new to contributing to the FreeBSD community -- what's the common practice of responding to a PR? Thanks -JustinReceived on Tue Feb 02 2010 - 19:07:41 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:00 UTC