On Tue, Feb 2, 2010 at 8:05 PM, David Xu <davidxu_at_freebsd.org> wrote:
> Justin Teller wrote:
>>
>> I was working on a highly threaded app (125+ threads) that was using
>> the pthread rw locks, and we were stalling at strange times. After a
>> lot of debugging in our app, we found that a call to
>> pthread_rwlock_wrlock() would sometimes never return -- it seemed like
>> a wakeup was lost. After we convinced ourselves the bug wasn't in the
>> app's locking code, I started digging into the kernel. I found that
>> there is an issue where a wakeup can be "lost" when a thread goes to
>> sleep calling pthread_rwlock_wrlock. The issue is in the file
>> kern_umtx.c in the function do_rw_wrlock(): the code busies the lock
>> before sleeping, but when it tries to set the waiters bit, it's
>> looking at an old value (from the "try-lock" just before the busy).
>> This allows a race where a thread can go to sleep w/o setting the
>> waiters bit. Then the last thread to unlock won't wake up the sleeping
>> thread. The patch below (based off of 8.0 release) fixes my problem
>> for the write lock and should fix the complementary issue in
>> do_rw_rdlock.
>>
>> <snip>
>
> Committed, thanks!

This might be the reason why the pthreaded application I was working on
was crashing when I had it spawn more than 100 threads (I tried 2k and
20k simple, short-lived threads that used a basic mutex, and it got into
some deadlock state and bombed)... I'll see whether or not this fixes my
issue as well (but FWIW Linux sucked when I ran the pthreaded app too
and was busting up all over the place)...

Thanks!
-Garrett

Received on Wed Feb 03 2010 - 04:16:30 UTC
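
The stale-value race described in the quoted report can be illustrated with a small user-space model. This is only a sketch, not the code from kern_umtx.c and not the committed patch: the flag names and values, the helper names, and the specific interleaving (the waiters bit was set in the old snapshot but has since been cleared) are all assumptions made for illustration. It shows why deciding from a snapshot taken before the queue was busied can leave the waiters bit unset, while re-reading the lock word first does not.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define WRITE_OWNER   0x1u
#define WRITE_WAITERS 0x2u

/*
 * Try to set the waiters bit, starting from the caller's view of the
 * lock word. If that view already shows the bit set, nothing is done.
 */
static void set_waiters_from(_Atomic uint32_t *word, uint32_t state)
{
    assert(word != NULL);
    while ((state & WRITE_OWNER) && !(state & WRITE_WAITERS)) {
        /* On failure, the current lock word is copied into state. */
        if (atomic_compare_exchange_strong(word, &state,
                                           state | WRITE_WAITERS))
            break;
    }
}

/*
 * Model one thread about to sleep. `snapshot` is what it read during the
 * earlier try-lock; `current` is the lock word by the time it is ready to
 * sleep. With reread == false the stale snapshot is used; with
 * reread == true the word is read again first. Returns whether the
 * waiters bit is really set when the thread would go to sleep.
 */
static bool sleeps_with_waiters_bit(uint32_t snapshot, uint32_t current,
                                    bool reread)
{
    _Atomic uint32_t word;
    atomic_init(&word, current);

    uint32_t state = reread ? atomic_load(&word) : snapshot;
    set_waiters_from(&word, state);
    return (atomic_load(&word) & WRITE_WAITERS) != 0;
}
```

In this model, calling `sleeps_with_waiters_bit(WRITE_OWNER | WRITE_WAITERS, WRITE_OWNER, false)` represents a thread that trusts its old snapshot and sleeps without the bit set, so an unlocking thread would see no waiters to wake. Passing `true` for the last argument represents re-reading the lock word after the queue is busied, in which case the bit is set before sleeping.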