Re: fdrop_locked() and FILE_LOCK() vs. Giant

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Tue, 17 Jun 2003 13:14:31 -0700 (PDT)
On 17 Jun, Robert Watson wrote:
> 
> On Tue, 17 Jun 2003, Don Lewis wrote:
> 
>> The FILE_LOCK() implementation uses "pool mutex" under the hood, which
>> means it should only be used as a leaf level mutex.  The fdrop_locked() 
>> code wants to be called with FILE_LOCK() held, but the fdrop_locked() 
>> implementation calls mtx_lock(&Giant) before calling FILE_UNLOCK().  In
>> addition to violating the proper usage of the "pool mutex", there is
>> also the potential for a lock order violation.  The close() 
>> implementation grabs Giant and eventually calls fdrop(), which calls
>> FILE_LOCK() immediately before calling fdrop_locked().  If another
>> caller of fdrop_locked() calls FILE_LOCK() without grabbing Giant first,
>> then the lock order will be reversed when fdrop_locked() grabs Giant. 
>> 
>> It looks like fdrop_locked() should require that Giant be grabbed by the
>> caller before fdrop_locked() is called. 
> 
> I've also noticed that the file descriptor lock is held over per-object
> calls to fo_poll(), which currently isn't a big deal for most objects, but
> may be in the future if we have to grab other locks in order to test the
> poll status inside the object.  I run into this problem with the MAC work
> because the vnode label is protected by the vnode lock, which is a
> sleepable lock.  We may want to change label locking in the future to
> avoid this, but in the mean time I get a lot of witness warnings, and
> using a pool mutex for the fd lock may cause lock order problems if there
> is more complex locking.

Does witness even keep track of the pool mutex stuff?  It doesn't seem
to report any lock order problems in the fdrop_locked() case.  I'm
attempting to debug a deadlock problem for someone, and one process is
hung on FILE_LOCK() in fdrop(), but "show witness" in ddb doesn't
mention any "pool mutex" holders.  The process hung in fdrop() got there
by calling close(), which grabs Giant.  Once it wedge, then everything
else on the system stacked up waiting for Giant.

Alfred mentioned that fdrop_locked() can be easily fixed by dropping the
file lock a bit sooner.
Received on Tue Jun 17 2003 - 11:14:39 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:12 UTC