Re: 5.1 NFS locking problems

From: Robert Watson <rwatson_at_freebsd.org>
Date: Fri, 27 Jun 2003 09:27:15 -0400 (EDT)

On Fri, 27 Jun 2003, Mark Hannon wrote:

> I have two 5.1-RELEASE boxes with an NFS locking problem.  One box is
> the NFS server and the other the client.  When attempting to login via
> gdm I get: 
> 
> messages:Jun 27 18:09:07 tbird gconfd (mark-2316): Failed to get lock
> for daemon, exiting: Failed to lock '/home/mark/.gconfd/lock/ior':
> probably another process has the lock, or your operating system has NFS
> file locking misconfigured (Resource temporarily unavailable) 
> 
> rpc.statd and rpc.lockd are running on both the server and client. 
> 
> Any ideas how to trace? 

The first thing I'd probably do is ktrace gdm (make sure to use the
descend flag) and see what the system call and arguments are that generate
this error.  Generally, locks are asserted using one of the following
system calls: open() with a lock flag, flock(), or fcntl().  Grep the
ktrace output looking for the return of EAGAIN.

Off the top of my head, the most likely source of EAGAIN is open() with
the O_NONBLOCK flag set, which indicates that the caller doesn't want to
wait for the lock to become available if the lock is contended.  In that
case it sounds like an application "feature" (hence the message
reading "probably another process has the lock...", which sounds right).
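
To make the contended, non-blocking case concrete, here is a minimal
sketch in Python rather than C.  It assumes flock() semantics on a local
temporary file (not NFS), and the file name "ior" and the helper names
are made up for illustration.  One descriptor holds an exclusive lock,
and a second, independently opened descriptor asks for the same lock
without waiting, which should fail with EAGAIN (EWOULDBLOCK).

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(fd):
    """Attempt a non-blocking exclusive flock(); return 0 or the errno."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return 0
    except OSError as e:
        return e.errno


def demo():
    """Lock a file via one descriptor, then contend for it via another."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ior")  # hypothetical lock file name
        holder = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        contender = os.open(path, os.O_RDWR)
        try:
            first = try_exclusive_lock(holder)      # uncontended
            second = try_exclusive_lock(contender)  # contended, no waiting
        finally:
            os.close(contender)
            os.close(holder)
    return first, second


if __name__ == "__main__":
    first, second = demo()
    print("holder:", first)
    print("contender:", errno.errorcode.get(second, second))
```

If the second attempt fails this way on a local file but the first
attempt already fails on the NFS mount, that would point at lockd
rather than at real contention.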

You can use:

  http://www.watson.org/~robert/freebsd/locktest.c

to test lock contention and servicing; it's basically a wrapper around
open() and flock() so you can easily specify various cases, such as
O_NONBLOCK, etc., to try to reproduce the exact arguments and flags you see
in the ktrace.
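
If you can't fetch that file, the idea is easy to approximate.  The
following is a rough Python sketch of the same kind of wrapper, not the
contents of locktest.c: attempt_lock() and run_cases() are hypothetical
names, and it only covers flock() with shared/exclusive and
blocking/non-blocking options.  It runs against a local temporary file;
to exercise lockd, call attempt_lock() on a path on the NFS mount
instead.

```python
import errno
import fcntl
import os
import tempfile


def attempt_lock(path, exclusive=True, nonblocking=True):
    """Open path and flock() it; return (fd, None) or (None, errno)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    op = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    if nonblocking:
        op |= fcntl.LOCK_NB
    try:
        fcntl.flock(fd, op)
    except OSError as e:
        os.close(fd)
        return None, e.errno
    return fd, None


def run_cases():
    """Try a few lock combinations and record the errno of each attempt."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "lockfile")  # hypothetical file name
        a, _ = attempt_lock(path, exclusive=False)
        # A second shared lock is compatible with the first.
        b, results["shared_then_shared"] = attempt_lock(path, exclusive=False)
        # An exclusive lock conflicts with the shared locks already held.
        c, results["shared_then_exclusive"] = attempt_lock(path, exclusive=True)
        for fd in (a, b, c):
            if fd is not None:
                os.close(fd)
    return results


if __name__ == "__main__":
    for name, err in run_cases().items():
        print(name, "ok" if err is None else errno.errorcode.get(err, err))
```

Pick the combination that matches what the ktrace shows gconfd doing and
see whether it behaves differently on the NFS path than on a local one.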

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Network Associates Laboratories
Received on Fri Jun 27 2003 - 04:27:44 UTC
