Re: rpc.lockd spinning; much breakage

From: Andrew P. Lentvorski, Jr. <bsder_at_allcaps.org>
Date: Tue, 13 May 2003 01:28:58 -0700 (PDT)

On Mon, 12 May 2003, Robert Watson wrote:

> (3) Sometimes rpc.lockd on 5.x acting as a server gets really confused
>     when you mix local and remote locks.  I haven't quite figured out the
>     circumstances, but occasionally I run into a situation where a client
>     contends against an existing lock on the server, and the client never
>     receives a notification from the server that the lock has been
>     released.  It looks like the server stores state that the lock is
>     contended, but perhaps never properly re-polls the kernel to see if
>     the lock has been locally re-released:

I just looked at the code again.  rpc.lockd does not spawn off extra
processes to continuously poll the kernel.  It assumes that it has control
of the underlying file and only rechecks the blockedlocklist when it
receives and grants an NFS file unlock.
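
A toy model of that behaviour, as a sketch only.  None of the names below
exist in rpc.lockd; the struct and the toy_* functions are made up to show
the shape of the problem, assuming a single lock per file and a simple
FIFO of waiting clients.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from lockd_lock.c. */
#define MAX_BLOCKED 8

struct toy_file {
	bool local_held;		/* held by a process on the server */
	bool nfs_held;			/* granted to an NFS client */
	int blocked[MAX_BLOCKED];	/* client ids waiting for the lock */
	size_t nblocked;
};

/* Try to grant the oldest queued request; returns its client id or -1. */
int
toy_retry_blocked(struct toy_file *f)
{
	if (f->nblocked == 0 || f->local_held || f->nfs_held)
		return (-1);
	int client = f->blocked[0];
	for (size_t i = 1; i < f->nblocked; i++)
		f->blocked[i - 1] = f->blocked[i];
	f->nblocked--;
	f->nfs_held = true;
	return (client);
}

/* Blocking NFS lock request: grant at once, or queue the client. */
bool
toy_nfs_lock(struct toy_file *f, int client)
{
	if (!f->local_held && !f->nfs_held) {
		f->nfs_held = true;
		return (true);
	}
	assert(f->nblocked < MAX_BLOCKED);
	f->blocked[f->nblocked++] = client;
	return (false);
}

/* NFS unlock: the only event in this model that rescans the queue. */
int
toy_nfs_unlock(struct toy_file *f)
{
	f->nfs_held = false;
	return (toy_retry_blocked(f));
}

/* Local unlock: happens in the kernel; the daemon is never told. */
void
toy_local_unlock(struct toy_file *f)
{
	f->local_held = false;
	/* No rescan of the queue here, which is the gap described above. */
}
```

In this model a client that queues behind a local lock is still queued
after toy_local_unlock().  It only gets the lock if some later
toy_nfs_unlock() happens to trigger a rescan.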

Consequently, contention at the hardware level (a lock held locally on the
server) needs to cause an actual *fail* rather than queue up a lock for
later.  Currently, the hardware lock attempt fails, but the code still
executes add_blockingfilelock.  The offending code in lockd_lock.c is:

	if (retval == PFL_NFSDENIED || retval == PFL_HWDENIED) {
		/* One last chance to check the lock */
		if (fl->blocking == 1) {
			/* Queue the lock */
			debuglog("BLOCKING LOCK RECEIVED\n");
			retval = (retval == PFL_NFSDENIED ?
			    PFL_NFSBLOCKED : PFL_HWBLOCKED);
			add_blockingfilelock(fl);
			dump_filelock(fl);
		} else {

A possible fix should be:

		if (fl->blocking == 1) {
			if (retval == PFL_NFSDENIED) {
				/* Queue the lock */
				debuglog("BLOCKING LOCK RECEIVED\n");
				retval = PFL_NFSBLOCKED;
				add_blockingfilelock(fl);
				dump_filelock(fl);
			} else {
				/* retval is okay as PFL_HWDENIED */
				debuglog("BLOCKING LOCK DENIED IN HARDWARE\n");
				dump_filelock(fl);
			}
		} else {

This should cause the server to return nlm4_denied, so the client should
eventually retry the lock rather than wait on the server.
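
To make the difference between the two versions concrete, here is a
standalone sketch of just the decision.  The TOY_* values and both
functions are stand-ins I made up for illustration; they are not the real
PFL_* constants or any function in lockd_lock.c.  The non-blocking branch
is not shown in the snippets above, so this sketch simply passes the code
through unchanged in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values; the real PFL_* constants live in the lockd sources. */
enum toy_pfl {
	TOY_GRANTED,
	TOY_NFSDENIED,
	TOY_HWDENIED,
	TOY_NFSBLOCKED,
	TOY_HWBLOCKED
};

/* Current behaviour: any denied blocking request is queued. */
enum toy_pfl
toy_resolve_current(enum toy_pfl retval, bool blocking, bool *queued)
{
	assert(queued != NULL);
	*queued = false;
	if ((retval == TOY_NFSDENIED || retval == TOY_HWDENIED) && blocking) {
		retval = (retval == TOY_NFSDENIED ?
		    TOY_NFSBLOCKED : TOY_HWBLOCKED);
		*queued = true;		/* models add_blockingfilelock() */
	}
	/* Assumption: non-blocking requests keep their original code. */
	return (retval);
}

/* Proposed behaviour: only NFS-level contention is queued. */
enum toy_pfl
toy_resolve_proposed(enum toy_pfl retval, bool blocking, bool *queued)
{
	assert(queued != NULL);
	*queued = false;
	if (retval == TOY_NFSDENIED && blocking) {
		retval = TOY_NFSBLOCKED;
		*queued = true;		/* models add_blockingfilelock() */
	}
	/* TOY_HWDENIED falls through unchanged and is never queued. */
	return (retval);
}
```

For a blocking request denied in hardware, toy_resolve_current() yields
TOY_HWBLOCKED with the request queued, while toy_resolve_proposed() yields
TOY_HWDENIED with nothing queued.  Both agree on the TOY_NFSDENIED case.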

CAUTION!  I haven't checked or compiled this code.  If folks need me to, I 
can, but it will be a couple of days as I don't have two machines handy 
that I can install -CURRENT on and set up NFS.

-a
Received on Mon May 12 2003 - 23:25:59 UTC
