Re: post ino64: lockd no runs?

From: David Wolfskill <david_at_catwhisker.org>
Date: Sun, 11 Jun 2017 11:12:25 -0700

On Sun, Jun 04, 2017 at 08:57:44AM -0400, Michael Butler wrote:
> It seems that {rpc.}lockd no longer runs after the ino64 changes on any
> of my systems after a full rebuild of src and ports. No log entries
> offer any insight as to why :-(
> 
> 	imb

I don't tend to use NFS on my systems that are running head, so I
haven't had occasion to test this as stated.

However, I just completed my weekly update of the "production" systems
here at home, running stable/11.  And I find that lockd seems to be ...
claiming that all is well, but declining to run (for long).

To the best of my knowledge, that was not the case until this last
update, which was from:

FreeBSD albert.catwhisker.org 11.1-PRERELEASE FreeBSD 11.1-PRERELEASE #316  r319566M/319569:1100514: Sun Jun  4 03:54:41 PDT 2017     root_at_freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

to

FreeBSD albert.catwhisker.org 11.1-BETA1 FreeBSD 11.1-BETA1 #322  r319823M/319823:1100514: Sun Jun 11 03:56:10 PDT 2017     root_at_freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

The "glaringly obvious" symptom in my case is that I am now unable
to (directly) save an email message from within mutt(1) by appending
it to an NFS-resident file.  (Saving it to a local file, then using
cat(1) to append that to the NFS-resident file and removing the local
copy works....)

After a few variations on a theme of:

albert(11.1)[5] sudo service lockd restart
lockd not running?
Starting lockd.
albert(11.1)[6] echo $?
0
albert(11.1)[7] service lockd status
lockd is not running.
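
Since the restart itself exits 0 even though the daemon is gone moments
later, the exit status of "restart" cannot be trusted on its own. A minimal
sketch of a check that asks for the status afterwards follows. The function
name restart_and_verify and the settle delay are hypothetical, and it
assumes that "service <name> status" exits non-zero when the daemon is not
running, which is how rc.subr(8)-based scripts generally behave.

```python
import subprocess
import time


def restart_and_verify(name, run=subprocess.run, settle=2.0):
    """Restart a service, wait briefly, then report whether it is still up.

    `run` is injectable so the logic can be exercised without a real
    service(8) command; by default it is subprocess.run.
    """
    # The exit status of "restart" is deliberately ignored: in the
    # transcript above it was 0 even though lockd exited right away.
    run(["service", name, "restart"], check=False)

    # Give the daemon a moment to fail, if it is going to (assumed delay).
    time.sleep(settle)

    # Assumption: "status" exits 0 only when the daemon is running.
    return run(["service", name, "status"], check=False).returncode == 0
```

Calling restart_and_verify("lockd") as root on the affected system would be
expected to return False, matching the "lockd is not running." output above.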

I finally(!) thought to ask ktrace what's going on (as tailing
/var/log/messages was completely unproductive, even after enabling
rc_debug).

So I tried: "sudo ktrace -di service lockd restart"; upon examination of
the output of kdump(1), I see that the trace ends with:

  ...
  2811 rpc.lockd NAMI  "/var/run/logpriv"
  2786 sh       CALL  read(0xa,0x627fc0,0x400)
  2786 sh       GIO   fd 10 read 0 bytes
       ""
  2811 rpc.lockd RET   connect 0
  2786 sh       RET   read 0
  2811 rpc.lockd CALL  sendto(0x3,0x7fffffffe2c0,0x27,0,0,0)
  2786 sh       CALL  exit(0)
  2811 rpc.lockd GIO   fd 3 wrote 39 bytes
       "<30>Jun 11 15:43:10 rpc.lockd: Starting"
  2811 rpc.lockd RET   sendto 39/0x27
  2811 rpc.lockd CALL  sigaction(SIGALRM,0x7fffffffec20,0)
  2811 rpc.lockd RET   sigaction 0
  2811 rpc.lockd CALL  nlm_syscall(0,0x1e,0x4,0x801015040)
  2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffea40)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffe5b0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffe5b0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffe5b0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  exit(0x1)
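
The interesting record in a trace like this is the one return that failed
(nlm_syscall, errno 14), which is easy to miss among the sigprocmask noise.
As a rough sketch, the failed returns can be pulled out of saved kdump(1)
output as below. The function name failed_returns is hypothetical, and the
pattern assumes the default kdump line layout shown above, with a command
name that contains no spaces.

```python
import re

# Matches kdump(1) return records of the form seen in the trace above:
#   2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
FAILED_RET = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+RET\s+(\S+)\s+-1\s+errno\s+(\d+)\s+(.*)$"
)


def failed_returns(kdump_text):
    """Return (pid, command, syscall, errno, message) for each failed return."""
    failures = []
    for line in kdump_text.splitlines():
        match = FAILED_RET.match(line)
        if match:
            pid, comm, syscall, errno_value, message = match.groups()
            failures.append(
                (int(pid), comm, syscall, int(errno_value), message.strip())
            )
    return failures


# A few records copied from the trace above, used as sample input.
SAMPLE = """\
  2811 rpc.lockd RET   sendto 39/0x27
  2811 rpc.lockd CALL  sigaction(SIGALRM,0x7fffffffec20,0)
  2811 rpc.lockd RET   sigaction 0
  2811 rpc.lockd CALL  nlm_syscall(0,0x1e,0x4,0x801015040)
  2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
  2811 rpc.lockd CALL  exit(0x1)
"""

for failure in failed_returns(SAMPLE):
    print(failure)
```

On the sample records this reports only the nlm_syscall failure, which is
the last thing rpc.lockd does before it blocks signals and exits with
status 1.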

Then, when I tried to send this message, I started getting more whines
from mutt(1).  I finally gave up and rebooted from the previous
environment:

FreeBSD albert.catwhisker.org 11.1-PRERELEASE FreeBSD 11.1-PRERELEASE #316  r319566M/319569:1100514: Sun Jun  4 03:54:41 PDT 2017     root_at_freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

and lockd is running:

albert(11.1-P)[2] service lockd status
lockd is running as pid 629.
albert(11.1-P)[3] 

so mutt(1) is not pitching a hissy-fit every time I try to save or
send a message.


In light of the above, I have Bcc'd this message to current_at_ (where
the thread originated) and sent it (and set replies) to stable_at_.


I have a test system, last updated to stable/11 as of mid-October
last year; lockd was running on it, as well (which is why I tried
going back to last week's image).  I'm happy to update it to points
where lockd may be broken, if it might help figure out what's broken
and how to fix it.

Peace,
david
-- 
David H. Wolfskill				david_at_catwhisker.org
Looking forward to telling Mr. Trump: "You're fired!"

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

Received on Sun Jun 11 2017 - 16:12:27 UTC
