Re: rpc.lockd(8) seg faults on 5.2-RELEASE

From: Frode Nordahl <frode_at_nordahl.net>
Date: Thu, 5 Feb 2004 18:59:22 +0100

Hello,

Got an update on the rpc.lockd "hang" issue.

Whenever I see it hang, I kill it with kill -SEGV to get a core dump
before restarting it.

In one of the dumps I observed this:
(gdb) print *blockedlocklist_head->lh_first
$1 = {nfslocklist = {le_next = 0x8099000, le_prev = 0x8099000},
   filehandle = {
     fh_fsid = {val = {1074502253, -394432445}}, fh_fid = {fid_len = 12,
       fid_reserved = 0, fid_data = "?\\@\0r?\202[\0\0\0\0\0\0\0"}},
   addr = 0x80751e0, client = {exclusive = 1, svid = 19869, oh = {n_len = 24,
       n_bytes = 0x8056520 "19869@mail7.powertech.no", '?' <repeats 176 times>...},
     l_offset = 0, l_len = 0}, client_cookie = {n_len = 4,
     n_bytes = 0x8075290 "?\221K\001", '?' <repeats 28 times>, "udp6"},
   client_name = "mail7.powertech.no", '\0' <repeats 1005 times>,
   nsm_status = 0, status = 0, flags = 6, blocking = 0, locker = 0, fd = 0}
(gdb)

Looking at retry_blockingfilelocklist(), this kind of data in 
blockedlocklist_head would most likely make it loop forever.  I 
simulated this behaviour in my own program as well.
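
For anyone who wants to reproduce the effect, here is a minimal sketch of how a cyclic list makes a plain traversal spin. It is not the lockd code: struct node, struct list_head, build_list(), walk_bounded() and has_cycle() are hypothetical names, and the link fields only mirror the LIST_ENTRY() layout from <sys/queue.h>, where le_next points at the next element and le_prev holds the address of the previous element's le_next pointer. The corruption shown (the last element pointing back at the first) is just one assumed way to get a cycle; the dump above does not tell us which shape the real list had.

```c
#include <stddef.h>

/* Hypothetical stand-in for a lockd list element. The two link fields
 * mirror what LIST_ENTRY() from <sys/queue.h> expands to. */
struct node {
    struct node  *le_next;  /* next element, NULL at the end of the list */
    struct node **le_prev;  /* address of the previous le_next pointer */
    int           id;
};

/* Mirrors what LIST_HEAD() expands to. */
struct list_head {
    struct node *lh_first;
};

/* Build the list a -> b. If corrupt is non-zero, point b's le_next back
 * at a, which is one assumed way a cycle could appear. */
static void build_list(struct list_head *head, struct node *a,
                       struct node *b, int corrupt)
{
    head->lh_first = a;
    a->le_prev = &head->lh_first;
    a->le_next = b;
    b->le_prev = &a->le_next;
    b->le_next = corrupt ? a : NULL;
}

/* Walk the list the way LIST_FOREACH would, but give up after max_steps
 * elements. Returns the element count, or -1 if the cap was exceeded
 * (on a cyclic list an uncapped walk would never terminate). */
static int walk_bounded(const struct list_head *head, int max_steps)
{
    int steps = 0;
    const struct node *n;

    for (n = head->lh_first; n != NULL; n = n->le_next) {
        if (steps == max_steps)
            return -1;
        steps++;
    }
    return steps;
}

/* Floyd's two-pointer check: returns 1 if following le_next from the
 * head ever revisits an element, 0 if the list ends in NULL. */
static int has_cycle(const struct list_head *head)
{
    const struct node *slow = head->lh_first;
    const struct node *fast = head->lh_first;

    while (fast != NULL && fast->le_next != NULL) {
        slow = slow->le_next;
        fast = fast->le_next->le_next;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

Calling build_list() with corrupt set to 0 gives a list that walk_bounded() finishes normally; with corrupt set to 1 the walk only stops because of the step cap, which is the "loop forever" case. A check like has_cycle() could be dropped into a debug build to catch the corruption when it happens rather than at the next traversal.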

But how did le_next end up == le_prev?

I also found this in send_granted() (lockd_lock.c:2161):

         debuglog("About to send granted on blocked lock\n");
         sleep(1);
         debuglog("Blowing off return send\n");

Anyone know what sleep(1) is good for here?

Regards,
Frode