Re: LOR on nfs: vfs_vnops.c:301 kern_descrip.c:1580

From: pluknet <pluknet_at_gmail.com>
Date: Tue, 17 Aug 2010 19:42:41 +0400
2010/8/16 Kostik Belousov <kostikbel_at_gmail.com>:
> On Mon, Aug 16, 2010 at 09:07:24PM +0400, pluknet wrote:
>> On 16 August 2010 21:05, pluknet <pluknet_at_gmail.com> wrote:
>> > Hi.
>> >
>> > Seeing on mostly idle, recently updated current, while closing a file.
>> > Presumably never reported on ML.
[...]
>>
> Both LORs are valid. The fork performed deep inside the VFS call stack
> is obviously problematic. As a workaround, you may fix the number of
> nfsiods.
>
> A proper fix might consist of creating a shepherd thread whose only task
> is to act on requests to create new nfsiods.
>
> Would you try to implement this ? I will provide the assistance, if needed.

Hmm.. I tried to move kproc_create() into a shepherd thread and am now stuck
with a cp process lockup in [bo_wwait] when cp'ing something on nfs: cp a b.
Did I screw something up?
See the weird draft patch attached (weird, as I have no idea how to nicely
exchange data between nfs_nfsiodnew() and the shep_thread() thread).
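
For reference, one common way to hand requests from a caller to a shepherd
is a counter protected by a mutex plus a condition variable. Below is a
minimal userland sketch of that pattern using pthreads. It is not the
attached patch and not kernel code: struct shep_state, request_worker()
and shep_demo() are hypothetical names, and the "creation" step is just a
counter standing in for kproc_create(). A kernel version would presumably
use mtx(9) with msleep()/wakeup() instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state between requesters and the shepherd. */
struct shep_state {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;		/* signalled on new request or shutdown */
	int		requested;	/* workers asked for, not yet created */
	int		created;	/* workers created so far */
	bool		shutdown;
};

static struct shep_state shep = {
	.mtx = PTHREAD_MUTEX_INITIALIZER,
	.cv = PTHREAD_COND_INITIALIZER,
};

/* Shepherd: the only context in which workers are ever created. */
static void *
shep_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&shep.mtx);
	for (;;) {
		while (shep.requested == 0 && !shep.shutdown)
			pthread_cond_wait(&shep.cv, &shep.mtx);
		if (shep.requested == 0)
			break;		/* shutdown, and nothing left to do */
		shep.requested--;
		/* Drop the lock around the (possibly sleeping) creation. */
		pthread_mutex_unlock(&shep.mtx);
		/* Stand-in for kproc_create(); nothing is really created. */
		pthread_mutex_lock(&shep.mtx);
		shep.created++;
	}
	pthread_mutex_unlock(&shep.mtx);
	return (NULL);
}

/* Requester: only records the wish and wakes the shepherd, never forks. */
static void
request_worker(void)
{
	pthread_mutex_lock(&shep.mtx);
	shep.requested++;
	pthread_cond_signal(&shep.cv);
	pthread_mutex_unlock(&shep.mtx);
}

/* Demo driver: ask for n workers, shut down, report how many were made. */
static int
shep_demo(int n)
{
	pthread_t tid;
	int rc;

	shep.requested = shep.created = 0;
	shep.shutdown = false;
	rc = pthread_create(&tid, NULL, shep_thread, NULL);
	assert(rc == 0);
	for (int i = 0; i < n; i++)
		request_worker();
	pthread_mutex_lock(&shep.mtx);
	shep.shutdown = true;
	pthread_cond_signal(&shep.cv);
	pthread_mutex_unlock(&shep.mtx);
	pthread_join(tid, NULL);
	return (shep.created);
}
```

The point of the split is that request_worker() never blocks on creation
and holds nothing but the small mutex, so it should be safe to call with
other locks held. The shepherd drains all pending requests before it
honours shutdown.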

load: 1.34  cmd: cp 1348 [bo_wwait] 4.74r 0.00u 0.00s 0% 1204k

tst-web# procstat -k 1348
  PID    TID COMM             TDNAME           KSTACK
 1348 100095 cp               -                mi_switch sleepq_switch sleepq_wait _sleep bufobj_wwait nfs_flush nfs_close vn_close vn_closefile _fdrop closef kern_close syscallenter syscall Xfast_syscall

Process 1347 (cp) thread 0xffffff0002ed7000 (100094)
exclusive lockmgr nfs (nfs) r = 0 (0xffffff006a05a638) locked @ /usr/src/sys/kern/vfs_vnops.c:301

(kgdb) bt
#0  sched_switch (td=0xffffff0002ed7000, newtd=0xffffffff80ca17e0,
flags=Variable "flags" is not available.
)
    at /usr/src/sys/kern/sched_ule.c:1848
#1  0xffffffff805bf49b in mi_switch (flags=260, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:449
#2  0xffffffff805f50e3 in sleepq_switch (wchan=0xffffff006a05a720, pri=77)
    at /usr/src/sys/kern/subr_sleepqueue.c:530
#3  0xffffffff805f5ccd in sleepq_wait (wchan=0xffffff006a05a720, pri=77)
    at /usr/src/sys/kern/subr_sleepqueue.c:609
#4  0xffffffff805bfac9 in _sleep (ident=0xffffff006a05a720,
    lock=0xffffff006a05a6c0, priority=Variable "priority" is not available.
) at /usr/src/sys/kern/kern_synch.c:234
#5  0xffffffff80633083 in bufobj_wwait (bo=0xffffff006a05a6c0,
slpflag=Variable "slpflag" is not available.
)
    at /usr/src/sys/kern/vfs_bio.c:4016
#6  0xffffffff8077f5af in nfs_flush (vp=0xffffff006a05a5a0, waitfor=1,
commit=Variable "commit" is not available.
)
    at /usr/src/sys/nfsclient/nfs_vnops.c:3216
#7  0xffffffff807802e3 in nfs_close (ap=0xffffff8029bd8900)
    at /usr/src/sys/nfsclient/nfs_vnops.c:644
#8  0xffffffff8065b3fe in vn_close (vp=0xffffff006a05a5a0, flags=2,
    file_cred=0xffffff006a01b600, td=0xffffff0002ed7000) at vnode_if.h:225
#9  0xffffffff8065b4fa in vn_closefile (fp=0xffffff00027c7500,
    td=0xffffff0002ed7000) at /usr/src/sys/kern/vfs_vnops.c:942
#10 0xffffffff8057e3e3 in _fdrop (fp=0xffffff00027c7500, td=Variable "td" is not available.
) at file.h:277
#11 0xffffffff8057ff4b in closef (fp=0xffffff00027c7500, td=0xffffff0002ed7000)
    at /usr/src/sys/kern/kern_descrip.c:2117
#12 0xffffffff80580530 in kern_close (td=0xffffff0002ed7000, fd=4)
    at /usr/src/sys/kern/kern_descrip.c:1162

-- 
wbr,
pluknet

Received on Tue Aug 17 2010 - 13:42:43 UTC
