hang in rpccon from interrupting NFS operations (Re: pointyhat panic)

From: Kris Kennaway <kris_at_FreeBSD.org>
Date: Sun, 21 Jun 2009 13:10:55 +0100
John Baldwin wrote:
> On Tuesday 16 June 2009 3:33:55 am Erwin Lansing wrote:
>> On Mon, Jun 15, 2009 at 02:08:17PM -0700, Kip Macy wrote:
>>> This is from the RPC re-work. I had thought that this was fixed. You
>>> shouldn't see this on the latest -CURRENT, but Doug will have more
>>> details.
>> Any datepoint when these fixes went in?  I upgraded pointyhat last month
>> exactly to get the latest fixes in, but could be there were more since
>> then.
> 
> You want the socket upcall locking changes in 193272 (committed June 1).  You 
> will also want subsequent commits to the RPC and NFS code by Rick Macklem to 
> close a few more races.  I think Rick still has one other patch that pho_at_ is 
> stress testing as well.
> 

Got another deadlock after upgrading.  Again on a busy NFS volume:
^C'ing a recursive find left it hung in rpccon state:

db> bt 89596
Tracing pid 89596 tid 102493 td 0xffffff0089260000
sched_switch() at sched_switch+0x17c
mi_switch() at mi_switch+0x21d
sleepq_switch() at sleepq_switch+0x123
sleepq_timedwait() at sleepq_timedwait+0x4d
_sleep() at _sleep+0x301
clnt_reconnect_call() at clnt_reconnect_call+0x5d3
nfs_request() at nfs_request+0x225
nfs_statfs() at nfs_statfs+0x197
__vfs_statfs() at __vfs_statfs+0x28
kern_fstatfs() at kern_fstatfs+0x286
fstatfs() at fstatfs+0x34
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xd0
--- syscall (397, FreeBSD ELF64, fstatfs), rip = 0x800726dcc, rsp = 0x7fffffffe1a8, rbp = 0x1000 ---

These are mounted with intr; I'll try disabling that next.

Kris
Received on Sun Jun 21 2009 - 10:10:57 UTC
