Re: hang in rpccon from interrupting NFS operations (Re: pointyhat panic)

From: Adrenalin <adrenalinup_at_gmail.com>
Date: Wed, 10 Mar 2010 00:19:30 +0100
Hi, I would like to know whether this bug has been fixed in FreeBSD
8.0-RELEASE, since I have hit it three times already on a busy box that uses
NFS heavily (with lots of files).
Unfortunately my processes are not compiled with debug symbols (so I cannot
get a backtrace), but all of the php-cgi processes are stuck in the "rpccon"
state, just as described here. I cannot kill them and I cannot reboot cleanly;
a manual restart is required.

FreeBSD g4.torrentsmd.com 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov 21
15:02:08 UTC 2009 root_at_mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
amd64

4063 www          1  52    0 82576K 26320K rpccon  3   1:40  0.00% php-cgi
4078 www          1  48    0 83600K 26768K rpccon  1   1:37  0.00% php-cgi
4129 www          1  52    0 83600K 26740K rpccon  1   1:31  0.00% php-cgi
4159 www          1  55    0 82832K 26216K rpccon  0   1:24  0.00% php-cgi
4184 www          1  54    0 90768K 34104K rpccon  0   1:16  0.00% php-cgi
4174 www          1  50    0 82832K 23396K rpccon  0   1:15  0.00% php-cgi
4258 www          1  55    0 82064K 24224K rpccon  1   1:06  0.00% php-cgi

I believe the error was triggered when the NFS server stopped responding:
Mar  9 20:00:31 sv kernel: nfs server s:/path/pah/paf: not responding
Mar  9 20:00:36 sv last message repeated 23 times

My fstab looks like this; I use the -b flag:
sv:/path/pah/paf /path/fap/hap/afh nfs  rw,-b 0 0

Since it's a remote box, I'm afraid of botching a kernel rebuild from
"Stable", and I'm not even sure it would help. Do you have any
suggestions? Thank you.

Nicu.

On Mon, Jun 22, 2009 at 2:21 AM, Rick Macklem <rmacklem_at_uoguelph.ca> wrote:

>
>
> On Sun, 21 Jun 2009, Kris Kennaway wrote:
>
>
>> Got another deadlock after upgrading.  Again, busy NFS volume, and ^C'ing
>> a recursive find hung in rpccon state:
>>
>> db> bt 89596
>> Tracing pid 89596 tid 102493 td 0xffffff0089260000
>> sched_switch() at sched_switch+0x17c
>> mi_switch() at mi_switch+0x21d
>> sleepq_switch() at sleepq_switch+0x123
>> sleepq_timedwait() at sleepq_timedwait+0x4d
>> _sleep() at _sleep+0x301
>> clnt_reconnect_call() at clnt_reconnect_call+0x5d3
>> nfs_request() at nfs_request+0x225
>> nfs_statfs() at nfs_statfs+0x197
>> __vfs_statfs() at __vfs_statfs+0x28
>> kern_fstatfs() at kern_fstatfs+0x286
>> fstatfs() at fstatfs+0x34
>> syscall() at syscall+0x1af
>> Xfast_syscall() at Xfast_syscall+0xd0
>> --- syscall (397, FreeBSD ELF64, fstatfs), rip = 0x800726dcc, rsp =
>> 0x7fffffffe1a8, rbp = 0x1000 ---
>>
>> These are mounted with intr, I'll try disabling that next.
>>
> There are two sleeps in clnt_rc.c. One of them optionally does a PCATCH
> and returns when interrupted via ^C, but the other one (which it is
> sleeping on above), doesn't. I've emailed Kris a small patch that
> changes that for him to test.
>
> If anyone else wants to test the patch, just email me for a copy, rick
>
>
Received on Tue Mar 09 2010 - 22:44:38 UTC
