Re: [patch] zfs livelock and thread priorities

From: Ben Kelly <ben_at_wanderview.com>
Date: Tue, 28 Apr 2009 16:52:23 -0400
On Apr 28, 2009, at 2:11 PM, Artem Belevich wrote:
> My system had eventually deadlocked overnight, though it took much
> longer than before to reach that point.
>
> In the end I've got many many processes sleeping in zio_wait with no
> disk activity whatsoever.
> I'm not sure if that's the same issue or not.
>
> Here are stack traces for all processes -- http://pastebin.com/f364e1452
> I've got the core saved, so if you want me to dig out some more info,
> let me know if/how I could help.

It looks like there is a possible deadlock between zfs_zget() and
zfs_zinactive().  Both acquire the object lock via ZFS_OBJ_HOLD_ENTER().
The zfs_zinactive() path can be called indirectly from within
zio_done(), while zfs_zget() can in turn block waiting for zio_done()
to complete while it is still holding the object lock.

The following patch might help:

   http://www.wanderview.com/svn/public/misc/zfs/zfs_zinactive_deadlock.diff

This simply bails out of the inactive processing if the object lock is
already held.  I'm not sure whether this is 100% correct, as in that
case it cannot verify there are references to the vnode.  I also tried
executing the zfs_zinactive() logic in a taskqueue to avoid the
deadlock, but that caused other deadlocks to occur.
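
For anyone who wants the general shape of the bail-out without reading
the diff, here is a rough userland sketch using pthreads.  It is not the
patch itself and it does not use the real ZFS macros or types: obj_hold_t,
obj_hold_init() and try_inactive() are hypothetical names made up for
illustration, and a plain pthread mutex stands in for the object lock.
The only point is the pattern of trying the lock and skipping the work
when someone else already holds it, instead of blocking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-object hold lock; not a ZFS type. */
typedef struct {
    pthread_mutex_t lock;
    int inactive_runs;      /* number of times the cleanup actually ran */
} obj_hold_t;

static void obj_hold_init(obj_hold_t *h)
{
    assert(h != NULL);
    pthread_mutex_init(&h->lock, NULL);
    h->inactive_runs = 0;
}

/*
 * Try to do the "inactive" cleanup.  If the lock is already held, return
 * false right away rather than sleeping on it, so a caller running in a
 * completion path can never end up waiting on the lock holder.
 */
static bool try_inactive(obj_hold_t *h)
{
    assert(h != NULL);
    if (pthread_mutex_trylock(&h->lock) != 0)
        return false;       /* lock busy: bail out, cleanup is skipped */

    h->inactive_runs++;     /* placeholder for the real cleanup work */
    pthread_mutex_unlock(&h->lock);
    return true;
}
```

The cost of this approach is visible in the false return: when the lock
is busy the cleanup simply does not happen on that call, so something
else has to be relied on to do it later.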

Hope that helps.

- Ben
Received on Tue Apr 28 2009 - 18:52:26 UTC
