On Apr 28, 2009, at 2:11 PM, Artem Belevich wrote:

> My system had eventually deadlocked overnight, though it took much
> longer than before to reach that point.
>
> In the end I've got many many processes sleeping in zio_wait with no
> disk activity whatsoever.
> I'm not sure if that's the same issue or not.
>
> Here are stack traces for all processes -- http://pastebin.com/f364e1452
> I've got the core saved, so if you want me to dig out some more info,
> let me know if/how I could help.

It looks like there is a possible deadlock between zfs_zget() and
zfs_zinactive(). They both acquire a lock via ZFS_OBJ_HOLD_ENTER().
The zfs_zinactive() path can get called indirectly from within
zio_done(). The zfs_zget() can in turn block waiting for zio_done()'s
completion while holding the object lock.

The following patch might help:

  http://www.wanderview.com/svn/public/misc/zfs/zfs_zinactive_deadlock.diff

This simply bails out of the inactive processing if the object lock is
already held. I'm not sure if this is 100% correct or not as it cannot
verify there are references to the vnode. I also tried executing the
zfs_zinactive() logic in a taskqueue to avoid the deadlock, but that
caused other deadlocks to occur.

Hope that helps.

- Ben

Received on Tue Apr 28 2009 - 18:52:26 UTC
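
For illustration only, the "bail out if the object lock is already held"
idea from the message above can be sketched in user space with a POSIX
mutex. This is not the linked patch and not ZFS code: obj_lock,
getter_enter(), getter_exit() and inactive_try() are hypothetical names
made up for this sketch, and a plain pthread mutex is assumed as a
stand-in for the lock taken by ZFS_OBJ_HOLD_ENTER().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-object lock (not the real ZFS lock). */
static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for the zfs_zget() side: take the object lock and keep
 * holding it while (in the real code) waiting for I/O to complete.
 */
static void getter_enter(void)
{
    int rc = pthread_mutex_lock(&obj_lock);
    assert(rc == 0);
    (void)rc;
}

static void getter_exit(void)
{
    int rc = pthread_mutex_unlock(&obj_lock);
    assert(rc == 0);
    (void)rc;
}

/*
 * Stand-in for the inactive path. Instead of blocking on the object
 * lock, try to take it and give up immediately if it is already held.
 * Returns true if the inactive work ran, false if it was skipped.
 */
static bool inactive_try(int *reclaimed)
{
    if (pthread_mutex_trylock(&obj_lock) != 0)
        return false;           /* lock busy: bail out, do not block */

    (*reclaimed)++;             /* placeholder for the real cleanup */

    int rc = pthread_mutex_unlock(&obj_lock);
    assert(rc == 0);
    (void)rc;
    return true;
}
```

Calling inactive_try() while getter_enter() is in effect returns false
without blocking; calling it again after getter_exit() does the work.
The sketch shows only why a try-lock avoids the wait cycle. It says
nothing about whether skipping the inactive processing is safe, which
is the open question raised in the message.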