On Apr 29, 2009, at 7:47 PM, Lawrence Stewart wrote:
> Ben Kelly wrote:
>> On Apr 29, 2009, at 7:58 AM, Ben Kelly wrote:
>>> On Apr 29, 2009, at 2:43 AM, Jaakko Heinonen wrote:
>>>> On 2009-04-28, Ben Kelly wrote:
>>>>>> http://www.wanderview.com/svn/public/misc/zfs/zfs_zinactive_deadlock.diff
>>>>>
>>>>> The patch is updated in the same location above.
>>>>
>>>> There's a fatal typo in the patch:
>>>>
>>>> - ZFS_OBJ_HOLD_ENTER(zfsvfs, z_id);
>>>> + locked == ZFS_OBJ_HOLD_TRYENTER(zfsvfs, z_id);
>>>>         ^^^^
>>>
>>> Yikes! Thanks for catching this!
>>>
>>> The patch has been updated at the same URL. If anyone has patched
>>> their system please grab the new version. Sorry for the confusion.
>> Argh! The patch was still broken even after this fix.
>> Apparently when I tested my taskqueue solution I forgot to do a
>> make installkernel. For some reason the taskqueue approach
>> deadlocks my server at home under normal conditions. Therefore I
>> have reverted the patch to use the simple return. I still don't
>> think this is the right solution, but I don't have time to
>> completely figure out what is going on right now.
>> Again, sorry for the mess!
>
> As far as I can tell, one of the developers is working on a patch to
> address the same issue you're discussing in this thread. He ran into
> it on his SSD ZFS installation and the symptoms sound likely to be
> the same as what you're discussing. I believe he's testing a patch
> which is inspired by the one the opensolaris guys used to fix the
> bug, which you can look at here:
>
> http://people.freebsd.org/~pjd/patches/vn_rele_hang.patch
>
> The open solaris one has major incompatibilities with FreeBSD so
> can't be applied directly.
>
> As soon as it's ready I think he'll be making it available for wider
> testing so stay tuned.
>
> Cheers,
> Lawrence
>
> PS Apologies if the issue you're working on is not the same as the
> one addressed by the opensolaris patch above.

Thank you!
This does appear to be the same issue and I look forward to seeing the
final fix.

For now I've gone ahead and updated my patch with a naive adaptation of
the opensolaris diff. It seems more correct than what I had and I was
worried people would waste time testing my broken approach. I've only
been able to test it on my i386, non-SMP server, however.

Thanks again.

- Ben

Received on Wed Apr 29 2009 - 23:56:20 UTC