> I believe the problem is that handle_workitem_remove() is putting the
> dirrem on the inodep inowait list, but it is never getting moved to
> the inodep bufwait list because ffs_update() and
> softdep_update_inodeblock() are not getting called for the leaf
> directory after the dirrem is put on the inowait list if the link count
> is too large.

Correct.

Running the commands (on an idle system)

	levels=280
	dirchain=`jot $levels | tr '\n' '/'`
	mkdir -p $dirchain
	fsync $dirchain
	rm -rf 1

and monitoring the number of dirrem structures allocated in the kernel
(while sleep 1; do vmstat -m | grep dirrem; done) shows that the number
of dirrem structures slowly decreases. In this scenario, the rundown
still happens since the link counts on the inodes are normal.

When the rundown doesn't start due to an elevated link count on the leaf
inode, a situation might occur where there are no dirty blocks and no
softupdate dependencies for the file system on the global work list,
while some inodedep and dirrem dependencies for that file system are
still lingering. ffs_sync() doesn't detect these lingering dependencies,
and vfs_write_suspend() returns without any errors, indicating that the
file system has been suspended.

> In the normal case, it appears that the dirrem migration is triggered
> when the inode is zeroed in ufs_inactive(), which happens when the first
> call to handle_workitem_remove() calls vput().

Intermediate nodes end up waiting for the child inode to be zeroed and
then written to disk.

> Perhaps the dirrem should be put on the inowait list before the call to
> ffs_truncate().

If softdep_slowdown() returns a nonzero value then ffs_truncate() can
call ffs_syncvnode() before di_size has been set to 0. If the inodeblock
is written due to fsync() operations on other inodes in the same
inodeblock, then the dirrem dependency would be moved to the global work
list too early.

Enclosed is a patch that forces an ffs_update() call from ufs_inactive()
by setting the IN_CHANGE flag if i_effnlink is larger than 0 right
before the call to vput(). An alternative is checking i_nlink instead
of i_effnlink for faster rundown.

- Tor Egge

Index: sys/ufs/ffs/ffs_softdep.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_softdep.c,v
retrieving revision 1.184
diff -u -r1.184 ffs_softdep.c
--- sys/ufs/ffs/ffs_softdep.c	5 Sep 2005 22:14:33 -0000	1.184
+++ sys/ufs/ffs/ffs_softdep.c	24 Sep 2005 18:31:04 -0000
@@ -3477,6 +3477,8 @@
 	}
 	WORKLIST_INSERT(&inodedep->id_inowait, &dirrem->dm_list);
 	FREE_LOCK(&lk);
+	if (ip->i_effnlink > 0)
+		ip->i_flag |= IN_CHANGE;
 	vput(vp);
 }

Received on Sat Sep 24 2005 - 17:08:14 UTC