A bit more data and another question.

On Sun, 2006-01-15 at 12:40 -0800, Frank Mayhar wrote:
> In nfs_reclaim(), just before he calls vnode_destroy_vobject(), he
> zfrees and clears vp->v_data.  When, down in the guts of vm_object.c, he
> tries to flush the associated pages, v_data is already NULL so he goes
> boom.
>
> Now, why does he do the zfree/clear before vnode_destroy_vobject()?  Is
> he assuming that there are no pages associated with this vnode that need
> to be flushed?  Should there be?  I looked at some other file systems and
> they do the same thing.  The obvious fix is to move the zfree/clear to
> after the vnode_destroy_vobject() but if there should be no pages that
> need to be flushed on the vnode at this point, that would just hide the
> problem.

Looking further down, I see that the commentary for vlrureclaim()
specifically says that a flushed vnode may still have backing store, so
it appears that yes, there may be pages associated with the vnode when
he calls vgonel().  Between vgonel() and nfs_reclaim() there's just VOP
stuff, so the flushing has to be done lower down.

The nfs_reclaim() routine itself just does some bookkeeping and then
calls vnode_destroy_vobject().  That routine can push pages out, which
means that if the backing store is on NFS, nfs_putpages() can be called.
But nfs_putpages() will fault, because he'll try to use v_data as an
nfsnode and by then it is already NULL.

The reason for my confusion is that of the filesystems in the tree, the
only one that doesn't zfree and clear v_data before calling
vnode_destroy_vobject() is UFS.  The commentary in ufs_reclaim() is
clear, though:

	/*
	 * Destroy the vm object and flush associated pages.
	 */
	vnode_destroy_vobject(vp);

Then later he VI_LOCKS() and clears v_data.  (And [indirectly] does the
zfree only _after_ that, which is interesting but probably not
important.)
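To make the ordering issue concrete, here is a small user-space sketch of the two orderings. It is not kernel code: every mock_* name, the struct layouts, and the "faulted" flag are hypothetical stand-ins made up for illustration, and the real nfs_reclaim(), vnode_destroy_vobject() and nfs_putpages() do a good deal more than this. The flag is simply set at the point where the real kernel would dereference a NULL v_data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real FreeBSD structures. */
struct mock_node {
	int dirty_pages;	/* pages still waiting to be written */
};

struct mock_vnode {
	void *v_data;		/* filesystem-private data (the "nfsnode") */
	int has_object;		/* nonzero while a VM object is attached */
	int faulted;		/* set where the real kernel would panic */
};

/* Stands in for nfs_putpages(): it needs v_data to write pages back. */
static void
mock_putpages(struct mock_vnode *vp)
{
	struct mock_node *np = vp->v_data;

	if (np == NULL) {
		vp->faulted = 1;	/* real code would dereference NULL */
		return;
	}
	np->dirty_pages = 0;
}

/* Stands in for vnode_destroy_vobject(): flushes, then drops the object. */
static void
mock_destroy_vobject(struct mock_vnode *vp)
{
	if (vp->has_object) {
		mock_putpages(vp);
		vp->has_object = 0;
	}
}

/* Ordering described for nfs_reclaim(): free and clear v_data first. */
static void
reclaim_free_first(struct mock_vnode *vp)
{
	free(vp->v_data);
	vp->v_data = NULL;
	mock_destroy_vobject(vp);
}

/* Ordering described for ufs_reclaim(): destroy the object first. */
static void
reclaim_destroy_first(struct mock_vnode *vp)
{
	mock_destroy_vobject(vp);
	free(vp->v_data);
	vp->v_data = NULL;
}

/* Build a mock vnode that still has an object and three dirty pages. */
static struct mock_vnode
make_vnode(void)
{
	struct mock_vnode vp = { NULL, 0, 0 };
	struct mock_node *np = malloc(sizeof(*np));

	assert(np != NULL);
	np->dirty_pages = 3;
	vp.v_data = np;
	vp.has_object = 1;
	return (vp);
}
```

With a vnode from make_vnode(), reclaim_free_first() leaves the faulted flag set, while reclaim_destroy_first() flushes cleanly and only then releases v_data. That is the whole of the proposed reordering, assuming the flush really is expected to happen inside reclaim.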
I'm going to go slightly out on a limb here and guess that the "flush
associated pages" thing came in relatively recently and the other
filesystems haven't caught up with it.  This implies that the proper fix
is to go through those other xxx_reclaim() routines and reorder the
operations.  That's easy enough to do, but I would like to make sure
that my understanding of this (and my guess) is correct and that I'm not
wasting my time.

Thanks!
-- 
Frank Mayhar frank_at_exit.com     http://www.exit.com/
Exit Consulting                    http://www.gpsclock.com/
                                   http://www.exit.com/blog/frank/

Received on Sun Jan 15 2006 - 21:45:21 UTC