On Tuesday 09 June 2009 1:00:25 pm Kostik Belousov wrote:
> On Tue, Jun 09, 2009 at 09:45:49AM -0700, Matthew Fleming wrote:
> > > This appears to be an interaction with the recent changes to use
> > > shared vnode locks for writes on ZFS.  Hmm, I think it may be ok to
> > > use a shared vnode lock for kevents on vnodes though.  The vnode
> > > interlock should be sufficient locking for what little work the
> > > kevent filters do.  As a quick hack for now the MNT_SHARED_WRITES()
> > > stuff could avoid using shared locks 'if (!VN_KNLIST_EMPTY(vp))',
> > > but I think the longer term fix is to not use the vnode locks for
> > > vnode kevents, but use the interlock instead.
> >
> > I tried (briefly) using the interlock since Isilon's vnode lock is
> > cluster wide (in our 6.1 based code we got away with using Giant).
> > This got me a LOR report on the interlock:
> >
> > 	/*
> > 	 * kqueue/VFS interaction
> > 	 */
> > 	{ "kqueue", &lock_class_mtx_sleep },
> > 	{ "struct mount mtx", &lock_class_mtx_sleep },
> > 	{ "vnode interlock", &lock_class_mtx_sleep },
> > 	{ NULL, NULL },
> >
> > since knote() will take first the list->kl_lock and then the kqueue
> > lock.  I didn't spend any time on it, and switched to using the vnode
> > v_lock for my purposes.  But someone added that lock ordering
> > (r166421) for a reason.
>
> That was me; I actually looked for the reversed order that was
> reported several times on the list in the 6.1-6.2 timeframe.
> Unfortunately, nothing was found.
>
> I noted in a separate letter that the read filter for vnodes needs a
> shared vnode lock anyway.

So my current idea is to allow shared vnode locks and use the vnode
interlock in the filt_vfs_* routines to protect access to the kn_*
member variables.
--- //depot/projects/smpng/sys/kern/vfs_subr.c	2009/06/09 15:15:22
+++ //depot/user/jhb/lock/kern/vfs_subr.c	2009/06/09 17:20:33
@@ -4103,8 +4103,10 @@
 vfs_knllocked(void *arg)
 {
 	struct vnode *vp = arg;
+	int islocked;
 
-	return (VOP_ISLOCKED(vp) == LK_EXCLUSIVE);
+	islocked = VOP_ISLOCKED(vp);
+	return (islocked == LK_SHARED || islocked == LK_EXCLUSIVE);
 }
 
 int
@@ -4114,6 +4116,7 @@
 	struct knote *kn = ap->a_kn;
 	struct knlist *knl;
 
+	ASSERT_VOP_ELOCKED(vp, "vfs_kqfilter");
 	switch (kn->kn_filter) {
 	case EVFILT_READ:
 		kn->kn_fop = &vfsread_filtops;
@@ -4147,6 +4150,7 @@
 {
 	struct vnode *vp = (struct vnode *)kn->kn_hook;
 
+	ASSERT_VOP_ELOCKED(vp, "filt_vfsdetach");
 	KASSERT(vp->v_pollinfo != NULL, ("Missing v_pollinfo"));
 	knlist_remove(&vp->v_pollinfo->vpi_selinfo.si_note, kn, 0);
 }
@@ -4157,48 +4161,65 @@
 {
 	struct vnode *vp = (struct vnode *)kn->kn_hook;
 	struct vattr va;
+	int retval;
 
 	/*
 	 * filesystem is gone, so set the EOF flag and schedule
 	 * the knote for deletion.
 	 */
 	if (hint == NOTE_REVOKE) {
+		VI_LOCK(vp);
 		kn->kn_flags |= (EV_EOF | EV_ONESHOT);
+		VI_UNLOCK(vp);
 		return (1);
 	}
 
 	if (VOP_GETATTR(vp, &va, curthread->td_ucred))
 		return (0);
 
+	VI_LOCK(vp);
 	kn->kn_data = va.va_size - kn->kn_fp->f_offset;
-	return (kn->kn_data != 0);
+	retval = (kn->kn_data != 0);
+	VI_UNLOCK(vp);
+	return (retval);
 }
 
 /*ARGSUSED*/
 static int
 filt_vfswrite(struct knote *kn, long hint)
 {
+	struct vnode *vp = (struct vnode *)kn->kn_hook;
+
 	/*
 	 * filesystem is gone, so set the EOF flag and schedule
 	 * the knote for deletion.
 	 */
+	VI_LOCK(vp);
 	if (hint == NOTE_REVOKE)
 		kn->kn_flags |= (EV_EOF | EV_ONESHOT);
 
 	kn->kn_data = 0;
+	VI_UNLOCK(vp);
 	return (1);
 }
 
 static int
 filt_vfsvnode(struct knote *kn, long hint)
 {
+	struct vnode *vp = (struct vnode *)kn->kn_hook;
+	int retval;
+
+	VI_LOCK(vp);
 	if (kn->kn_sfflags & hint)
 		kn->kn_fflags |= hint;
 	if (hint == NOTE_REVOKE) {
 		kn->kn_flags |= EV_EOF;
+		VI_UNLOCK(vp);
 		return (1);
 	}
-	return (kn->kn_fflags != 0);
+	retval = (kn->kn_fflags != 0);
+	VI_UNLOCK(vp);
+	return (retval);
 }
 
 int

-- 
John Baldwin

Received on Tue Jun 09 2009 - 15:27:01 UTC