dev_lock() contention for fdesc syscalls -- possible fix

From: Luigi Rizzo <rizzo_at_iet.unipi.it>
Date: Mon, 10 Nov 2014 02:49:39 +0100
It was noticed that there is huge dev_lock() contention when multiple
processes do a poll() even on independent file descriptors.

Turns out that not just poll but most syscalls on file descriptors
(as opposed to sockets) in sys/fs/devfs/devfs_vnops.c including
devfs_poll_f(), devfs_ioctl_f() and read/write share the problem
as they use the following pattern

    devfs_poll_f() {
        ...
        devfs_fp_check(fp, ...) -->
            kern/kern_conf.c :: devvn_refthread(fp->f_vnode, ...) -->
                dev_lock();
                dev = vp->v_rdev; // lock on vp ?
                ... check that dev != NULL
                atomic_add_long(&dev->si_threadcount, 1);
                dev_unlock();
        dsw->d_poll()
        dev_relthread() -->
            atomic_subtract_rel_long(&dev->si_threadcount, 1);
   }


I believe that dev_lock() in devvn_refthread() is protecting
        dev = vp->v_rdev
(the cdev entry 'dev' cannot go away for the reasons stated below).

However looking at places where vp->v_rdev is modified, turns out
that it only happens when holding VI_LOCK(vp) -- the places are
devfs_allocv() and devfs_reclaim().
There is one place in zfs where the vnode is created and v_rdev
is set without holding a lock, so nobody can dereference it.

As a consequence i believe that if in devfs_fp_check() we replace
dev_lock() / dev_unlock() with VI_LOCK(vp) / VI_UNLOCK(vp),
we make sure that we can safely dereference vp->v_rdev, and the
cdev cannot go away because the vnode has a reference to it.
The counter uses an atomic instruction (so the release is lockless)

This should be enough to remove the contention.

Anyone care to review/test the patch below ?
(dev_refthread() still has the single dev_lock() contention,
i don't know how to address that yet)

	cheers
	luigi

Index: /home/luigi/FreeBSD/head/sys/kern/kern_conf.c
===================================================================
--- /home/luigi/FreeBSD/head/sys/kern/kern_conf.c	(revision 273452)
+++ /home/luigi/FreeBSD/head/sys/kern/kern_conf.c	(working copy)
_at__at_ -224,10 +224,11 _at__at_
 	}
 
 	csw = NULL;
-	dev_lock();
+	ASSERT_VI_UNLOCKED(vp, __func__);
+	VI_LOCK(vp); // dev_lock();
 	dev = vp->v_rdev;
 	if (dev == NULL) {
-		dev_unlock();
+		VI_UNLOCK(vp); // dev_unlock();
 		return (NULL);
 	}
 	cdp = cdev2priv(dev);
_at__at_ -236,7 +237,7 _at__at_
 		if (csw != NULL)
 			atomic_add_long(&dev->si_threadcount, 1);
 	}
-	dev_unlock();
+	VI_UNLOCK(vp); // dev_unlock();
 	if (csw != NULL) {
 		*devp = dev;
 		*ref = 1;
Received on Mon Nov 10 2014 - 00:45:44 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:53 UTC