Re: posix_fadvise noreuse disables file caching

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 25 Jan 2012 11:29:22 -0500
On Friday, January 20, 2012 2:12:13 pm John Baldwin wrote:
> On Thursday, January 19, 2012 11:39:42 am Tijl Coosemans wrote:
> > Hi,
> > 
> > I recently noticed that multimedia/vlc generates a lot of disk IO when
> > playing media files. For instance, when playing a 320kbps mp3 gstat
> > reports about 1250kBps (=10000kbps). That's quite a lot of overhead.
> > 
> > It turns out that vlc sets POSIX_FADV_NOREUSE on the entire file and
> > reads in chunks of 1028 bytes. FreeBSD implements NOREUSE as if
> > O_DIRECT was specified during open(2), i.e. it disables all caching.
> > That means every 1028 byte read turns into a 32KiB read (new default
> > block size in 9.0) which explains the above numbers.
> > 
> > I've copied the relevant vlc code below (modules/access/file.c:Open()).
> > It's interesting to see that on OSX it sets F_NOCACHE which disables
> > caching too, but combined with F_RDAHEAD there's still read-ahead
> > caching.
> > 
> > I don't think POSIX intended for NOREUSE to mean O_DIRECT. It should
> > still cache data (and even do read-ahead if F_RDAHEAD is specified),
> > and once data is fetched from the cache, it can be marked WONTNEED.
> 
> POSIX doesn't specify O_DIRECT, so it's not clear what it asks for.
> 
> > Is it possible to implement it this way, or if not to just ignore
> > the NOREUSE hint for now?
> 
> I think it would be good to improve NOREUSE, though I had sort of
> assumed that applications using NOREUSE would do their own buffering
> and read full blocks.  We could perhaps reimplement NOREUSE by doing
> the equivalent of POSIX_FADV_DONTNEED after each read to free buffers
> and pages after the data is copied out to userland.  I also have an
> XXX about whether or not NOREUSE should still allow read-ahead as it
> isn't very clear what the right thing to do there is.  HP-UX (IIRC)
> has an fadvise() that lets you specify multiple policies, so you
> could specify both NOREUSE and SEQUENTIAL for a single region to
> get read-ahead but still release memory once the data is read once.
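
As a rough sanity check on the numbers quoted above, the read
amplification can be worked out directly.  This is only a sketch: the
helper name uncached_disk_rate() is made up for illustration, and it
assumes that every 1028-byte read costs exactly one full 32KiB block
fetch with nothing served from cache (reads that straddle a block
boundary are ignored).

```c
/*
 * Hypothetical helper (not part of any patch): estimate the number of
 * bytes per second fetched from disk when each small read(2) is
 * rounded up to one whole filesystem block because caching is off.
 */
double
uncached_disk_rate(double stream_bits_per_sec, double read_size,
    double block_size)
{
	/* Number of read(2) calls needed per second to keep up. */
	double reads_per_sec = (stream_bits_per_sec / 8.0) / read_size;

	/* Assumption: each call triggers exactly one block-sized fetch. */
	return (reads_per_sec * block_size);
}
```

Calling uncached_disk_rate(320000.0, 1028.0, 32768.0) and dividing by
1024 gives roughly 1245 KiB per second, which is in line with the
~1250kBps figure reported from gstat.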

So I've come up with this untested patch.  It uses
VOP_ADVISE(FADV_DONTNEED) after read(2) calls to a NOREUSE region, and
leaves read-ahead caching enabled for NOREUSE.  FADV_DONTNEED doesn't
really do any good for writes (it only flushes clean buffers), so I've
left write(2) operations using IO_DIRECT.  Does this sound
reasonable?  I have not tested this at all yet:

Index: vfs_vnops.c
===================================================================
--- vfs_vnops.c	(revision 230331)
+++ vfs_vnops.c	(working copy)
@@ -519,6 +519,7 @@ vn_read(fp, uio, active_cred, flags, td)
 	int error, ioflag;
 	struct mtx *mtxp;
 	int advice, vfslocked;
+	off_t offset;
 
 	KASSERT(uio->uio_td == td, ("uio_td %p is not td %p",
 	    uio->uio_td, td));
@@ -558,19 +559,14 @@ vn_read(fp, uio, active_cred, flags, td)
 	switch (advice) {
 	case POSIX_FADV_NORMAL:
 	case POSIX_FADV_SEQUENTIAL:
+	case POSIX_FADV_NOREUSE:
 		ioflag |= sequential_heuristic(uio, fp);
 		break;
 	case POSIX_FADV_RANDOM:
 		/* Disable read-ahead for random I/O. */
 		break;
-	case POSIX_FADV_NOREUSE:
-		/*
-		 * Request the underlying FS to discard the buffers
-		 * and pages after the I/O is complete.
-		 */
-		ioflag |= IO_DIRECT;
-		break;
 	}
+	offset = uio->uio_offset;
 
 #ifdef MAC
 	error = mac_vnode_check_read(active_cred, fp->f_cred, vp);
@@ -587,6 +583,10 @@ vn_read(fp, uio, active_cred, flags, td)
 	}
 	fp->f_nextoff = uio->uio_offset;
 	VOP_UNLOCK(vp, 0);
+	if (error == 0 && advice == POSIX_FADV_NOREUSE &&
+	    offset != uio->uio_offset)
+		error = VOP_ADVISE(vp, offset, uio->uio_offset - 1,
+		    POSIX_FADV_DONTNEED);
 	VFS_UNLOCK_GIANT(vfslocked);
 	return (error);
 }
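
For comparison, a userland program can approximate the same
"read, then drop" behaviour on its own.  The following is only a
sketch and not part of the patch: read_and_drop() is a hypothetical
helper, it assumes a seekable descriptor, and unlike the kernel change
above it ignores a failure of the advisory call rather than returning
it.  Note that posix_fadvise(2) takes an offset and a length, whereas
VOP_ADVISE() in the patch takes an inclusive start and end offset.

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from fd at its current
 * offset, then hint that the range just read will not be needed
 * again so its cached pages can be released.
 */
ssize_t
read_and_drop(int fd, void *buf, size_t len)
{
	off_t start;
	ssize_t n;

	/* Remember where this read starts (assumes fd is seekable). */
	start = lseek(fd, 0, SEEK_CUR);
	if (start == (off_t)-1)
		return (-1);

	n = read(fd, buf, len);

	/* Only advise when data was actually transferred. */
	if (n > 0)
		(void)posix_fadvise(fd, start, (off_t)n,
		    POSIX_FADV_DONTNEED);
	return (n);
}
```

An application reading in 1028-byte chunks would call
read_and_drop(fd, buf, 1028) in a loop in place of read(2).  Read-ahead
and the buffer cache stay in effect for the read itself, and only the
bytes already copied out are marked as no longer needed.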

-- 
John Baldwin
Received on Wed Jan 25 2012 - 15:29:27 UTC
