Re: umount -f implementation

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Tue, 30 Jun 2009 22:32:48 +0300

On Tue, Jun 30, 2009 at 12:01:21PM -0400, Rick Macklem wrote:
> 
> 
> On Mon, 29 Jun 2009, Attilio Rao wrote:
> 
> >
> >While that should be true in principle (immediate shutdown of the fs
> >operation and unmounting of the partition) it is totally impossible to
> >have it completely unsleeping, so it can happen that umount -f also
> >sleeps / delays for some time (example: vflush).
> >Currently, umount -f is one of the most complicated things to handle in
> >our VFS because it requires that vnodes can be reclaimed at
> >any moment, adding complexity and the possibility of races.
> >
> >What's the fix for your problem?
> >
> From other responses, it does look like pursuing this is appropriate
> and that current behaviour is considered a bug.
> 
> I should have noted in the previous email that I suspected that my simple 
> patch didn't handle all cases, which I have just determined via testing.
> 
> Unfortunately, the thread doing "umount" can also get stuck in an msleep() 
> while waiting for the mnt_lockref to go to 0, which happens before the
> VFS_UNMOUNT() call. (mnt_lockref gets incremented by various system
> calls that call vfs_busy().)
> 
> I think I can fix this in the experimental nfsv4 client, since it has
> a kernel thread that can check for MNTK_UNMOUNTF being set and then
> kill off the RPCs in progress, but that won't help the regular client.
This solution sounds good, but see below.

> 
> It's starting to look like too much work for FreeBSD8, but sounds like
> it is worth pursuing. (Apologies to anyone that thought I would have it
> all fixed in a day or two.)

It may be argued by some people, me included, that umount -f shall not
override any ownership of kernel resources. In particular, you must
not ignore the lockref. Instead, the threads that own misc filesystem
resources, like the mount reference counter, locked vnodes etc., shall be
weeded out of the syscalls. E.g., finishing stalled RPC calls with some
error code that is propagated to the return code from the VOPs is a good
solution.
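
To make the idea concrete, here is a rough userspace model, not kernel
code. pthreads stand in for kernel threads, and every name in it
(fake_mount, fake_vop, fake_forced_unmount, ...) is made up for the
illustration. The only point is the ordering: the unmounter aborts the
stalled calls with an error and then waits for the owners to drop their
references, instead of ignoring the count.

```c
/* Userspace model only; all names are hypothetical, none is a FreeBSD API. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct fake_mount {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;       /* signalled on abort and on every ref change */
    int             lockref;  /* models mnt_lockref: callers inside the fs */
    bool            unmountf; /* models MNTK_UNMOUNTF being set */
    bool            reply;    /* would be set if the "server" ever answered */
};

static void fake_mount_init(struct fake_mount *mp)
{
    pthread_mutex_init(&mp->mtx, NULL);
    pthread_cond_init(&mp->cv, NULL);
    mp->lockref = 0;
    mp->unmountf = false;
    mp->reply = false;
}

/* Models a VOP: take a reference, wait in a stalled RPC, return its error. */
static int fake_vop(struct fake_mount *mp)
{
    int error;

    pthread_mutex_lock(&mp->mtx);
    mp->lockref++;
    pthread_cond_broadcast(&mp->cv);
    /* Stalled RPC: sleep until a reply arrives or a forced unmount aborts it. */
    while (!mp->reply && !mp->unmountf)
        pthread_cond_wait(&mp->cv, &mp->mtx);
    error = mp->reply ? 0 : EIO;
    mp->lockref--;                      /* the owner itself drops the reference */
    pthread_cond_broadcast(&mp->cv);
    pthread_mutex_unlock(&mp->mtx);
    return error;
}

/* Forced unmount: abort stalled RPCs, then wait for lockref to reach 0. */
static void fake_forced_unmount(struct fake_mount *mp)
{
    pthread_mutex_lock(&mp->mtx);
    mp->unmountf = true;
    pthread_cond_broadcast(&mp->cv);
    while (mp->lockref > 0)
        pthread_cond_wait(&mp->cv, &mp->mtx);
    pthread_mutex_unlock(&mp->mtx);
}

struct fake_caller {
    struct fake_mount *mp;
    int                error;
};

static void *fake_caller_main(void *arg)
{
    struct fake_caller *c = arg;

    c->error = fake_vop(c->mp);
    return NULL;
}

/* Starts up to 8 stalled callers, force-unmounts, returns how many saw EIO. */
static int demo_forced_unmount(int nthreads)
{
    enum { MAX_CALLERS = 8 };
    struct fake_mount mp;
    struct fake_caller callers[MAX_CALLERS];
    pthread_t tids[MAX_CALLERS];
    int i, started = 0, eio = 0;

    if (nthreads > MAX_CALLERS)
        nthreads = MAX_CALLERS;
    fake_mount_init(&mp);
    for (i = 0; i < nthreads; i++) {
        callers[i].mp = &mp;
        callers[i].error = -1;
        if (pthread_create(&tids[i], NULL, fake_caller_main, &callers[i]) != 0)
            break;
        started++;
    }
    /* Wait until every started caller holds a reference and is stalled. */
    pthread_mutex_lock(&mp.mtx);
    while (mp.lockref < started)
        pthread_cond_wait(&mp.cv, &mp.mtx);
    pthread_mutex_unlock(&mp.mtx);

    fake_forced_unmount(&mp);
    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (callers[i].error == EIO)
            eio++;
    }
    assert(mp.lockref == 0);
    return eio;
}
```

Built with cc -pthread, demo_forced_unmount(4) returns 4: each stalled
caller comes back with EIO and releases its own reference before the
unmount goes any further.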

Quite similar problems happen with SIGSTOP and intr NFS mounts.
You saw the proposed solution, which is quite similar: it forces the
threads owning the resources to progress to the syscall boundary.

Another problem with forced unmounts is that VFS does not block new
threads from entering VOPs. When finishing the in-flight RPCs,
you may either leave some new RPCs behind or loop infinitely chasing
RPCs that arrive while you are finishing the old ones.
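
The chase can be shown with a small deterministic model. This is only a
sketch under assumptions: the gate (fake_gate, its closing flag, the
ENXIO return) is hypothetical and is not something VFS provides today.
In each round the drainer finishes one old operation and one new
operation tries to arrive. Without a gate the in-flight count never
drops; with a gate that refuses new entries it reaches zero.

```c
/* Single-threaded model; every name here is hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

struct fake_gate {
    atomic_bool closing;   /* hypothetical "refuse new entries" flag */
    atomic_int  inflight;  /* operations currently inside the filesystem */
};

static void fake_gate_init(struct fake_gate *g)
{
    atomic_init(&g->closing, false);
    atomic_init(&g->inflight, 0);
}

/* Leave the filesystem: drop one in-flight operation. */
static void fake_gate_exit(struct fake_gate *g)
{
    int prev = atomic_fetch_sub(&g->inflight, 1);

    assert(prev > 0);
    (void)prev;
}

/*
 * Try to enter. The count is raised before the flag is checked, so a
 * drainer that set the flag first cannot miss this entry.
 * ENXIO is an arbitrary error choice for this sketch.
 */
static int fake_gate_enter(struct fake_gate *g)
{
    atomic_fetch_add(&g->inflight, 1);
    if (atomic_load(&g->closing)) {
        fake_gate_exit(g);
        return ENXIO;
    }
    return 0;
}

/*
 * Each round finishes one old operation while one new operation tries to
 * arrive. Returns the number of rounds needed to drain, or -1 if the
 * count is still non-zero after max_rounds.
 */
static int fake_drain(struct fake_gate *g, bool use_gate, int max_rounds)
{
    int round = 0;

    if (use_gate)
        atomic_store(&g->closing, true);
    while (atomic_load(&g->inflight) > 0) {
        if (round == max_rounds)
            return -1;
        fake_gate_exit(g);          /* an old operation is finished */
        (void)fake_gate_enter(g);   /* a new one arrives in the meantime */
        round++;
    }
    return round;
}
```

With three operations in flight, fake_drain(&g, false, 1000) gives up
and returns -1, while fake_drain(&g, true, 1000) returns 3.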

A half-measure is filesystem suspension, which keeps operations
that modify the filesystem from entering VOPs. UFS uses suspension
for unmounts and rw->ro remounts.

Umount -f is needed in two different situations. One is a normally working
filesystem that shall be unmounted by administrative request, detaching
any resources opened by applications. The second is the last-resort action
when the backing storage (the server in the NFS case, the disk for UFS) is
misbehaving. I think we must not break the first case for the sake of the
second.