Re: protection against module unloading ?

From: Konstantin Belousov <kostikbel_at_gmail.com> Date: Mon, 13 Jul 2015 15:46:03 +0300 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:58 UTC

On Mon, Jul 13, 2015 at 02:28:40PM +0200, Luigi Rizzo wrote:
> Hi,
> I am trying to understand how to protect efficiently against
> module removals when a device driver is in use.
> This issue came up some time ago when trying netmap
> loaded as a module.
> 
> --- Summary of the current mode of operation (as i understand it): ---
> 
> When a device is compiled as a module, all devfs calls in
> sys/fs/devfs/devfs_vnops.c are protected by setting
> dev->si_threadcount (within dev*_refthread()) for the duration of the call.
> 
> This is implemented using devfs_fp_check()/dev_relthread() which
> in turn, for the module case, updates the dev->si_threadcount under
> dev_lock() which is a single shared mutex, devmtx .
> 
> The above is problematic for driver modules that do a lot of I/O
> using poll, ioctl, read or write, because of the system-wide contention
> on devmtx
> 
> To mitigate the contention, statically compiled modules have the
> SI_ETERNAL attribute that prevents the lock (major contention point)
> and si_threadcount update.
> 
> --- Alternative ?
> 
> I was hoping to make the protection mechanism cheaper by only
> increasing  dev->si_threadcount on devfs_open() without calling
> dev_relthread() ), and then decreasing si_threadcount on defvs_close() .
> This way the reference is active for the lifetime of the file descriptor
> without needing to track individual high-frequency calls.
> 
> This is probably not enough because there could be mmap handlers,
> which remain active even after the device is closed.
This is of course wrong because destroy_dev(9) would be blocked until
all file descriptors referencing the device are closed.  The blocked
thread which called destroy_dev() is blocked hard, i.e. you cannot
interrupt it with a signal.  The situation is incomprehensible for
the user issued the kldunload command.

The current si_threadcount mechanism is designed and intended to work on
the code call granulariry, to allow the device destruction and driver
unload when there is no code call into the driver.

> 
> However I do not think the current devfs_fp_check() suffices either,
> and so drivers that have their own mmap code in a module that can be
> unloaded must already implement their own mechanism to protect
> against unloading.
Drivers which use (old) device pager cannot control the lifespan of
the mmapped regions.  The only way to properly track and clean up the
mappings is to use the MGTDEVICE pager, which creates the managed
fictitious mapping for the device mmaps.  Device destruction causes
correct removal of the ptes for the fictitious pages created by the
driver page fault handler.  See examples in the GEM and TTM.

> In fact, this could be achieved easily by making these drivers use
> dev_refthread() to hold a refcount while active.
No, this is reversed and breaks the correct mechanism.

> 
> Would the above work ? Anything missing ?
> 
> cheers
> luigi
> 
> -- 
> -----------------------------------------+-------------------------------
>  Prof. Luigi RIZZO, rizzo_at_iet.unipi.it  . Dip. di Ing. dell'Informazione
>  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
>  TEL      +39-050-2217533               . via Diotisalvi 2
>  Mobile   +39-338-6809875               . 56122 PISA (Italy)
> -----------------------------------------+-------------------------------
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"