Re: copyin+copyout in one step ?

From: Konstantin Belousov <kostikbel_at_gmail.com> Date: Tue, 28 May 2013 07:30:20 +0300

On Tue, May 28, 2013 at 01:38:01AM +0200, Luigi Rizzo wrote:
> Hi,
> say a process P1 wants to use the kernel to copy the content of a
> buffer SRC (in its user address space) to a buffer DST (in the
> address space of another process P2), and assume that P1 issues the
> request to the kernel when P2 has already told the kernel where the
> data should go:
> 
>          P1                        P2
>         +------+                 +--------+
>         | SRC  |                 | DST    |
>         +--v---+                 +--^-----+
>    --------|------------------------|----------
>            |                        |    kernel
>            |                        ^
> 
>            |                        |
>            |      +--------+        |
>            +----->| tmpbuf +--------+
>             copyin|        | copyout
>             P1 ctx+--------+  P2 ctx
> 
> I guess the one above is the canonical way: P1 does a copyin() to a
> temporary buffer, then notifies P2 which can then issue or complete
> a syscall to do a copyout from tmpbuf to DST in P2's context.
> 
> 
> But I wonder, is it possible to do it as follows: P2 tells the kernel
> where the data should go (DST); later, P1 issues a system call and
> through a combined "copyinout()" moves data directly from SRC to DST,
> operating in the context of P1.
> 
>            |      copyinout() ?     | 
>            +------------>-----------+
>                    issued by P1
> 
> 
> Is this doable at all ? I suspect that "tell DST to the kernel"
> might be especially expensive as it needs to pin the page
> so it is accessible while doing the syscall for P1 ?
> (the whole point for this optimization is saving the extra
> copy through the buffer, but it may be pointless if pinning
> the memory is more expensive than the copy)

Yes, it is doable.  If the copy can happen while either P1 or P2 is the
active context, then proc_rwmem() already performs exactly what you
want.  The virtual address in the address space of the 'other process'
is specified as uio->uio_offset.  The iov specifies the region(s) for
the current process.
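
A minimal sketch of that first case, with P1 as the active context writing
SRC into P2 at DST. It assumes a FreeBSD kernel of roughly this vintage;
the helper name copy_to_other_proc() and its parameters are hypothetical,
and the PHOLD()/PRELE() pair reflects the assumption that the target
process must be held across the call. It is untested and meant only to
show how the uio is filled in.

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <sys/ptrace.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: copy len bytes from src (a user address in the
 * current process, P1) to dst (a user address in process p2).
 * Runs in P1's context; td is expected to be curthread.
 */
static int
copy_to_other_proc(struct thread *td, struct proc *p2, void *src,
    vm_offset_t dst, size_t len)
{
	struct iovec iov;
	struct uio uio;
	int error;

	/* The iov describes the region in the current process (SRC). */
	iov.iov_base = src;
	iov.iov_len = len;

	uio.uio_iov = &iov;
	uio.uio_iovcnt = 1;
	/* The address in the other process (DST) goes in uio_offset. */
	uio.uio_offset = (off_t)dst;
	uio.uio_resid = len;
	uio.uio_segflg = UIO_USERSPACE;	/* src is a user-space address */
	uio.uio_rw = UIO_WRITE;		/* write into p2's address space */
	uio.uio_td = td;

	PHOLD(p2);			/* assumption: keep p2 held for the call */
	error = proc_rwmem(p2, &uio);
	PRELE(p2);

	return (error);
}
```

With uio_rw set to UIO_READ instead, the same call would move data the
other way, from P2's address space into the iov of the current process.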

If you want to perform the copy from P1 to P2 while some other context
P3 is active, the same structure as proc_rwmem() would work, but you 
obviously would need to do vm_fault_hold() for both sides, and use
pmap_copy_pages() instead of uiomove_fromphys().

In either case, you get the copy without a temporary buffer, but the setup
cost could be non-trivial. You never know until you measure.