Re: copyin+copyout in one step ?

From: Alfred Perlstein <bright_at_mu.org> Date: Mon, 27 May 2013 18:10:11 -0700

On 5/27/13 4:56 PM, Alfred Perlstein wrote:
> On 5/27/13 4:38 PM, Luigi Rizzo wrote:
>> Hi,
>> say a process P1 wants to use the kernel to copy the content of a
>> buffer SRC (in its user address space) to a buffer DST (in the
>> address space of another process P2), and assume that P1 issues the
>> request to the kernel when P2 has already told the kernel where the
>> data should go:
>>
>>           P1                        P2
>>          +------+                 +--------+
>>          | SRC  |                 | DST    |
>>          +--v---+                 +--^-----+
>>     --------|------------------------|----------
>>             |                        |    kernel
>>             |                        ^
>>             |                        |
>>             |      +--------+        |
>>             +----->| tmpbuf +--------+
>>              copyin|        | copyout
>>              P1 ctx+--------+  P2 ctx
>>
>> I guess the one above is the canonical way: P1 does a copyin() to a
>> temporary buffer, then notifies P2 which can then issue or complete
>> a syscall to do a copyout from tmpbuf to DST in P2's context.
>>
>>
>> But I wonder, is it possible to do it as follows: P2 tells the kernel
>> where the data should go (DST); later, P1 issues a system call and
>> through a combined "copyinout()" moves data directly from SRC to DST,
>> operating in the context of P1.
>>
>>             |      copyinout() ?     |
>>             +------------>-----------+
>>                     issued by P1
>>
>>
>> Is this doable at all ? I suspect that "tell DST to the kernel"
>> might be especially expensive as it needs to pin the page
>> so it is accessible while doing the syscall for P1 ?
>> (the whole point for this optimization is saving the extra
>> copy through the buffer, but it may be pointless if pinning
>> the memory is more expensive than the copy)
>>
> I suspect you'll want to use something like vslock(9) and sf_bufs.
>
> Have a look at vm/vm_glue.c -> vslock() and vm_imgact_hold_page().
>
> On amd64, I *think* you can map an sf_buf or, if you are really evil,
> optimistically wire the page in the VM (cheap). If it's present then
> you can just use the direct map to access it. However, if it's not
> present, then fall back to another method, or maybe just fault it in
> (which will have to happen anyhow) and then retry.
>
> Sounds like a cool project!
>
> -Alfred
Oh, one other thing: look at the pipe code. It used to do what you
suggest; I think, however, that it was driven by the READER pinning the
WRITER's address space and doing a direct copy. It may not be optimized
for NOT mapping into KVA, though, as I suggested doing.
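
To make the two data paths from the diagram concrete, here is a rough
userland model. It is only a sketch, not kernel code: memcpy() stands in
for copyin()/copyout(), and struct xfer_chan and every chan_* function
are hypothetical names made up for illustration (copyinout() does not
exist in the tree). It assumes DST stays valid for as long as it is
registered, which is exactly the part that vslock(9)-style wiring would
have to guarantee in a real kernel.

```c
#include <stddef.h>
#include <string.h>

#define TMPBUF_SIZE 4096

/* Hypothetical stand-in for the kernel-side state shared by P1 and P2. */
struct xfer_chan {
    char   *dst;                 /* DST, as registered by P2 */
    size_t  dst_len;             /* capacity of DST in bytes */
    char    tmpbuf[TMPBUF_SIZE]; /* intermediate buffer for the canonical path */
    size_t  tmp_len;             /* valid bytes currently held in tmpbuf */
    size_t  bytes_copied;        /* total bytes moved, to compare the two paths */
};

/* P2 tells the "kernel" where the data should go. */
static void
chan_register_dst(struct xfer_chan *ch, char *dst, size_t len)
{
    ch->dst = dst;
    ch->dst_len = len;
}

/* Canonical path, step 1 (P1 context): models copyin() SRC -> tmpbuf. */
static int
chan_copyin(struct xfer_chan *ch, const char *src, size_t len)
{
    if (len > sizeof(ch->tmpbuf))
        return (-1);
    memcpy(ch->tmpbuf, src, len);
    ch->tmp_len = len;
    ch->bytes_copied += len;
    return (0);
}

/* Canonical path, step 2 (P2 context): models copyout() tmpbuf -> DST. */
static int
chan_copyout(struct xfer_chan *ch)
{
    if (ch->dst == NULL || ch->tmp_len > ch->dst_len)
        return (-1);
    memcpy(ch->dst, ch->tmpbuf, ch->tmp_len);
    ch->bytes_copied += ch->tmp_len;
    ch->tmp_len = 0;
    return (0);
}

/* Proposed path (P1 context): models the combined SRC -> DST copy. */
static int
chan_copyinout(struct xfer_chan *ch, const char *src, size_t len)
{
    if (ch->dst == NULL || len > ch->dst_len)
        return (-1);
    memcpy(ch->dst, src, len);
    ch->bytes_copied += len;
    return (0);
}
```

For a transfer of n bytes, bytes_copied ends up at 2n after
chan_copyin() + chan_copyout() and at n after chan_copyinout(). What the
model deliberately leaves out is the cost of keeping DST accessible from
P1's context, which is the open question in this thread.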

-Alfred