Re: should a copy_file_range(2) syscall be interrupted via a signal

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Fri, 5 Jul 2019 15:51:49 +0000

Mark Johnston wrote:
>On Fri, Jul 05, 2019 at 12:28:51AM +0000, Rick Macklem wrote:
>> Hi,
>>
>> I have been working on a Linux compatible copy_file_range(2) syscall
>> (the current code can be found at https://reviews.freebsd.org/D20584).
>>
>> One outstanding issue is how it should deal with signals.
>> Right now, I have vn_start_write() without PCATCH, so that it won't be
>> interrupted by a signal, but I notice that vn_write() (i.e. the write syscall) does
>> have PCATCH on vn_start_write() and so does vn_rdwr() when it is called
>> without IO_NODELOCKED.
>>
>> I am thinking that copy_file_range(2) should do this also.
>> However, if it returns an error, it is impossible for the caller to know how much
>> of the data range got copied.
>
>Couldn't copy_file_range() return the number of bytes copied in this
>case?  (The Linux man page notes that short writes are possible.) I
>would expect to see the same error handling that we have in
>dofilewrite(), where certain errnos are squashed.
I think this would be a good approach for local file systems. I believe that
the only place EINTR can be generated is the vn_start_write() call, because
vn_rdwr(IO_NODELOCKED) never returns it and always completes before
returning.

As such, the EINTR happens at a "well known" place in the copy and a return of
the bytes copied should be fine.
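
As a rough illustration of the dofilewrite()-style squashing Mark suggests, the
rule could look like the sketch below. This is not code from D20584; the helper
name squash_copy_error() and its arguments are made up for the example, and it
only models the decision, not the copy loop itself.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from D20584): decide what to report when a
 * copy stops with `error` after `copied` bytes have been transferred.
 * Returns 0 if the partial count should be reported as success,
 * otherwise the errno value to hand back to the caller.
 */
static int
squash_copy_error(int error, size_t copied)
{
	/* Nothing was copied, so the caller has to see the error. */
	if (copied == 0)
		return (error);
	/*
	 * Some data was copied and the interruption is benign: report a
	 * short copy instead. The kernel's write path also treats ERESTART
	 * this way, as far as I recall; it is left out here because
	 * userland <errno.h> need not define it.
	 */
	if (error == EINTR || error == EWOULDBLOCK)
		return (0);
	return (error);
}
```

With this rule, an EINTR from vn_start_write() after 4096 bytes were copied
turns into a successful return of 4096, while an EINTR before any data moved
is still returned as an error.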

Now, for NFS, it gets a little weird...
- For NFSv3, many use the "intr" mount option, which means that a VOP_WRITE()
  can return EINTR and the caller doesn't know if the write succeeded on the NFS
  server or not.
  --> Returning "bytes copied" instead of an error for this case doesn't seem
       appropriate to me, since there is no way to know if the last write happened.
However, "intr" is not recommended for NFSv4, and NFSv4.2 is the only case where
there is an RPC to do the copy on the server.

Maybe nfs_copy_file_range() shouldn't "hide" EINTR, although the local file
systems do so.

I think that sounds like a good approach.
What do others think?

>> What do you think the copy_file_range(2) code should do?
>
>I'd find it surprising if copy_file_range() isn't interruptible.
I'll admit I haven't tested on Linux, so I don't know what happens there.
The Linux man page doesn't mention EINTR, but I don't know what happens
for a Linux "intr" NFS mount. I do have a Linux system for testing, but it is the
same system I have been using to test this syscall on FreeBSD. Maybe I need to
boot/play around with it.

I do think returning "bytes copied" instead of EINTR is a good idea, where practical.
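
If the syscall does return "bytes copied" on interruption, callers have to be
prepared for short copies. A minimal caller-side sketch follows. The names
copy_fully(), copy_step_fn, struct mem_copy and mem_step() are all invented for
this example; the step callback stands in for one copy_file_range() call, and
the in-memory version exists only so the loop can be run without real files.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical callback: copy up to len bytes and return the count,
 * 0 at end of input, or -1 with errno set (like copy_file_range()).
 */
typedef ssize_t (*copy_step_fn)(void *ctx, size_t len);

/*
 * Keep calling step until len bytes are copied, input ends, or an
 * error occurs. Partial progress is reported as a short count; -1 is
 * returned only when nothing at all was copied.
 */
static ssize_t
copy_fully(copy_step_fn step, void *ctx, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = step(ctx, len - done);

		if (n < 0 && done == 0)
			return (-1);	/* nothing copied: errno from step stands */
		if (n <= 0)
			break;		/* end of input, or error after progress */
		done += (size_t)n;
	}
	return ((ssize_t)done);
}

/* Stand-in for a file copy: memory to memory, at most chunk bytes per call. */
struct mem_copy {
	const char *src;	/* source bytes */
	char *dst;		/* destination buffer */
	size_t size;		/* bytes available in src */
	size_t off;		/* bytes copied so far */
	size_t chunk;		/* per-call limit, forces short copies */
};

static ssize_t
mem_step(void *ctx, size_t len)
{
	struct mem_copy *mc = ctx;
	size_t n = mc->size - mc->off;

	if (n > len)
		n = len;
	if (n > mc->chunk)
		n = mc->chunk;
	memcpy(mc->dst + mc->off, mc->src + mc->off, n);
	mc->off += n;
	return ((ssize_t)n);
}
```

In a real program the step would presumably be a thin wrapper around
copy_file_range(infd, NULL, outfd, NULL, len, 0). Whether to retry after an
EINTR with nothing copied is left to the caller here, since for the NFS "intr"
case discussed above the outcome of the last write is unknown.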

Thanks for the comments, rick
Received on Fri Jul 05 2019 - 13:51:52 UTC
