Re: cp(1) of large files is causing 100% CPU utilization and poor transfer

From: Alan Somers <asomers_at_freebsd.org>
Date: Sat, 2 Jan 2021 15:08:56 -0700
LGTM!  This patch also fixes another problem: the previous version of cp,
when copying a large sparse file on UFS, would create some UFS indirect
blocks (because it would keep truncating the file to larger sizes).  The
output file would still be sparse, but it would take up more space than the
original.  IIRC about 0.2% of the empty space would get used by UFS
indirect blocks.  But your patch fixes it.

What I said earlier about needing to modify vn_generic_copy_file_range
wasn't quite correct.  I confused len with xfer when I was reading the
code.  The change I proposed to vn_generic_copy_file_range would only make
a difference if the process receives many interrupts.

And here's some background for other people reading the thread: the
initial copy_file_range implementation in cp used only a 2 MB block size
because vn_generic_copy_file_range wasn't always interruptible, and we
didn't want cp to block for minutes or even hours during a long transfer.
Subsequently rmacklem made vn_generic_copy_file_range interruptible, but
we never raised the block size in cp.
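
To put rough numbers on the block-size point, here is a minimal sketch
(not the actual cp code and not Rick's patch).  min_copy_calls is a
hypothetical helper that gives a lower bound on the number of
copy_file_range(2) calls for a given len.  It assumes the 20Gbyte test
file means 20 GiB, and it uses INT64_MAX as a stand-in for SSIZE_MAX,
which has the same value on 64-bit platforms.  The kernel may copy fewer
bytes than requested per call, so real counts can be higher.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of cp(1): the minimum number of
 * copy_file_range(2) calls needed to copy file_size bytes when each
 * call requests at most len bytes.  Returns 0 for an empty file or
 * for an invalid len of 0.
 */
uint64_t
min_copy_calls(uint64_t file_size, uint64_t len)
{
	if (file_size == 0 || len == 0)
		return (0);
	/* Ceiling division, written to avoid overflow in file_size + len. */
	return (file_size / len + (file_size % len != 0 ? 1 : 0));
}

/* Sizes used for illustration only (assumed, see the note above). */
#define	FILE_SIZE_20G	(20ULL << 30)		/* 20 GiB test file */
#define	LEN_OLD		(2ULL << 20)		/* old 2 MB len in cp */
#define	LEN_NEW		((uint64_t)INT64_MAX)	/* stand-in for SSIZE_MAX */
```

With the old 2 MB len, min_copy_calls(FILE_SIZE_20G, LEN_OLD) is 10240,
and per Rick's explanation quoted below, each of those calls repeats the
SEEK_DATA and SEEK_HOLE work.  With LEN_NEW the lower bound drops to a
single call.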

-Alan

On Sat, Jan 2, 2021 at 2:42 PM Rick Macklem <rmacklem_at_uoguelph.ca> wrote:

> The attached small patch seems to fix the problem.
> My hunch is that, for a large non-sparse file, the SEEK_DATA and
> SEEK_HOLE operations take a fairly long time.
> These are done for each copy_file_range(2) syscall.
>
> cp was doing lots of them because of the small len argument.
> Bumping the len up to SSIZE_MAX results in far fewer syscalls
> and, therefore, SEEK_DATAs and SEEK_HOLEs.
>
> Without the patch, cp took 6 times as long as dd.
> With the patch, cp takes less time than dd.
>
> I'll put the patch on the bug report. Matthias, can you test
> the patch?
>
> Thanks for reporting this, rick
> ps: All my test programs use SSIZE_MAX unless they were
>      not supposed to copy to eof, which explains why I
>      missed this. My bad, for the testing.;-)
>
> ________________________________________
> From: owner-freebsd-current_at_freebsd.org <owner-freebsd-current_at_freebsd.org>
> on behalf of Matthias Apitz <guru_at_unixarea.de>
> Sent: Saturday, January 2, 2021 3:05 PM
> To: Alan Somers
> Cc: Rick Macklem; FreeBSD CURRENT; Konstantin Belousov; Kirk McKusick
> Subject: Re: cp(1) of large files is causing 100% CPU utilization and poor
> transfer
>
>
> On Saturday, January 02, 2021 at 11:29:36 a.m. -0700, Alan Somers
> wrote:
>
> > > On Saturday, January 02, 2021 at 05:06:05 p.m. +0000, Rick Macklem
> > > wrote:
> > >
> > > > Just fyi, I've reproduced the problem.
> > > > All I did was create a 20Gbyte file
> > > > on UFS on a slow (4Gbyte of RAM,
> > > > slow spinning disk) laptop.
> > > > (The UFS file system is just what the installer creates these days.)
> > > >
> > > > cp still hasn't finished and is definitely
> > > > taking a looott longer than dd did.
> > > >
> > > > I'll start drilling down later to-day.
> > > >
> > > > I'll admit doing lots of testing of copy_file_range(2)
> > > > with large sparse files, but I may have missed testing
> > > > a large non-sparse file.
> > > >
> > > > rick
> > > > ps: I've added Kostik and Kirk to the cc.
> > >
> > > As the problem seems to be clear now, should I still file a PR?
> > > I'm happy to do so.
> > >
> >
> > Yes, please.  That will help ensure that we don't lose track of it.
>
> Here we go: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252358
>
> Thanks
>
>         matthias
>
> --
> Matthias Apitz, ✉ guru_at_unixarea.de, http://www.unixarea.de/
> +49-176-38902045
> Public GnuPG key: http://www.unixarea.de/key.pub
> ¡Con Cuba no te metas!  «»  Don't mess with Cuba!  «»  Leg Dich nicht mit
> Kuba an!
> http://www.cubadebate.cu/noticias/2020/12/25/en-video-con-cuba-no-te-metas/
Received on Sat Jan 02 2021 - 21:09:08 UTC
