Order of files with 'cp'

From: Brian Candler <B.Candler_at_pobox.com>
Date: Wed, 16 Nov 2005 16:15:40 +0000
I've noticed on FreeBSD-5.4 and -6.0 that the order in which 'cp' copies
multiple files does not match the order they're given on the command line.
This is noticeable when the target server is remote and/or slow (e.g. NFS;
USB flash device).

I guess it's not especially important, but it's slightly annoying in that I
have a dumb USB MP3 player, and it plays the tracks in the raw order they
appear in the filesystem, not by sorting filenames or anything like that.

I've had a look through the code, and it seems that cp calls fts_open() with
the list of files in argv; fts_open then does a qsort() on the arguments,
using the comparison function mastercmp() provided by cp:

/*
 * mastercmp --
 *      The comparison function for the copy order.  The order is to copy
 *      non-directory files before directory files.  The reason for this
 *      is because files tend to be in the same cylinder group as their
 *      parent directory, whereas directories tend not to be.  Copying the
 *      files first reduces seeking.
 */

This seems reasonable enough, but I think it would be good to preserve order
when all the arguments are files. This could be done at little expense. I
can think of several ways:

(1) /usr/src/bin/cp/cp.c

Update mastercmp so that it falls back to comparing the argv[] indexes if
otherwise it would return 0. I thought the fts_number member of the FTSENT
structure could be used for this purpose, although it is currently being
used as a one-bit flag (pflag/dne). This flag could be moved to a high bit
instead.
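
A rough sketch of what I mean by (1), not a patch against cp.c. The struct,
DNE_FLAG and INDEX_MASK are made-up stand-ins for FTSENT, the existing flag
and its new position; the files-first direction follows the intent stated in
the mastercmp comment:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the FTSENT fields that matter here. */
struct entry {
	int	is_dir;		/* stands in for fts_info == FTS_D */
	long	number;		/* stands in for fts_number */
};

#define	DNE_FLAG	(1L << 30)	/* assumed: the flag moved to a high bit */
#define	INDEX_MASK	(DNE_FLAG - 1)	/* low bits hold the argv[] index */

/* qsort-style comparator over an array of struct entry pointers. */
static int
cmp_with_index(const void *va, const void *vb)
{
	const struct entry *a = *(const struct entry * const *)va;
	const struct entry *b = *(const struct entry * const *)vb;
	long ia, ib;

	/* Files before directories. */
	if (a->is_dir != b->is_dir)
		return (a->is_dir ? 1 : -1);

	/* Same kind: fall back to command-line position. */
	ia = a->number & INDEX_MASK;
	ib = b->number & INDEX_MASK;
	return ((ia > ib) - (ia < ib));
}
```

Since no two arguments share an index, the comparator never returns 0 for
distinct entries, so the result no longer depends on how qsort() treats ties.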

(2) /usr/src/lib/libc/gen/fts.c

Before calling qsort(), call the comparison function on each adjacent pair of
items in turn. If it returns a value <= 0 in every case, then the list is
already ordered and there is no need to call qsort(), which may reorder items
that compare equal. This covers the common cases where the sources are either
all files or all directories.
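
Roughly what (2) could look like, as a standalone sketch rather than a diff to
fts.c. The names already_ordered(), sort_unless_ordered(), struct item and
files_first() are invented for illustration, and the check assumes the
comparison function is transitive:

```c
#include <stddef.h>
#include <stdlib.h>

typedef int (*cmp_fn)(const void *, const void *);

/* Hypothetical item: only the file/directory distinction matters. */
struct item {
	int	is_dir;
	int	id;		/* original position, for inspection only */
};

/* Sample comparator: files before directories, same kind compares equal. */
static int
files_first(const void *a, const void *b)
{
	return (((const struct item *)a)->is_dir -
	    ((const struct item *)b)->is_dir);
}

/* Return 1 if no adjacent pair is out of order, else 0. */
static int
already_ordered(const void *base, size_t nmemb, size_t size, cmp_fn compar)
{
	const char *p = base;
	size_t i;

	for (i = 0; i + 1 < nmemb; i++)
		if (compar(p + i * size, p + (i + 1) * size) > 0)
			return (0);
	return (1);
}

/* Only hand the array to qsort() when it is actually out of order. */
static void
sort_unless_ordered(void *base, size_t nmemb, size_t size, cmp_fn compar)
{
	if (!already_ordered(base, nmemb, size, compar))
		qsort(base, nmemb, size, compar);
}
```

The extra pass costs at most nmemb - 1 comparisons, which is small next to
the sort itself.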

(3) replace the call to qsort() with a stable sort, e.g. mergesort(). I
think cp's mastercmp() will still need some tweaking in that case so that
two directories compare as equal, e.g.

        if (a_info == FTS_D)
                return (-1);
        if (b_info == FTS_D)
                return (1);
        return (0);

becomes

        if (a_info == FTS_D && b_info != FTS_D)
                return (-1);
        if (b_info == FTS_D && a_info != FTS_D)
                return (1);
        return (0);
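
To show why stability matters for (3), here is a small self-contained sketch.
It uses a hand-written insertion sort as a portable stand-in for mergesort(),
and struct node, kind_cmp() and stable_sort() are invented names; the
files-first direction follows the intent in the mastercmp comment:

```c
#include <stddef.h>

/* Hypothetical node: is_dir stands in for fts_info == FTS_D. */
struct node {
	int	is_dir;
	int	id;		/* position on the command line */
};

/* Files (0) before directories (1); two of the same kind compare equal. */
static int
kind_cmp(const struct node *a, const struct node *b)
{
	return (a->is_dir - b->is_dir);
}

/* Stable insertion sort over an array of pointers. */
static void
stable_sort(struct node **v, size_t n)
{
	size_t i, j;

	for (i = 1; i < n; i++) {
		struct node *key = v[i];

		/* Shift only strictly greater items so ties keep their order. */
		for (j = i; j > 0 && kind_cmp(v[j - 1], key) > 0; j--)
			v[j] = v[j - 1];
		v[j] = key;
	}
}
```

With a comparator that returns 0 for two entries of the same kind, a stable
sort leaves them in command-line order within each group.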

Anyone have any thoughts on this?

Regards,

Brian.
Received on Wed Nov 16 2005 - 15:15:54 UTC
