BSDtar performance vs GNUtar (Re: cvs commit: src/usr.bin/tar Makefile bsdtar.c bsdtar.h bsdtar_platform.h config_freebsd.h getdate.y matching.c read.c tree.c util.c write.c src/usr.bin/tar/test config.sh test-acl.sh test-basic.sh test-deep-dir.sh test-flags.sh test-nodump.sh ...)

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Sun, 11 Mar 2007 20:10:26 -0400
On Sun, Mar 11, 2007 at 01:12:01PM -0700, Tim Kientzle wrote:
> >kientzle    2007-03-11 10:36:43 UTC
> >  FreeBSD src repository
> >  bsdtar 2.0.23:
> >     * read.c now relies on security checks in libarchive instead
> >       of trying to do its own...
> 
> Bsdtar should now be considerably faster than before.
> I put a lot of effort over the last few months into
> streamlining the code in libarchive to recreate objects
> on disk.
> 
> I'd appreciate any feedback on the performance of this latest
> bsdtar when restoring archives.  I'm particularly interested in
> performance compared to GNU tar for uncompressed archives
> with and without the "-P" option (which disables the security
> checks).

This is extracting a ~1GB copy of the ports tree (also containing some
other cruft like distfiles and some work directories), to an async
swap backed md, which was destroyed and recreated in between runs.

The first archive was created with bsdtar (tar cvf ports.tar ports)
which made gtar bitch a bit about unknown options (SCHILY.*) when
extracting it.  This did not seem to affect performance though, as I
confirmed by using gtar to recreate the archive itself and then timing
that.

Extracting with -P:

x gtar-real
+ bsdtar-real
+------------------------------------------------------------+
|             +                                              |
|             +   + +             x      x                   |
|+          + +  ++ +    x  x     x  x x x     x            x|
|        |_____AM___|        |________A_________|            |
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10         23.85         28.92          25.7        25.816     1.4355115
+  10         20.41         23.14        22.535        22.409    0.79712609
Difference at 95.0% confidence
        -3.407 +/- 1.09092
        -13.1972% +/- 4.22577%
        (Student's t, pooled s = 1.16106)

i.e. bsdtar has gone from being about 40% slower than gtar to ~13%
faster than it (system time is also proportionally lower on bsdtar).
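
For anyone wanting to check the "Difference at 95.0% confidence" lines
by hand, here is a minimal sketch of the pooled-variance Student's t
calculation, fed with the summary rows from the table above.  The
function name and the small lookup table of critical values are
illustrative assumptions, not part of ministat; the critical values are
standard two-sided 95% t-table entries for the degrees of freedom that
occur in this message.

```python
import math

# Two-sided 95% critical values of Student's t from a standard t-table,
# keyed by degrees of freedom (only the cases used in this message).
T_CRIT_95 = {8: 2.306, 13: 2.160, 18: 2.101}


def pooled_difference(n1, mean1, sd1, n2, mean2, sd2):
    """Return (diff, half_width, pooled_s) for mean2 - mean1.

    Hypothetical helper: assumes both samples share a common variance,
    which is what a pooled-s Student's t comparison assumes.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_s = math.sqrt(pooled_var)
    half_width = T_CRIT_95[df] * pooled_s * math.sqrt(1.0 / n1 + 1.0 / n2)
    return mean2 - mean1, half_width, pooled_s


# Summary rows from the "Extracting with -P" table (x = gtar, + = bsdtar).
diff, half, s = pooled_difference(10, 25.816, 1.4355115,
                                  10, 22.409, 0.79712609)
print(f"difference {diff:.3f} +/- {half:.5f} (pooled s = {s:.5f})")
print(f"relative   {100 * diff / 25.816:.4f}% +/- {100 * half / 25.816:.5f}%")
```

The results should land within rounding of the figures reported above;
the interval excludes zero, which is why the difference counts as
significant at the 95% level.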

Extracting without -P makes no statistically significant difference
for gtar, but bsdtar is slightly slower than it is with -P:

x bsdtar-real-P
+ bsdtar-real-noP
+------------------------------------------------------------+
|                                x                  +   +    |
|x                           x   x       xxx  xx    +   +   +|
|                    |_____________A_M__________|   |__AM__| |
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10         20.41         23.14        22.535        22.409    0.79712609
+   5         23.47         23.92         23.68         23.65    0.18854708
Difference at 95.0% confidence
        1.241 +/- 0.794373
        5.53795% +/- 3.54488%
        (Student's t, pooled s = 0.671444)

It is still clearly faster than gtar (though not by as much):

x gtar-real-noP
+ bsdtar-real-noP
+------------------------------------------------------------+
|+                                                           |
|+    ++x   +                       x     xx                x|
||____A___|        |__________________A___M______________|   |
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5         23.75         25.93         25.18        24.996    0.79245189
+   5         23.47         23.92         23.68         23.65    0.18854708
Difference at 95.0% confidence
        -1.346 +/- 0.840049
        -5.38486% +/- 3.36073%
        (Student's t, pooled s = 0.57599)

Excellent work!

Kris


Received on Sun Mar 11 2007 - 23:10:28 UTC
