Re: Why we don't use bzip2 in sysinstall/rescue?

From: Peter Jeremy <peterjeremy_at_optushome.com.au>
Date: Mon, 20 Aug 2007 20:34:24 +1000
On 2007-Aug-19 16:46:10 -0700, Jeff Roberson <jroberson_at_chesapeake.net> wrote:
>I tried this on my 1.8ghz pentium M laptop with 5.6MB of jpg data.
>
>I did:
>
>tar cvf foo.tar foo
>cat foo.tar >> /dev/null
>time bzip2/gzip foo.tar
>
>I removed and recreated the tar each time.  The cat was to make sure it was 
>in cache, although it certainly was from the creation step before.
>
>Anyway, the results are:
>
>bzip2
>2.452u 0.026s 0:07.65 32.2% 92+3227k 5+43io 0pf+0w 1849c/6w
>
>gzip
>0.539u 0.020s 0:01.75 31.4% 109+3268k 2+44io 0pf+0w 493c/3w

I don't believe this is a reasonable test because:
1) You are measuring compression time, whilst it's decompression time
   that is relevant to installation.
2) JPEG images are already compressed, so they should not compress
   much further, and they are not representative of the type of data
   in a FreeBSD release.
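
A minimal sketch of what the two points above imply for a test harness:
time decompression rather than compression, and use compressible input.
It uses Python's standard gzip and bz2 modules rather than the gzcat and
bzcat binaries, and make_sample() is a hypothetical stand-in for release
data, so the absolute numbers will not match a real installation.

```python
import bz2
import gzip
import time


def make_sample(n_lines=20000):
    # Hypothetical stand-in for release data: repetitive, text-like bytes.
    return b"".join(
        b"line %d: /usr/src/sys/kern/file_%d.c\n" % (i, i % 97)
        for i in range(n_lines)
    )


def time_decompress(decompress, blob, repeats=3):
    """Return (best wall-clock seconds, decompressed bytes) over several runs."""
    best = float("inf")
    out = b""
    for _ in range(repeats):
        start = time.perf_counter()
        out = decompress(blob)
        best = min(best, time.perf_counter() - start)
    return best, out


if __name__ == "__main__":
    sample = make_sample()
    # Compression happens once, outside the timed region.
    blobs = {
        "gzip": (gzip.compress(sample), gzip.decompress),
        "bzip2": (bz2.compress(sample), bz2.decompress),
    }
    for name, (blob, decompress) in blobs.items():
        seconds, out = time_decompress(decompress, blob)
        assert out == sample  # the round trip must be lossless
        print(f"{name}: {len(blob)} bytes compressed, {seconds:.4f}s to decompress")
```

Taking the best of several runs reduces noise from other activity on the
machine; the data is already in memory, so I/O is not being measured.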

I've tried what I believe is a more reasonable benchmark on an
Athlon XP-1800, running a recent 7-CURRENT using all the installation
images in 6.2-RELEASE-i386-disk1.iso.

I concatenated all the 6.2-RELEASE/*/*.?? parts into */*.tgz files as
well as copying ports.tgz (a total of 31 files).  I also decompressed
each file and recompressed it into a bzip2 file.  The total sizes
were:
*/*.tbz: 237717490
*/*.tgz: 281754511

Like you, I used "cat */*.t{g,b}z >/dev/null" to cache the files
and used systat to verify that they were cached.

Timing the gzcat and bzcat runs gives:
gzcat -v */*.tgz > /dev/null  12.01s user 0.88s system 98% cpu 13.115 total
gzcat -v */*.tgz > /dev/null  11.95s user 0.95s system 98% cpu 13.124 total
gzcat -v */*.tgz > /dev/null  11.96s user 0.91s system 98% cpu 13.092 total
bzcat -v */*.tbz > /dev/null  153.29s user 3.43s system 98% cpu 2:39.03 total
bzcat -v */*.tbz > /dev/null  153.32s user 3.26s system 98% cpu 2:39.14 total
bzcat -v */*.tbz > /dev/null  153.16s user 3.48s system 98% cpu 2:39.02 total

This is roughly 12:1 slower for bzcat in wall-clock time (nearly 13:1
in user CPU time), with a size reduction of about 16%.
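
The arithmetic behind that summary can be checked with a short sketch.
The figures are copied from the sizes and timings quoted in this
message; the variable and function names are only illustrative.

```python
# Wall-clock and user CPU times in seconds, from the runs quoted above
# (2:39.03 is 159.03 s, and so on).
GZ_WALL = [13.115, 13.124, 13.092]
BZ_WALL = [159.03, 159.14, 159.02]
GZ_USER = [12.01, 11.95, 11.96]
BZ_USER = [153.29, 153.32, 153.16]

# Total compressed sizes from the listing above.
TGZ_SIZE = 281754511
TBZ_SIZE = 237717490


def mean(values):
    return sum(values) / len(values)


def summarize():
    """Return slowdown ratios (bzcat / gzcat) and the bzip2 size saving."""
    return {
        "wall_ratio": mean(BZ_WALL) / mean(GZ_WALL),
        "user_ratio": mean(BZ_USER) / mean(GZ_USER),
        "size_reduction_pct": 100.0 * (1.0 - TBZ_SIZE / TGZ_SIZE),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.2f}")
```

The wall-clock ratio comes out a little above 12, the user-time ratio a
little under 13, and the size saving between 15% and 16%.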

As for the CPU vs I/O tradeoff, I believe that gzcat will be I/O bound
whilst bzcat will be CPU bound in most situations, though I haven't
actually verified this.
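
One rough way to put numbers on that tradeoff, using only the figures in
this message, is to work out how fast each tool consumed its compressed
input while running from cache at about 98% CPU. This is a sketch, not a
measurement of any real install medium, and the names are illustrative.

```python
# Total compressed sizes and mean wall-clock times (seconds) from the
# runs quoted earlier in this message.
TGZ_SIZE = 281754511
TBZ_SIZE = 237717490
GZ_WALL_MEAN = (13.115 + 13.124 + 13.092) / 3
BZ_WALL_MEAN = (159.03 + 159.14 + 159.02) / 3


def input_rate_mb_per_s(size, seconds):
    """Compressed input consumed per second, in units of 10**6 per second."""
    return size / seconds / 1e6


def cpu_limited_rates():
    return {
        "gzcat": input_rate_mb_per_s(TGZ_SIZE, GZ_WALL_MEAN),
        "bzcat": input_rate_mb_per_s(TBZ_SIZE, BZ_WALL_MEAN),
    }


if __name__ == "__main__":
    for tool, rate in cpu_limited_rates().items():
        print(f"{tool}: about {rate:.1f} MB/s of compressed input")
```

On this machine gzcat consumed roughly 21 to 22 MB/s of compressed input
and bzcat roughly 1.5 MB/s. A source that delivers data more slowly than
a tool's figure would leave that tool waiting on I/O; a faster source
would leave it CPU bound.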

-- 
Peter Jeremy

Received on Mon Aug 20 2007 - 08:34:37 UTC
