Re: Let's use gcc-4.2, not 4.1 -- OpenMP

From: Stefan Ehmann <shoesoft_at_gmx.net>
Date: Fri, 15 Dec 2006 19:14:53 +0100
On Friday 15 December 2006 12:50, Stefan Ehmann wrote:
> On Friday 15 December 2006 06:43, Scott Long wrote:
> > Steve Kargl wrote:
> > > On Fri, Dec 15, 2006 at 02:50:30PM +1030, Daniel O'Connor wrote:
> > >>On Friday 15 December 2006 05:50, Scott Long wrote:
> > >>>Yes, the industry moves fast, but that's no reason to fool ourselves
> > >>>into thinking that the FSF will support GCC 4.2 a day after they
> > >>> release 4.3 and start working on 4.4.  Your point above about the
> > >>> lifespan of FreeBSD 7.x is a valid one, and I agree that it should be
> > >>>a consideration.  Vendor support is a myth and should not be a
> > >>>consideration.
> > >>
> > >>Not to mention it is *trivial* to install a compiler using ports or
> > >> packages.
> > >>
> > >>If you are serious about high performance computing installing a new
> > >> compiler is about the lowest barrier you'll find.
> > >
> > > Actually, 4.1.x will produce much worse code than 3.4.6.
> > > You can search the gcc mail listings for extensive comparison
> > > by Clinton Whaley (the author of math/atlas) for details.
> >
> > Has this been fixed in GCC 4.2?  If the FSF claims to have fixed it,
> > has it been actually verified?  I thought that gcc 4 was supposed to
> > solve the world's problems with vectorization.
>
> I've been playing around with optimizations for a small cpu-intensive
> program (only integer, no FP) for a course some time ago and tested
> different gcc versions. gcc-3.4 (with -O3 -march=pentium4) won over gcc-4.0
> there.
>
> My new test setup:
> FreeBSD 6.2-RC1
> gcc version 3.4.6 [FreeBSD] 20060305 (base system)
> gcc version 4.1.2 20061013 (prerelease) (lang/gcc41 package)
> gcc version 4.2.0 20061014 (experimental) (lang/gcc42 package)
>
> CPU: AMD Athlon(TM) XP 2700+ (2166.44-MHz 686-class CPU)
> Instructions counted with
> pmcstat -C -p k7-retired-instructions
>
> Settings/Compiler           | gcc-3.4 | gcc-4.1 | gcc-4.2
> ----------------------------+---------+---------+---------
> -O2                         |  13.1bn |  13.8bn |  13.5bn
> -O2 -funroll-loops          |   9.6bn |   9.3bn |   9.2bn
> -O2 -march=athlon-xp -fun.. |   9.7bn |  10.6bn |  10.7bn
> -O3                         |  11.5bn |   9.5bn |   9.6bn
> -O3 -funroll-loops          |   8.4bn |   9.2bn |   9.4bn
> -O3 -march=athlon-xp -fun.. |   8.8bn |  10.6bn |  11.1bn
>
>
> I'm aware that testing with a single program is not too meaningful, but it
> might give a hint at least.

Ok, forget these numbers. I wanted to measure cycles, not instructions (since
fewer instructions don't necessarily mean less running time). Additionally, I
accidentally used -march=pentium4 instead of -march=athlon-xp.

hwpmc doesn't seem to provide cycle statistics, so I'm falling back to time 
(using user+sys). 20 runs, standard deviation is <= 0.01 for all values.
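For anyone who wants to repeat this kind of measurement, here is a rough sketch of the method described above: run the binary a number of times, take user+sys CPU time per run, and report mean and standard deviation. It is not the script used for the numbers in this mail. The function names and the ./bench-gcc42 path in the usage note are made up for illustration, and it assumes a Unix-like system where Python's resource module is available.

```python
import resource
import statistics
import subprocess
import sys


def cpu_time_once(argv):
    """Run argv once and return the user+sys CPU seconds it consumed."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    # subprocess.run waits for the child, so its usage is folded into
    # RUSAGE_CHILDREN by the time it returns.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return user + system


def benchmark(argv, runs=20):
    """Return (mean, sample standard deviation) of user+sys over `runs` runs."""
    samples = [cpu_time_once(argv) for _ in range(runs)]
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    # Usage: python bench.py ./some-binary [args...]
    mean, sd = benchmark(sys.argv[1:])
    print(f"user+sys: mean {mean:.2f}s, stddev {sd:.2f}s")
```

Invoked as, say, "python bench.py ./bench-gcc42" (hypothetical binary name), it prints one summary line per compiler/flag combination tested.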

Ok, another try that makes gcc-4.2 look quite good.

Settings/Compiler           | gcc-3.4 | gcc-4.1 | gcc-4.2
----------------------------+---------+---------+---------
-O2                         |   6.46s |   6.67s |   6.38s
-O2 -funroll-loops          |   4.44s |   4.16s |   4.02s
-O2 -march=athlon-xp -fun.. |   4.39s |   4.38s |   4.26s
-O3                         |   6.14s |   5.23s |   5.16s
-O3 -funroll-loops          |   4.24s |   4.87s |   4.95s
-O3 -march=athlon-xp -fun.. |   4.19s |   4.90s |   5.07s
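
To make the table easier to compare, the following small sketch computes the relative change of gcc-4.1 and gcc-4.2 against gcc-3.4 for each row and picks the fastest combination. The timings are copied from the table above; the flag labels are kept exactly as printed there (including the truncated "-fun.."), and the helper names are only illustrative.

```python
COMPILERS = ("gcc-3.4", "gcc-4.1", "gcc-4.2")

# user+sys seconds per row, in the order of COMPILERS (from the table above)
TIMINGS = {
    "-O2": (6.46, 6.67, 6.38),
    "-O2 -funroll-loops": (4.44, 4.16, 4.02),
    "-O2 -march=athlon-xp -fun..": (4.39, 4.38, 4.26),
    "-O3": (6.14, 5.23, 5.16),
    "-O3 -funroll-loops": (4.24, 4.87, 4.95),
    "-O3 -march=athlon-xp -fun..": (4.19, 4.90, 5.07),
}


def relative_change(baseline, other):
    """Fractional change of `other` versus `baseline` (negative = faster)."""
    return (other - baseline) / baseline


def fastest(timings):
    """Return (flags, compiler, seconds) of the lowest time in the table."""
    secs, flags, cc = min(
        (secs, flags, cc)
        for flags, row in timings.items()
        for cc, secs in zip(COMPILERS, row)
    )
    return flags, cc, secs


if __name__ == "__main__":
    for flags, row in TIMINGS.items():
        base = row[0]  # gcc-3.4 is the baseline
        deltas = ", ".join(
            f"{cc} {relative_change(base, secs):+.1%}"
            for cc, secs in zip(COMPILERS[1:], row[1:])
        )
        print(f"{flags:<28} {deltas}")
    print("fastest:", fastest(TIMINGS))
```

Negative percentages mean the newer compiler produced a faster binary than gcc-3.4 for that set of flags.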
Received on Fri Dec 15 2006 - 17:14:57 UTC
