On Friday 23 January 2004 07:34 am, Bruce Evans wrote:
> On Wed, 21 Jan 2004, John Baldwin wrote:
> > On Tuesday 20 January 2004 11:58 pm, Bruce Evans wrote:
> > > i386 (or equivalently, no special tuning) is the best default, at least
> > > in non-FPU-intensive applications. In my integer crunching
> > > application/benchmark (searching a game tree), it even gives better
> > > results than -mcpu=pentiumpro on a pentiumpro class machine (a 366MHz
> > > Celeron). -mcpu=athlon-xp gives even better results.
> > >
> > > All with -O3 -fomit-frame-pointer
> > > -mcpu=athlon-xp    48.42 real  47.31 user  0.41 sys
> > >                    51.22 real  50.10 user  0.30 sys
> > > -mcpu=i386         51.98 real  50.18 user  0.34 sys
> > > -mcpu=pentiumpro   56.38 real  55.26 user  0.34 sys
> > > -mcpu=pentium2     56.24 real  55.25 user  0.36 sys
> > > -mcpu=pentium3     56.59 real  55.25 user  0.40 sys
> > > -mcpu=pentium4     58.52 real  56.96 user  0.36 sys
> > > -mcpu=i486         79.17 real  77.69 user  0.32 sys
> > > -mcpu=i586         74.80 real  73.07 user  0.48 sys
> > >
> > > This is just one benchmark, chosen for its potential optimizability.
> > > I only did non-exhaustive benchmarks for the makeworld benchmark. I
> > > removed the -mpentiumpro change when I saw the kernel size bloat that
> > > it gave.
> >
> > Does -mcpu=athlon-xp perform worse than the default in other benchmarks
> > that you've run?
>
> I haven't run enough to be sure. It's hard to test all the combinations
> for long enough. Some quick tests with the cc1 application/benchmark:
>
> cc1 compiled with -O3 -fomit-frame-pointer, and:
>     -mcpu=i386 (code o3)
>     -mcpu=i486 (o4)
>     -mcpu=pentiumpro (op)
>     -mcpu=athlon-xp (oa)
> Times for the "all" part of "make obj; make depend; make all" starting
> with an empty object tree and source tree = src/bin on the Celeron and
> src/usr.sbin on the Athlon (it doesn't complete because it wants to
> link to never-installed unbuilt libraries, but it gets a fair way).
> Smallest real time for 2 runs:
>
> On a Celeron 400 with source tree src/bin:
>     o3: 121.94 real   97.14 user  19.94 sys
>     o4: 130.83 real  106.59 user  19.07 sys
>     oa: 122.69 real   97.58 user  19.39 sys
>     op: 124.01 real   99.54 user  19.56 sys
> All non-null -mcpu settings are pessimizations, with -mcpu=i486
> significantly bad and -mcpu=pentiumpro probably significantly bad.
> Optimizing the pentiumpro class machine as an athlon-xp works
> better (less worse here) than optimizing it as a pentiumpro in this
> benchmark too, but the differences are smaller.
>
> On an Athlon-XP1600 overclocked with source tree src/usr.sbin:
>     o3: 67.62 real  57.46 user   9.53 sys
>     o4: 69.09 real  57.65 user  10.20 sys
>     oa: 67.53 real  56.78 user   9.62 sys
>     op: 68.14 real  57.47 user   9.70 sys
> Most of the differences are too small to be significant. Optimizing
> the athlon-xp as an athlon-xp at least doesn't pessimize it.
>
> My integer-crunching benchmark shows similarly small differences on
> freefall, but that may be just because freefall's gcc is so old.

Hmm, well, I'm ok with dropping the mcpu=ppro from bsd.cpu.mk for the default
case then.

--
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

Received on Fri Jan 23 2004 - 10:25:24 UTC