On Tue, 2 Mar 2004, Robert Watson wrote:

> On Mon, 1 Mar 2004, Robert Watson wrote:
>
> > FYI, I now have access to a build box at work with two Xeon 2.4GHz
> > processors, each with two logical CPUs, and 1GB of memory.  Here are the
> > buildworld times, with -DNORESCUE and -DNOPROFILE, 5.2.1-RELEASE
> > GENERICish kernel (no WITNESS, INVARIANTS):

[Reformatted without tabs and with columns aligned]

> >
> >               Real     User     Sys
> > default    2195.16  1717.69  467.78
> > -j 2       2003.20  2151.49  539.67
> > -j 4       1703.15  2485.99  654.00
> > -j 6       1645.34  2595.67  718.12
> > -j 8       1627.88  2618.15  743.53
>
> As a follow-up, this was with SCHED_4BSD, which is the default in 5.2.1.
> Here's 5.2.1 with SCHED_ULE on the same hardware:
>
>               Real     User     Sys
> default    2191.03  1722.31  455.82
> -j 2       1993.30  2154.71  528.67
> -j 4       1688.14  2493.55  646.69
> -j 6       1630.02  2597.88  706.06
> -j 8       1617.72  2619.99  737.98
>
> I should prefix a bit of interpretation by noting that SCHED_ULE has been
> changed substantially since 5.2.1.  What's interesting about these numbers
> is that in the non-parallel case (default), we see moderately better
> performance, and substantially less user or system time -- assuming the
> time measurements are consistent between the two schedulers.  At -j 2,
> we're paying a moderate overhead in wall time for using ULE, but seeing
> better utilization of resources.  At -j 4, we pass a threshold and are
> breaking about even, which we continue to do through -j 8.

Erm, the above shows that 5.2.1+ULE is faster in wall time in all cases,
with the largest benefits at -j4 and -j6, and the smallest benefits at
<default>.  The benefits are very small, however.  Not enough to justify
a new scheduler even on a machine that should benefit more than most from
better scheduling.

> So a couple of interesting questions to answer would be:
>
> (1) Are the utilization times between 4BSD and ULE directly comparable?
>     Do we believe that they are both accurate?

Real time is directly comparable, but the user/sys time split is difficult
to compare and not very important either.  The placement of page zeroing
probably affects the relative split more than anything (with fewer jobs,
the pagezero daemon gets more chances to run and it sometimes does useful
work that would otherwise be counted as system time).

> (2) If we reran these tests with 5.2-CURRENT, how would the numbers
>     change?

I would be surprised if they changed much.  buildworld is mostly a gcc
cpu hog benchmark, and about the only significant thing the kernel can
do to speed up gcc is to reduce its memory contention.

> (3) What is a nice rationalization for going from "using less resources"
>     to "slower compile".

Using tabs so that the columns are not lined up and the data is
misinterpreted ;-).

Bruce

Received on Tue Mar 02 2004 - 06:20:40 UTC
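As a quick check of the wall-time comparison above, the per-configuration
differences can be worked out from the two quoted tables.  This is only a
sketch: the names real_4bsd, real_ule and wall_time_savings are made up
for illustration, and the figures are the Real columns copied from the
quoted tables.

```python
# Real (wall-clock) buildworld times in seconds, copied from the quoted tables.
real_4bsd = {"default": 2195.16, "-j 2": 2003.20, "-j 4": 1703.15,
             "-j 6": 1645.34, "-j 8": 1627.88}
real_ule = {"default": 2191.03, "-j 2": 1993.30, "-j 4": 1688.14,
            "-j 6": 1630.02, "-j 8": 1617.72}


def wall_time_savings(base, other):
    """Return {config: (seconds saved, percent saved)} of `other` relative to `base`."""
    out = {}
    for cfg, t_base in base.items():
        saved = t_base - other[cfg]
        out[cfg] = (round(saved, 2), round(100.0 * saved / t_base, 2))
    return out


savings = wall_time_savings(real_4bsd, real_ule)
for cfg, (secs, pct) in savings.items():
    print(f"{cfg:8s} {secs:7.2f} s  {pct:5.2f} %")
```

With these figures the saving is positive in every row, largest at -j 6
and -j 4 (about 15 seconds each) and smallest for the default build (about
4 seconds).  Every row comes to less than 1% of the 4BSD wall time.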