On Thu, Mar 12, 2015 at 1:36 PM, Mateusz Guzik <mjguzik_at_gmail.com> wrote:
> Workloads like buildworld and the like (i.e. a lot of forks + execs) run
> into very severe contention in vm, which is orders of magnitude bigger
> than anything else.
>
> As such your result seems quite suspicious.
>

You're right, I did mess up the testing somewhere (I have no idea how).
As you suggested, I switched to using a separate partition for the
objdir, and ran each build with a freshly newfsed filesystem. I scripted
it to be sure that I was following the same procedure with each run:

# Build known-working commit from head
git checkout 09be0092bd3285dd33e99bcab593981060e99058 || exit 1

for i in `jot 5`
do
    # Create a fresh fs for objdir
    sudo umount -f /usr/obj 2> /dev/null
    sudo newfs -U -j -L OBJ $objdev || exit 1
    sudo mount $objdev /usr/obj || exit 1
    sudo chmod a+rwx /usr/obj || exit 1

    # Ensure disk cache contains all source files
    git status > /dev/null

    /usr/bin/time -a -o $logfile make -s -j$(sysctl -n hw.ncpu) buildworld buildkernel
done

I tested on the original 12-core machine, as well as a 2 package x 8
core x 2 HTT (32 logical cores) machine that a co-worker was able to
lend me.

Unfortunately, the results show a performance decrease now.
It's almost 5% on the 32 core machine:

$ ministat -w 74 -C 1 12core/*
x 12core/orig.log
+ 12core/rmlock.log
+--------------------------------------------------------------------------+
|x xx x x + + + + +|
| |_________A__________| |_______________A___M__________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       2478.81       2487.74       2483.45      2483.652     3.2495646
+   5       2489.64       2501.67       2498.26      2496.832     4.7394694
Difference at 95.0% confidence
        13.18 +/- 5.92622
        0.53067% +/- 0.238609%
        (Student's t, pooled s = 4.06339)

$ ministat -w 74 -C 1 32core/*
x 32core/orig.log
+ 32core/rmlock.log
+--------------------------------------------------------------------------+
|x x + |
|x x x + ++ +|
||__AM| |_______AM_____| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       1067.97       1072.86       1071.29      1070.314     2.2238997
+   5       1111.22       1129.05        1122.3      1121.324     6.4046569
Difference at 95.0% confidence
        51.01 +/- 6.99181
        4.76589% +/- 0.653249%
        (Student's t, pooled s = 4.79403)

The difference is due to a significant increase in system time. Write
locks on an rmlock are extremely expensive (they involve an
smp_rendezvous), and the cost likely scales with the number of cores:

x 32core/orig.log
+ 32core/rmlock.log
+--------------------------------------------------------------------------+
|xxx x + +++ +|
||_MA__| |____MA______| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       5616.63        5715.7        5641.5       5661.72     48.511545
+   5       6502.51       6781.84        6596.5       6612.39     103.06568
Difference at 95.0% confidence
        950.67 +/- 117.474
        16.7912% +/- 2.07489%
        (Student's t, pooled s = 80.5478)

At this point I'm pretty much at an impasse. The real-time behaviour is
critical to me, but a 5% performance degradation isn't likely to be
acceptable to many people. I'll see what I can do with this.

Received on Fri Mar 13 2015 - 14:23:08 UTC
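
For anyone who wants to check the "Difference at 95.0% confidence" lines above by hand, here is a minimal sketch that re-derives them from the N, Avg and Stddev columns using the pooled-variance Student's t formula. It is not part of the original test procedure. The function name pooled_diff is hypothetical, and T_CRIT is an assumption: the two-sided 95% critical value for 8 degrees of freedom, copied from a standard t table rather than computed.

```sh
#!/bin/sh
# Sketch only: re-derive ministat's difference and confidence half-width
# from the summary columns it prints (N, Avg, Stddev).

# ASSUMPTION: two-sided 95% Student's t critical value for
# N1 + N2 - 2 = 8 degrees of freedom, taken from a t table.
T_CRIT=2.306

# usage: pooled_diff N1 AVG1 SD1 N2 AVG2 SD2   (hypothetical helper)
# prints: "<difference> +/- <half-width> (pooled s = <pooled stddev>)"
pooled_diff() {
    awk -v n1="$1" -v m1="$2" -v s1="$3" \
        -v n2="$4" -v m2="$5" -v s2="$6" -v t="$T_CRIT" 'BEGIN {
        df = n1 + n2 - 2
        # Pooled standard deviation of the two samples
        sp = sqrt(((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) / df)
        # Half-width of the confidence interval for the difference of means
        half = t * sp * sqrt(1 / n1 + 1 / n2)
        printf "%.2f +/- %.2f (pooled s = %.2f)\n", m2 - m1, half, sp
    }'
}

# 12-core real time, orig.log vs rmlock.log
pooled_diff 5 2483.652 3.2495646 5 2496.832 4.7394694
# 32-core real time, orig.log vs rmlock.log
pooled_diff 5 1070.314 2.2238997 5 1121.324 6.4046569
```

The hard-coded T_CRIT only holds for two samples of five runs each; a different number of runs needs a different table value.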