On Fri, May 06, 2005 at 11:35:29AM -0700, Kris Kennaway wrote:
> Here are my benchmark numbers for parallel tarball extraction
> with/without mpsafevfs on a 12-processor E4500 running up-to-date 6.0.
> Kernel was built without INVARIANTS and other debugging options,
> without ADAPTIVE_GIANT (which causes about a 200% performance penalty
> on system time in my testing, and has marginal impact on real or user
> time) and with 4BSD scheduler (ULE causes spontaneous reboots on this
> machine). The e4500 uses the esp SCSI controller, which runs
> without Giant.

I tried with ULE on a quad amd64 machine, which was stable enough to
perform the extraction tests.

Here is the data for one tarball extraction to md:

x real.one.4bsd
+ real.one.ule
[ministat distribution plot: x = real.one.4bsd, + = real.one.ule]
    N           Min           Max        Median           Avg        Stddev
x   9          2.43           2.5          2.46     2.4644444   0.022973415
+  11          2.15          2.18          2.16     2.1636364   0.010269106
Difference at 95.0% confidence
        -0.300808 +/- 0.0161686
        -12.2059% +/- 0.656073%
        (Student's t, pooled s = 0.0171217)

...so ULE is 12% faster at extracting a single tarball to md.

12 concurrent extractions:

x real.4bsd
+ real.ule
[ministat distribution plot: x = real.4bsd, + = real.ule]
    N           Min           Max        Median           Avg        Stddev
x  10         13.56         14.43         14.09         14.02     0.2641969
+  11         15.95         17.34         16.49     16.570909    0.40957184
Difference at 95.0% confidence
        2.55091 +/- 0.318571
        18.1948% +/- 2.27226%
        (Student's t, pooled s = 0.348356)

...but 18% slower with 12 concurrent extractions.
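For readers who want to check the "Difference at 95.0% confidence" lines, here is a minimal sketch of the pooled-variance Student's t calculation, using only the summary values printed in the tables above. The function name pooled_difference is hypothetical, and the critical value 2.101 (two-sided 95%, 18 degrees of freedom) is supplied by hand from a t table rather than computed; this is an illustration of the arithmetic, not ministat's own code.

```python
import math

def pooled_difference(n1, mean1, sd1, n2, mean2, sd2, t_crit):
    """Return (difference, CI half-width, pooled s) for sample 2 minus sample 1.

    Hypothetical helper: assumes equal variances (pooled estimate) and takes
    the Student's t critical value as an argument instead of computing it.
    """
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    diff = mean2 - mean1
    half_width = t_crit * pooled_sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    return diff, half_width, pooled_sd

# Single extraction: 4BSD (x) is sample 1, ULE (+) is sample 2.
# 2.101 is the two-sided 95% critical value for 9 + 11 - 2 = 18 degrees of freedom.
diff, half_width, pooled_sd = pooled_difference(
    9, 2.4644444, 0.022973415,
    11, 2.1636364, 0.010269106,
    t_crit=2.101,
)
print(f"{diff:.6f} +/- {half_width:.6f} (pooled s = {pooled_sd:.7f})")
print(f"{100 * diff / 2.4644444:.4f}% relative to the 4BSD mean")
```

The relative figure is the difference divided by the first sample's mean, which is how the -12.2% for the single extraction comes about.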

The effective concurrency under 4BSD is 2.11 (same asymptote as on the
12-processor sparc64, suggesting something universal like VFS locking
is the limitation) but under ULE it is only 1.57.

With 4 concurrent extractions (= # CPUs):

x real.4.4bsd
+ real.4.ule
[ministat distribution plot: x = real.4.4bsd, + = real.4.ule]
    N           Min           Max        Median           Avg        Stddev
x  10          4.65          5.44          5.17         5.157    0.22400645
+  10          5.41          5.87          5.69         5.657     0.1374409
Difference at 95.0% confidence
        0.5 +/- 0.174609
        9.69556% +/- 3.38587%
        (Student's t, pooled s = 0.185834)

ULE is still slower.

This suggests that at the present time (and apart from the known
instabilities) ULE may be better for filesystem performance on lightly
loaded systems, but it degrades worse than 4BSD under concurrent
filesystem load.

Kris
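
The post does not spell out how "effective concurrency" is computed. A minimal sketch, assuming it is the number of jobs times the mean single-extraction time, divided by the mean wall-clock time for the concurrent run; that ratio reproduces the 2.11 and 1.57 figures quoted above. The function name effective_concurrency is hypothetical.

```python
def effective_concurrency(n_jobs, single_time, concurrent_time):
    """Assumed definition: how many single-job runs' worth of work
    complete per unit of wall-clock time during the concurrent run."""
    return n_jobs * single_time / concurrent_time

# Mean real times (seconds) taken from the ministat tables in this post.
single = {"4BSD": 2.4644444, "ULE": 2.1636364}
twelve = {"4BSD": 14.02, "ULE": 16.570909}

results = {
    sched: effective_concurrency(12, single[sched], twelve[sched])
    for sched in single
}
for sched, value in results.items():
    print(f"{sched}: {value:.2f}")
```

Under this definition a perfectly scaling system would score the CPU count (4 on the quad amd64), and a fully serialized one would score 1.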