On Wed, Aug 15, 2007 at 11:04:29AM +0200, Erik Cederstrand wrote:

> 1) Which benchmarks would you like to see being run?
> 2) Which tests do you perform regularly, which the tracker could automate?
> 3) Which features in the web interface would you find most helpful?

Here's what Robert Watson last posted on this subject (on freebsd-arch_at_). I hope that he doesn't mind the re-post.

Date: Wed, 4 Jul 2007 12:58:44 +0100 (BST)
From: Robert Watson <rwatson_at_FreeBSD.org>
In-Reply-To: <20070704105525.GU45894_at_elvis.mu.org>
Message-ID: <20070704124833.W37059_at_fledge.watson.org>
References: <20070702230728.E552_at_10.0.0.1> <20070703181242.T552_at_10.0.0.1> <20070704105525.GU45894_at_elvis.mu.org>
Cc: arch_at_freebsd.org

I also worry about the narrowness of the benchmarking we're doing -- however, it's hardly new. We do best at optimizing where we have clearly defined targets and measures of performance. The four-times increase in MySQL select performance is a direct result of Kris taking on scalability measurement and helping developers with optimization ideas try them out, profile them, etc.

A point I've made at a number of devsummits and elsewhere is that what we really need now is more people to "take ownership" of the performance of workloads they care about. They don't need to be the people doing the optimizations, but if they can help manage outstanding patchsets, measure the change in performance over time, get involved in profiling, etc., then that will have a big effect on performance for the workload, as has happened with MySQL.

Here are some workloads I'd really like to see people take responsibility for:

- Flat-file Apache performance, perhaps with ApacheBench or another HTTP throughput measurement tool.

- Dynamic Apache performance, perhaps using some combination of Apache/PHP/MySQL.

- BIND query performance with a few realistic-looking workloads.

- PostgreSQL performance along the same lines as the current MySQL work. Kris has waved his hands a bit in this direction already, and much of the MySQL measurement work can be reused.

- Some sort of compiler/build test -- buildworld of HEAD tends to be highly variable over time as components change, compilers change, etc., but optimizing build performance still has a big benefit for developers. Perhaps how long it takes to do the post-buildtools part of buildworld for a fixed FreeBSD version.

- Network micro-benchmarks, including loopback TCP and UDP and multi-machine TCP and UDP, both single-stream and multi-stream.

- UI interactivity testing -- how long it takes to go from a simulated keypress on the keyboard device to the input reaching a program running in an xterm, and other related latency tests that will be affected by scheduling, IPC, and so on.

There seem to be two parts to owning a benchmark:

- Establishing baselines over time -- how do FreeBSD 4.8, 5.5, 6.0, 6.1, 6.2, 6-STABLE weekly, 7-CURRENT weekly, and maybe a Linux or NetBSD version perform for the workload using otherwise identical configuration?

- Measurement and feedback -- identifying bottlenecks, working with developers to measure the results of specific optimizations, etc., across the life cycle of a patch.

If Kris can motivate such a dramatic improvement in MySQL performance, it seems likely that people doing similar things with other workloads could have similar effects. And, as you say, breadth is really important -- tuning the system for MySQL is very important, but has it generally hurt or helped other workloads?
In most cases, I'd expect the work to date to have helped, because it involved lowering overhead, etc. However, when we get into schedulers, space/time trade-offs, and so on, that balance will become harder to strike.

Robert N M Watson
Computer Laboratory
University of Cambridge
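To make the "establishing baselines over time" part concrete for the tracker discussion, here is a minimal sketch of the kind of harness a workload owner could automate: run the same benchmark on a set of hosts (one per FreeBSD or comparison version) and append a dated median result to a CSV. Everything in it is an assumption for illustration only -- the host names, the /usr/local/bin/run-httpbench wrapper (standing in for ApacheBench or whatever tool the owner picks), and the idea that the benchmark prints a single throughput number on stdout. It is not an existing FreeBSD tool.

#!/usr/bin/env python3
# Sketch of a baseline-tracking harness (illustrative only; all names
# below are hypothetical placeholders, not real infrastructure).

import csv
import statistics
import subprocess
import time

# One test host per version being tracked (hypothetical host names).
HOSTS = {
    "freebsd-6.2": "bench-62.example.org",
    "freebsd-7-current": "bench-7c.example.org",
}

BENCH_CMD = "/usr/local/bin/run-httpbench"  # hypothetical wrapper script
RUNS = 5                                    # repeat to smooth out noise


def run_once(host):
    """Run the benchmark on one host via ssh; it prints one number."""
    out = subprocess.run(
        ["ssh", host, BENCH_CMD],
        check=True, capture_output=True, text=True,
    ).stdout
    return float(out.strip())


def main():
    with open("baselines.csv", "a", newline="") as f:
        writer = csv.writer(f)
        for version, host in HOSTS.items():
            results = [run_once(host) for _ in range(RUNS)]
            # Record the median so a single bad run doesn't skew the baseline.
            writer.writerow([time.strftime("%Y-%m-%d"), version,
                             statistics.median(results)])


if __name__ == "__main__":
    main()

Run weekly from cron, this accumulates one dated number per version per workload in baselines.csv, which is enough to plot a per-version baseline over time and spot regressions between -STABLE and -CURRENT snapshots.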