BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Thu, 14 Jun 2007 04:48:17 -0400

I have been benchmarking BIND 9.4.1 recursive query performance on an
8-core Opteron, using the resperf utility (dns/dnsperf in ports).  The
query data set was taken from www.freebsd.org's httpd-access.log with
some of the highly aggressive robot IP addresses pruned out (to avoid
huge numbers of repeated queries against a small subset of addresses,
which would skew the results).
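
As a rough illustration of that kind of data preparation, here is a
minimal Python sketch.  It is not the script that was used for these
tests.  It assumes the log is in common log format with the client IP
address as the first field, that the queries are reverse (PTR) lookups
of those addresses, and that a robot is any client with more than
max_hits requests; the function name build_queries and the max_hits
threshold are hypothetical.  The output lines use the "name type"
format that dnsperf/resperf read as query input.

```python
import ipaddress
from collections import Counter


def build_queries(log_lines, max_hits=1000):
    """Turn access-log client addresses into "name type" query lines.

    Assumptions (not from the original post): the client address is the
    first whitespace-separated field, queries are PTR lookups, and any
    client with more than max_hits requests is treated as a robot.
    """
    # Collect the first field of every non-empty line.
    clients = [line.split(None, 1)[0] for line in log_lines if line.strip()]
    hits = Counter(clients)

    queries = []
    for client in clients:
        if hits[client] > max_hits:
            continue  # prune highly aggressive clients entirely
        try:
            addr = ipaddress.ip_address(client)
        except ValueError:
            continue  # first field was a hostname, not an address
        # reverse_pointer yields e.g. "10.2.0.192.in-addr.arpa"
        queries.append(f"{addr.reverse_pointer} PTR")
    return queries
```

Pruning a client entirely, rather than capping its hit count, keeps
the remaining query order identical to the order in the log, which
seems closer to a realistic cache hit pattern.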

Testing was done over Broadcom gigabit Ethernet, with the two identical
machines cabled back-to-back.  named was restarted in
between tests to flush the cache.  resperf is designed to slowly
increase the query rate over a period of 60 seconds, up to a maximum
query rate, to determine the point at which the server starts to fall
behind on answering queries.  To more accurately measure this point,
in each case I tuned the maximum query rate so that the server fell
behind after around 50 seconds of load.
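
The tuning described above can be sketched with a little arithmetic.
The Python below is only a model, not part of resperf: it assumes the
offered query rate rises linearly from zero to the maximum over the
60 second ramp, and that the server has a single fixed capacity at
which it starts to fall behind.  The function names and the capacity
figure in the usage note are hypothetical, not measurements.

```python
def offered_rate(t, max_qps, ramp_s=60.0):
    """Offered query rate at time t, assuming a linear ramp from zero."""
    return max_qps * min(max(t, 0.0), ramp_s) / ramp_s


def saturation_time(capacity_qps, max_qps, ramp_s=60.0):
    """Time at which the offered rate reaches the server's capacity.

    Returns None if the ramp never reaches the capacity, i.e. the
    maximum query rate was set too low to find the knee.
    """
    if capacity_qps >= max_qps:
        return None
    return ramp_s * capacity_qps / max_qps


def max_rate_for_target(capacity_qps, target_s=50.0, ramp_s=60.0):
    """Maximum query rate that puts the knee at target_s seconds."""
    return capacity_qps * ramp_s / target_s
```

For a hypothetical server that saturates at 50000 queries/sec,
max_rate_for_target(50000) gives 60000.0, and
saturation_time(50000, 60000) gives 50.0.  Under this model, setting
the maximum only about 20% above the saturation point spends most of
the ramp on rates the server can still sustain, so the point where it
falls behind is sampled with a shallower slope than with a much
larger maximum.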

7.0 was used with up-to-date CVS sources and the SCHED_SMP (enhanced
SMP) scheduler, which is not yet committed but for which patches have
been posted by Jeff Roberson.  Actually this did not make much
difference compared to ULE on this workload, although I didn't graph
ULE.  BIND 9.4.1 from the base system was used for the threaded
version, and the bind94 port with threads disabled for comparison.
All debugging was disabled.

6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is
broken in 6.x).  I also tested a previously posted patch from rwatson,
which may be found here:

  www.watson.org/~robert/freebsd/netperf/20070311-sosend_dgram.diff

The results show several interesting things:

  http://obsecurity.dyndns.org/bind-resperf.png

Firstly, 7.0 beats 6.2 across the board, and has about 60% higher peak
performance.  BIND does not scale beyond 4 worker threads, but this
appears to be due to high contention on pthread mutexes in userland,
i.e. a BIND design problem rather than a FreeBSD kernel problem.
There is moderate UDP contention that, if it can be optimized, might
increase peak performance but is not likely to improve scaling.  For
now it appears that BIND 9.4 does not scale to >4 CPUs.

FreeBSD 6.2 seems to have at least two major performance bottlenecks,
due to file descriptor locking, and poor scaling of the old sx lock
implementation (both have been fixed in 7.0).  I actually don't know
what is using the sx locks so heavily in 6.2; there does not appear to
be an analogue in the 7.0 lock profile.  There are other optimizations
in 7.0 that are probably responsible for a smaller part of the
difference.

Robert's patch gives a modest boost to 6.2 at light concurrency but is
swamped by the other scaling problems at high load.  The graph should
not be interpreted as showing that this patch performs worse at high
load; the variance is so enormous that it is easily consistent with
the CVS data.

It would be interesting to test BIND performance when acting as an
authoritative server, which probably has very different performance
characteristics; the difficulty there is getting access to a suitably
interesting and representative zone file and query data.

Kris


Received on Thu Jun 14 2007 - 06:48:19 UTC
