Process timing issue

From: Jerome Flesch <jerome.flesch_at_netasq.com>
Date: Mon, 21 Feb 2011 17:24:05 +0100
Hello,

While investigating a timing issue with one of our program, we found out 
something weird: We've written a small test program that just calls 
clock_gettime() a lot of times and checks that the time difference 
between calls makes sense. In the end, it seems it doesn't always do.

Calling twice in a row clock_gettime() takes usually less than 1ms. But 
with an average load, about 1 time in 200000, more than 10ms are spent 
between both calls for no apparent reason. According to our tests, when 
it happens, the time between both calls can go from few milliseconds to 
many seconds (our best score so far is 10 seconds :). Same goes for 
gettimeofday().

To reproduce this issue, we use the small test program joined to this 
mail and openssl speed test commands (usually 'openssl speed -elapsed 
dsa2048'). It only appears if one of these openssl speed test run on the 
same core than our test progam, so we have to start as many openssl 
instances as there are cores on the test computer.

For instance:
- FreeBSD 7.3 + Core 2 duo
- FreeBSD 7.4/STABLE + Core 2 duo
- FreeBSD 8.1 + Core 2 duo
- FreeBSD 8.2-RC3 + Core 2 duo (clean setup)
- FreeBSD 9.0/CURRENT + Core 2 duo (clean setup, checked out last Friday)
      On all these setups, with 2 instances of openssl speed tests, on 
average, for 2000000*2 calls to clock_gettime(), we got about 10 pauses 
of between 100 and 300ms

- FreeBSD 8.2-RC1 + Xeon W3680 (6 cores):
      With 12 instances of openssl speed tests, for 10000000*2 calls to 
clock_gettime(), we got pauses of between 300ms and 10s.

- FreeBSD 6.3 + VIA C3:
      With 1 instance of openssl, we got pauses of between 70ms and 300ms

- FreeBSD 7.3 + VIA C7:
      With 1 instance of openssl, we got pauses of between 150ms and 1s 
(Note: this computer may be slightly more loaded then the previous one)


For comparison purpose, we also tried on 2 Linux systems:
- Linux 2.6.32.15 (Via nano, with a bunch of services running on it):
    - 1 openssl speed tests + 2000000 iterations: max 10ms
    - 10 openssl speed tests + 2000000 iterations: max 44ms
- Linux 2.6 (Core i7):
    - 10 openssl speed tests + 2000000: max 1ms


We tried setting the test program to the highest priority possible 
(rtprio(REALTIME, RTP_PRIO_MAX)) and it doesn't seem to change a thing.


Does anyone know if there is a reason for this behavior ? Is there 
something that can be done to improve things ?



Received on Mon Feb 21 2011 - 15:43:29 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:11 UTC