On Fri, 20 Apr 2012, K. Macy wrote:

> On Fri, Apr 20, 2012 at 4:44 PM, Luigi Rizzo <rizzo_at_iet.unipi.it> wrote:
>> The small penalty when flowtable is disabled but compiled in is
>> probably because the net.flowtable.enable flag is checked
>> a bit deep in the code.
>>
>> The advantage with non-connect()ed sockets is huge. I don't
>> quite understand why disabling the flowtable still helps there.
>
> Do you mean having it compiled in but disabled still helps
> performance? Yes, that is extremely strange.

This reminds me that when I worked on this, I saw very large throughput
differences (in the 20-50% range) as a result of minor changes in
unrelated code. I could get these changes intentionally by adding or
removing padding in unrelated unused text space, so the differences were
apparently related to text alignment. I thought I had some significant
micro-optimizations, but it turned out that they were acting mainly by
changing the layout in related used text space where it is harder to
control. Later, I suspected that the differences were more due to cache
misses for data than for text.

The CPU and its caching must affect this significantly. I tested on an
AthlonXP and Athlon64, and the differences were larger on the AthlonXP.
Both of these have a shared I/D cache so pressure on the I part would
affect the D part, but in this benchmark the D part is much more active
than the I part so it is unclear how text layout could have such a large
effect.

Anyway, the large differences made it impossible to trust the results of
benchmarking any single micro-benchmark. Also, ministat is useless for
understanding the results. (I note that luigi didn't provide any
standard deviations and neither would I. :-). My results depended on
the cache behaviour but didn't change significantly when rerun, unless
the code was changed.

Bruce

Received on Sat Apr 21 2012 - 04:34:23 UTC