On Mon, Sep 27, 2010 at 09:52:21AM +0100, Robert Watson wrote:
> One reason I haven't merged the earlier patch is that many high-performance
> 10gbps (and even 1gbps) cards now support multiple input queues in hardware,
> meaning that they have already done the work distribution by the time the
> packets get to the OS. This makes the work distribution choice quite a bit
> harder: has a packet already been adequately balanced, or is further
> rebalancing required -- and if so, an equal distribution as selected in that
> patch might not generate well-balanced CPU load.
>
> Using just the RSS hash to distribute work, and single-queue input, I am able
> to get doubled end-host TCP performance with highly concurrent connections at
> 10gbps, which is a useful result. I have high on my todo list to get the
> patch you referenced into the mix as well and see how much the software
> distribution hurts/helps...

Thanks for the explanation.

> Since you've done some measurement, what was the throughput on that system
> without the patch applied, and how many cores?

The server has four cores. Topology:

<groups>
 <group level="1" cache-level="0">
  <cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
  <children>
   <group level="3" cache-level="2">
    <cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
   </group>
  </children>
 </group>
</groups>

Without the patch, only one netisr thread is utilized, at 100% CPU load,
with ~90% packet drop at a maximum of 80-90 Kpps. The throughput oscillated
between 2 MB/s and 30 MB/s.

Cores 0,2,3 - netisr with CPU binding
Core 1 - irq256 (bge0), bound via cpuset(1)

P.S.: bge(4) is patched for aggressive interrupt moderation. Without this
I get 11K+ int/sec and ~99% CPU usage in interrupt handling alone.

Received on Mon Sep 27 2010 - 13:33:52 UTC