Re: polling's future [was: Re: Dynamic Ticks/HZ]

From: Fabien Thomas <fabien.thomas_at_netasq.com> Date: Tue, 6 Nov 2012 14:16:29 +0100

On 6 Nov 2012, at 12:42, Andre Oppermann wrote:

> On 06.11.2012 12:02, Fabien Thomas wrote:
>>>> 
>>> 
>>> Hi Luigi,
>>> 
>>> do you agree that polling has outlived its usefulness in light
>>> of interrupt-moderating NICs and SMP complications/disadvantages?
>>> 
>> If you have only one interface, yes, polling is not really necessary.
>> 
>> If you have 10 interfaces, it is hard to find an interrupt moderation threshold
>> that does not saturate the system.
>> Polling at 8000 Hz is a lot better in that case with respect to the global interrupt level.
> 
> OK.  Is the problem the interrupt load itself, or the taskqueues?

Both. Interrupt load will be higher if you want to keep latency low, and a taskqueue
is just polling without global fairness (if you have 10 interfaces with 6 cores, this will
give you 60 taskqueues). If you poll 16 packets at a time from each interface,
processing is fairer.
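
A rough sketch of that fairness argument, in Python rather than kernel C. The function names and the dict-of-queues layout are made up for illustration and are not taken from the patch; only the 16-packet budget and the 10 x 6 example come from the text above.

```python
from collections import deque

PACKETS_PER_ROUND = 16  # per-interface budget per round, as in the text above


def poll_round(queues, budget=PACKETS_PER_ROUND):
    """Take at most `budget` packets from every interface queue, in order.

    `queues` maps an interface name to a deque of pending packets.
    Returns how many packets were handled per interface in this round.
    """
    handled = {}
    for name, q in queues.items():
        n = min(budget, len(q))
        for _ in range(n):
            q.popleft()  # stand-in for real packet processing
        handled[name] = n
    return handled


def taskqueue_count(interfaces, cores):
    """One taskqueue per (interface, core) pair, as in the 10 x 6 example."""
    return interfaces * cores
```

A busy interface cannot take more than its budget in one round, so a lightly loaded interface is still served on every pass.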

> 
>> The problem is that in its current state polling does not work well, and people remember
>> the good old days when polling was better.
> 
> Indeed.
> 
>> rstone_at_ and I have made some improvements to polling.
>> 
>> You can find a diff for 8.3, with updated Intel drivers, here:
>> http://people.freebsd.org/~fabient/polling/patch-pollif_8.3_11052012
>> 
>> - support multiqueue for ixgbe, igb, em.
>> - compat API for old drivers
>> - keep interrupt for link / status
>> - user core mapping / auto mapping
>> - deadline to keep cpu available
>> - integrated to netisr
>> - deferred packet injection with optional prefetching
> 
> This is a number of interesting but sometimes only tangentially
> related features.  Let's focus on the network CPU monopolization
> issue first.

This is what the deadline is for:
The deadline is the maximum time spent polling over the scheduling period, in percent.
The scheduling period is a fraction of the polling period (100 Hz by default).
Each round is measured to estimate the time a round takes (if some packets require crypto,
the load will increase, for example), and processing stops when the deadline is reached.
(If no other thread wants to run, the deadline is extended.)

Hope that makes it clearer.
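
A minimal sketch of the deadline logic described above, as a Python model rather than the kernel code. Several things here are assumptions and not taken from the patch: the function and parameter names, using the last measured round as the estimate for the next one, and extending the deadline up to the full scheduling period when nothing else wants to run.

```python
def run_scheduling_period(round_costs_us, sched_period_us, deadline_pct,
                          other_thread_runnable=lambda: True):
    """Run polling rounds until the deadline budget is used up.

    round_costs_us: per-round costs in microseconds (stand-in for timing
    real rounds). Returns (rounds_run, time_used_us).
    """
    budget_us = sched_period_us * deadline_pct / 100.0
    used_us = 0.0
    estimate_us = 0.0  # estimated cost of the next round
    rounds = 0
    for cost in round_costs_us:
        # Assumption: with no other runnable thread, the limit is extended
        # from the deadline budget to the whole scheduling period.
        limit_us = budget_us if other_thread_runnable() else sched_period_us
        if used_us + estimate_us > limit_us:
            break  # the next round is expected to overrun: stop here
        used_us += cost
        estimate_us = cost  # assumption: last round predicts the next one
        rounds += 1
    return rounds, used_us
```

With a 1000 us scheduling period and an 80 % deadline, rounds costing 100 us each stop after eight rounds when another thread is runnable, and a single expensive round (the crypto case) makes the loop stop earlier.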

Sample:
~$ sysctl kern.pollif
kern.pollif.map: 
kern.pollif.stats_clear: 0
kern.pollif.stats: 
Work queue 0:
CPU load      =   0 %
pass          = 80
run overflow  = 0
Interface ix1.0
  resched rx   = 0
Interface ix0.0
  resched rx   = 0

Work queue 1:
CPU load      =   0 %
pass          = 80
run overflow  = 0
Interface ix1.1
  resched rx   = 0
Interface ix0.1
  resched rx   = 0

Work queue 2:
CPU load      =   0 %
pass          = 80
run overflow  = 0
Interface ix1.2
  resched rx   = 0
Interface ix0.2
  resched rx   = 0

Work queue 3:
CPU load      =   0 %
pass          = 80
run overflow  = 0
Interface ix1.3
  resched rx   = 0
Interface ix0.3
  resched rx   = 0

kern.pollif.deadline: 80
kern.pollif.register_check: 10
kern.pollif.sched_div: 80
kern.pollif.packet_per_round: 16
kern.pollif.handlers: 8
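
For anyone who wants to watch these counters from a script, here is a small sketch that parses the "Work queue" blocks of the output above into a dict. It assumes only the text layout shown in this sample; the function name is made up, and the format may differ in other versions of the patch.

```python
import re


def parse_pollif_stats(text):
    """Parse the 'Work queue N:' blocks of the kern.pollif.stats text."""
    queues = {}
    current = None  # work queue dict being filled
    iface = None    # interface that following 'key = value' lines belong to
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Work queue (\d+):$", line)
        if m:
            current = {"interfaces": {}}
            queues[int(m.group(1))] = current
            iface = None
            continue
        if current is None:
            continue  # skip anything before the first work queue
        m = re.match(r"Interface (\S+)$", line)
        if m:
            iface = m.group(1)
            current["interfaces"][iface] = {}
            continue
        m = re.match(r"(.+?)\s*=\s*(\d+)\s*%?$", line)
        if m:
            key, value = m.group(1).strip(), int(m.group(2))
            if iface is not None:
                current["interfaces"][iface][key] = value
            else:
                current[key] = value
    return queues
```

Lines without an "=" sign, such as the other kern.pollif.* values, are ignored.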

> 
>> Performance is on par with interrupts, but you can keep a system alive more easily
>> by accounting all network processing against the deadline (with direct dispatch).
> 
> Would you be willing to work out a solution with me using a load-aware
> taskqueue, as I proposed in a recent email to Luigi?  That way we
> wouldn't need special cases or features, and even a normal server under
> DDoS wouldn't go down.

The main problem with my current version is that it consumes a little CPU when
idle (99.8% idle according to top, < 0.5% measured with PMC using CPU_CLK_UNHALTED.THREAD_P).

Kickstarting the polling from an interrupt is a good idea to reduce that,
but I've never tested it, so why not.
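
One way the interrupt kickstart could look, as a toy state machine in Python. This is untested speculation, not something in the patch: the class, its method names, and the idle_rounds_before_sleep knob are all hypothetical.

```python
class KickstartedPoller:
    """Toy model: sleep on the RX interrupt, poll while there is traffic."""

    def __init__(self, idle_rounds_before_sleep=2):
        self.interrupt_enabled = True  # idle state: wait for the NIC
        self.polling = False
        self.idle_rounds = 0
        self.idle_limit = idle_rounds_before_sleep

    def on_interrupt(self):
        """First packet after idle: mask the interrupt and start polling."""
        if self.interrupt_enabled:
            self.interrupt_enabled = False
            self.polling = True
            self.idle_rounds = 0

    def poll(self, packets_available):
        """One polling round; returns True if the round did any work."""
        if not self.polling:
            return False  # idle: no CPU is spent polling
        if packets_available > 0:
            self.idle_rounds = 0
            return True
        self.idle_rounds += 1
        if self.idle_rounds >= self.idle_limit:
            # Nothing to do for a while: go back to waiting on the interrupt.
            self.polling = False
            self.interrupt_enabled = True
        return False
```

The point of the idle limit is to avoid bouncing between the two modes when traffic is bursty; what a sensible value would be is an open question.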

> 
> -- 
> Andre
>