Re: Is kern.sched.preempt_thresh=0 a sensible default?

From: Stefan Esser <se_at_freebsd.org>
Date: Thu, 5 Apr 2018 14:31:24 +0200
On 04.04.18 at 18:45, Andriy Gapon wrote:
> On 04/04/2018 16:19, Stefan Esser wrote:
>> I have identified the cause of the extremely low I/O performance (2 to 6 read
>> operations scheduled per second).
>>
>> The default value of kern.sched.preempt_thresh=0 does not give any CPU to the
>> I/O bound process unless a (long) time slice expires (kern.sched.quantum=94488
>> on my system with HZ=1000) or one of the CPU bound processes voluntarily gives
>> up the CPU (or exits).
>>
>> Any non-zero value of preempt_thresh lets the system perform I/O in parallel
>> with the CPU bound processes, again.
> 
> Let me guess... you have a custom kernel configuration and, unlike GENERIC
> (assuming x86), it does not have 'options PREEMPTION'?

Yes, thank you for pointing that out!!!

I used to have PREEMPTION and FULL_PREEMPTION in my kernel configuration,
and apparently have deleted both options when only FULL_PREEMPTION was
supposed to go ...


After looking at sched_ule.c and top/machine.c, it appears that the value
of preempt_thresh corresponds to the PRI value as shown by top (or ps -l)
plus PZERO, which is calculated as (PRI_MIN_KERN=80) + 20, i.e. 100.
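
The mapping described above can be written down as a short C sketch. The
constants are the ones quoted in this mail and are assumed to match the
kernel headers of the system in question; the function name is hypothetical
and exists only for illustration.

```c
#include <stdio.h>

/* Values as quoted in this mail; assumed to match the kernel sources. */
#define PRI_MIN_KERN 80
#define PZERO        (PRI_MIN_KERN + 20)   /* 100 */

/*
 * Hypothetical helper: convert a PRI value as displayed by top (or ps -l)
 * into the priority scale that kern.sched.preempt_thresh is expressed in.
 */
int
displayed_to_thresh_scale(int displayed_pri)
{
	return (displayed_pri + PZERO);
}

/* Hypothetical helper: print one line of the mapping. */
void
print_mapping(const char *label, int displayed_pri)
{
	printf("%-24s PRI %3d -> threshold scale %3d\n",
	    label, displayed_pri, displayed_to_thresh_scale(displayed_pri));
}
```

For example, a process that top shows with a PRI of 20 would sit at 120 on
the scale used by preempt_thresh.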

What I do not understand, though, is that the decision about a preemption
is only based on the calculated new priority of the thread, but not at all
on the priority of other running threads (except the idle thread).

On my system, a "real" batch job (i.e. one that does not voluntarily give
up the CPU due to I/O) seems to have a PRI value of 80 to 100 (growing
over time), while an interactive process has a PRI of 20 and a maximally
"niced" interactive process has a PRI of 52.

So, I'd expect a reasonable default value of preempt_thresh to be slightly
above 120 (e.g. 124), to prevent I/O heavy threads from stealing the CPU
from each other too often, and to prevent "niced" processes from doing the
same ...

The two values configured into the kernel (80 for PREEMPTION and 255 for
FULL_PREEMPTION) seem to be extremes, but something in between (e.g. 124)
is not offered. Such a value can only be configured via sysctl, and I have
not found any document, besides the kernel sources, that explains how the
threshold value corresponds to the PRI value ...
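
To make the comparison between 0, 80, 124 and 255 concrete, here is a
minimal sketch of the threshold rule as it is described in this mail. It is
a simplified model, not the kernel code: the function name is hypothetical,
PRI_MIN_IDLE=224 is an assumption about the start of the idle priority
range, and the real decision in sched_ule.c may contain further cases that
are not modelled here. Lower numbers mean higher priority.

```c
#include <stdbool.h>

/* Assumption: first priority of the idle class on this kernel version. */
#define PRI_MIN_IDLE 224

/*
 * Hypothetical model of the rule described above: a newly runnable thread
 * with priority new_pri preempts the running thread if the CPU is only
 * running an idle-class thread, or if new_pri is at or below (numerically)
 * the configured threshold. A threshold of 0 disables preemption of
 * non-idle threads.
 */
bool
would_preempt(int new_pri, int running_pri, int preempt_thresh)
{
	if (running_pri >= PRI_MIN_IDLE)
		return (true);		/* idle is always preempted */
	if (preempt_thresh == 0)
		return (false);		/* preemption disabled */
	return (new_pri <= preempt_thresh);
}
```

Under this model, and using the PRI values observed above shifted by PZERO
(interactive = 120, maximally niced interactive = 152, batch = 180 to 200),
a threshold of 124 lets the interactive thread preempt a batch job but not
the niced one, a threshold of 80 lets neither of them preempt, and a
threshold of 255 lets both preempt.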


Is PRI_MIN_KERN=80 really a good default value for the preemption threshold?

Regards, STefan
Received on Thu Apr 05 2018 - 10:40:05 UTC
