Re: CURRENT as gateway on not-so-fast hardware: where is a bottleneck?

From: Doug Barton <dougb_at_FreeBSD.org>
Date: Mon, 20 Aug 2012 01:32:54 -0700
On 08/15/2012 03:18, Alexander Motin wrote:
> On 15.08.2012 03:09, Doug Barton wrote:
>> On 08/14/2012 12:20 PM, Adrian Chadd wrote:
>>> Would you be willing to compile a kernel with KTR so you can capture
>>> some KTR scheduler dumps?
>>>
>>> That way the scheduler peeps can feed this into schedgraph.py (and you
>>> can too!) to figure out what's going on.
>>>
>>> Maybe things aren't being scheduled correctly and the added latency is
>>> killing performance?
>>
>> You might also try switching to SCHED_ULE to see if it helps.
>>
>> Although, in the last few months as mav has been converging the 2 I've
>> started to see the same problems I saw on my desktop systems previously
>> re-appear even using ULE. For example, if I'm watching an AVI with VLC
>> and start doing anything that generates a lot of interrupts (like moving
>> large quantities of data from one disk to another) the video and sound
>> start to skip. Also, various other desktop features (like menus, window
>> switching, etc.) start to take measurable time to happen, sometimes
>> seconds.
>>
>> ... and lest you think this is just a desktop problem, I've seen the
>> same scenario on 8.x systems used as web servers. With ULE they were
>> frequently getting into peak load situations that created what I called
>> "mini thundering herd" problems where they could never quite get caught
>> up. Whereas switching to 4BSD the same servers got into high-load
>> situations less often, and they recovered on their own in minutes.
> 
> It is quite pointless to speculate without real info like the
> KTR_SCHED traces mentioned above.

I'm sorry, you're quite wrong about that. In the cases I mentioned, and
in about 2 out of 3 of the cases where users reported problems and I
suggested that they try 4BSD, the results were clear. This obviously
points out that there is a serious problem with ULE, and if I were the
one who was responsible for that code I would be looking at ways of
helping users figure out where the problems are. But that's just me.

> The main thing I've learned about schedulers: things there never work
> as you expect. There are too many factors and relations to predict
> behavior in every case.

In the web hosting case that I mentioned, I purposely kept every other
factor consistent; and changed only s/ULE/4BSD/. The results were both
clear and consistent.

> As for playing AVIs and using other GUIs, the key word here, and for
> ULE in general, is interactivity. ULE gives a huge boost to threads it
> counts as interactive.

I'm not using ULE. I haven't for over a year. Sorry if I wasn't clear.

> If somebody still wants an area for experiments, there is always some:
>  - if you want video player to not lag, set negative nice for it (ULE is
> not a magician to guess user wishes);

At the same time, I don't have these problems on my Linux systems, and I
don't need to adjust anything. Not to mention that given how web servers
are one of our main server implementations, the fact that we have what
seems to be a serious performance problem with our default scheduler in
that use case seems like an issue that we would want to address.

Doug

-- 

    I am only one, but I am one.  I cannot do everything, but I can do
    something.  And I will not let what I cannot do interfere with what
    I can do.
			-- Edward Everett Hale, (1822 - 1909)
Received on Mon Aug 20 2012 - 06:32:55 UTC
