Re: CURRENT as gateway on not-so-fast hardware: where is a bottleneck?

From: Doug Barton <dougb_at_FreeBSD.org>
Date: Mon, 20 Aug 2012 03:25:37 -0700
On 08/20/2012 02:59, Alexander Motin wrote:
> On 20.08.2012 11:32, Doug Barton wrote:
>> On 08/15/2012 03:18, Alexander Motin wrote:
>>> On 15.08.2012 03:09, Doug Barton wrote:
>>>> On 08/14/2012 12:20 PM, Adrian Chadd wrote:
>>>>> Would you be willing to compile a kernel with KTR so you can capture
>>>>> some KTR scheduler dumps?
>>>>>
>>>>> That way the scheduler peeps can feed this into schedgraph.py (and you
>>>>> can too!) to figure out what's going on.
>>>>>
>>>>> Maybe things aren't being scheduled correctly and the added latency is
>>>>> killing performance?
>>>>
>>>> You might also try switching to SCHED_ULE to see if it helps.
>>>>
>>>> Although, in the last few months as mav has been converging the 2 I've
>>>> started to see the same problems I saw on my desktop systems previously
>>>> re-appear even using ULE. For example, if I'm watching an AVI with VLC
>>>> and start doing anything that generates a lot of interrupts (like moving
>>>> large quantities of data from one disk to another) the video and sound
>>>> start to skip. Also, various other desktop features (like menus, window
>>>> switching, etc.) start to take measurable time to happen, sometimes
>>>> seconds.
>>>>
>>>> ... and lest you think this is just a desktop problem, I've seen the
>>>> same scenario on 8.x systems used as web servers. With ULE they were
>>>> frequently getting into peak load situations that created what I called
>>>> "mini thundering herd" problems where they could never quite get caught
>>>> up. Whereas switching to 4BSD the same servers got into high-load
>>>> situations less often, and they recovered on their own in minutes.
>>>
>>> It is quite pointless to speculate without real info like the
>>> KTR_SCHED traces mentioned above.
>>
>> I'm sorry, you're quite wrong about that. In the cases I mentioned, and
>> in about 2 out of 3 of the cases where users reported problems and I
>> suggested that they try 4BSD, the results were clear. This obviously
>> points out that there is a serious problem with ULE, and if I were the
>> one who was responsible for that code I would be looking at ways of
>> helping users figure out where the problems are. But that's just me.
> 
> I am not telling anything bad about 4BSD.

Yes, I get that, but thanks for making it clear.

> Choice is provided because
> they are indeed different and none is perfect.

... which is why I'm asking you to stop making them more alike until
we get a better idea of what the issues are.

> What I would like to say is that if we want to improve situation, we
> need more detailed info than just verbal description.

And what I'm saying is that the only realistic way you're going to get
the information you need is to make it easier for users to give it to
you. I don't know what form that will need to take; I don't know
anything about schedulers.

> I am not telling
> that ULE is perfect. I went there because I've seen problems, and I am
> still fixing some pieces. I am just trying to explain described behavior
> from the point of my knowledge about it, hoping that it may help
> somebody to set up some new experiments or try some tuning/fixing.

Yes, I think it's great that you're doing this work. I'm glad to see
that someone is improving ULE. It clearly needs it. :)

Doug

-- 

    I am only one, but I am one.  I cannot do everything, but I can do
    something.  And I will not let what I cannot do interfere with what
    I can do.
			-- Edward Everett Hale, (1822 - 1909)
Received on Mon Aug 20 2012 - 08:25:38 UTC
