Re: CURRENT as gateway on not-so-fast hardware: where is a bottleneck?

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Mon, 20 Aug 2012 12:59:58 +0300
On 20.08.2012 11:32, Doug Barton wrote:
> On 08/15/2012 03:18, Alexander Motin wrote:
>> On 15.08.2012 03:09, Doug Barton wrote:
>>> On 08/14/2012 12:20 PM, Adrian Chadd wrote:
>>>> Would you be willing to compile a kernel with KTR so you can capture
>>>> some KTR scheduler dumps?
>>>>
>>>> That way the scheduler peeps can feed this into schedgraph.py (and you
>>>> can too!) to figure out what's going on.
>>>>
>>>> Maybe things aren't being scheduled correctly and the added latency is
>>>> killing performance?
>>>
>>> You might also try switching to SCHED_ULE to see if it helps.
>>>
>>> Although, in the last few months as mav has been converging the 2 I've
>>> started to see the same problems I saw on my desktop systems previously
>>> re-appear even using ULE. For example, if I'm watching an AVI with VLC
>>> and start doing anything that generates a lot of interrupts (like moving
>>> large quantities of data from one disk to another) the video and sound
>>> start to skip. Also, various other desktop features (like menus, window
>>> switching, etc.) start to take measurable time to happen, sometimes
>>> seconds.
>>>
>>> ... and lest you think this is just a desktop problem, I've seen the
>>> same scenario on 8.x systems used as web servers. With ULE they were
>>> frequently getting into peak load situations that created what I called
>>> "mini thundering herd" problems where they could never quite get caught
>>> up. Whereas switching to 4BSD the same servers got into high-load
>>> situations less often, and they recovered on their own in minutes.
>>
>> It is quite pointless to speculate without real info, such as the
>> KTR_SCHED traces mentioned above.
>
> I'm sorry, you're quite wrong about that. In the cases I mentioned, and
> in about 2 out of 3 of the cases where users reported problems and I
> suggested that they try 4BSD, the results were clear. This obviously
> points out that there is a serious problem with ULE, and if I were the
> one who was responsible for that code I would be looking at ways of
> helping users figure out where the problems are. But that's just me.

I am not saying anything bad about 4BSD. The choice is provided because
they are indeed different and neither is perfect. 4BSD also has problems.
What I would like to say is that if we want to improve the situation, we
need more detailed info than just a verbal description. I am not saying
that ULE is perfect. I went there because I've seen problems, and I am
still fixing some pieces. I am just trying to explain the described
behavior from what I know about it, hoping that it may help somebody to
set up some new experiments or try some tuning/fixing.

-- 
Alexander Motin
Received on Mon Aug 20 2012 - 08:00:04 UTC
