Re: Improving the kernel/i386 timecounter performance (GSoC proposal)

From: Astrodog <astrodog_at_gmail.com>
Date: Sat, 28 Mar 2009 07:46:43 -0500
On Fri, Mar 27, 2009 at 7:25 PM, Jason Evans <jasone_at_freebsd.org> wrote:
> Robert Watson wrote:
>>
>> On Fri, 27 Mar 2009, Poul-Henning Kamp wrote:
>>>
>>> In message <alpine.BSF.2.00.0903272303040.12518_at_fledge.watson.org>,
>>> Robert Watson writes:
>>>
>>>> In which case user application threads will need to know their CPU [...]
>>>
>>> Didn't jemalloc solve that problem once already ?
>>
>> I think jemalloc implements thread-affinity for arenas rather than
>> CPU-affinity in the strict sense, but I may misread.
>
> CPU affinity is of limited use to malloc unless it can safely pin threads to
> CPUs.  Unfortunately, malloc cannot muck with CPU affinity, since that's up
> to the application.  Therefore, as you say, jemalloc implements (dynamically
> balanced) arena affinity.
>
> It might work okay in practice to use the current CPU ID to decide which
> arena to use, if the scheduler does not often migrate running processes.  I
> haven't explored that possibility though, since the infrastructure for
> cheaply querying the CPU ID doesn't currently (to my knowledge) exist.
>
> Jason

Hopefully, this is a more reasonable CC list, yet will still get to everyone...

First, re: scottl's idea of creating pages on fork: I might be able to
do that. I'll give it a shot when I get back to my machine at home and
let you know whether it works or just blows up in my face and causes
the usual brain melt I get when I poke at VM stuff.

As far as thread CPU affinity goes, as I understand things, this is
implemented in sched_ule, and one could certainly make a version of
malloc that takes advantage of it. However, locking a thread to a CPU
has some pretty significant side effects. Even if you only lock
running/runnable threads to a CPU, you could end up with horribly
unbalanced scheduling that depends entirely on the load of the machine
at the moment the threads were started and pinned, and that the
scheduler cannot rebalance, even on a system with only moderate load.
That would probably hurt performance more than most things one could
do trying to get an accurate, fast timer. If nothing else, it'd be a
nightmare on machines with intermittent high load, and it'd produce
fairly inconsistent performance.
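
To put rough numbers on the imbalance concern, here is a toy sketch.
It is not scheduler code: NCPU, max_load() and balanced_max_load() are
names made up for illustration, and "load" is just a count of runnable
threads per CPU. It compares the busiest CPU when threads are stuck
where they were pinned against the best case if they could migrate.

```c
#include <stddef.h>

#define NCPU 4  /* hypothetical CPU count, for illustration only */

/* Busiest CPU when threads cannot move: the largest per-CPU count. */
static int max_load(const int load[NCPU])
{
    int max = 0;
    for (size_t i = 0; i < NCPU; i++)
        if (load[i] > max)
            max = load[i];
    return max;
}

/* Best case if threads may migrate freely: ceil(total / NCPU). */
static int balanced_max_load(const int load[NCPU])
{
    int total = 0;
    for (size_t i = 0; i < NCPU; i++)
        total += load[i];
    return (total + NCPU - 1) / NCPU;
}
```

Say eight threads were pinned while CPUs 2 and 3 happened to be busy,
so they all landed on CPUs 0 and 1, i.e. load = {4, 4, 0, 0}. Then
max_load() gives 4 while balanced_max_load() gives 2: once the other
load goes away, two CPUs sit idle and the pinned threads still run at
half the speed they could.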

For cheaply getting the current CPU ID: if there's actual demand for
this information in userland applications, it should be fairly easy to
add to the scheduler, assuming it doesn't already exist. If JeffR et
al. don't have time, let me know and I'll crank out a patch.
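
For what it's worth, the consumer side that Jason describes (use the
current CPU ID to pick an arena) would be tiny once such a query
exists. A minimal sketch, assuming the CPU ID arrives from some cheap
query that does not exist today; arena_for_cpu() is a made-up name and
this is not how jemalloc currently chooses arenas:

```c
#include <stddef.h>

/*
 * Hypothetical helper: map a CPU ID to an arena index.
 * cpu_id is assumed to come from a cheap userland query (not shown,
 * since no such interface is confirmed to exist). The modulo keeps
 * the index in range when there are fewer arenas than CPUs.
 */
static size_t arena_for_cpu(unsigned int cpu_id, size_t narenas)
{
    if (narenas == 0)
        return 0;   /* no arenas configured: fall back to index 0 */
    return (size_t)cpu_id % narenas;
}
```

Note that the answer can be stale by the time it is used, since the
thread may migrate right after the query. That only costs some
locality, not correctness, as long as the arenas are still locked.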

--- Harrison Grundy
Received on Sat Mar 28 2009 - 11:46:44 UTC
