Re: Improved multiprocessor usage on amd64

From: Dan Nelson <dnelson_at_allantgroup.com>
Date: Mon, 15 Sep 2008 23:11:43 -0500

In the last episode (Sep 15), Stephen Montgomery-Smith said:
> Stephen Montgomery-Smith wrote:
> > Steve Kargl wrote:
> >> On Mon, Sep 15, 2008 at 07:36:04PM -0500, Stephen Montgomery-Smith wrote:
> >>> ... and each thread is a loop of the form
> >>>
> >>> while (1) {
> >>>   wait until told to start;
> >>>   do massive amounts of floating point arithmetic (only additions and
> >>>   multiplications) on large arrays;
> >>>   tell the master process that you are done;
> >>> }
> >>>
> >>>> Do you have about as many threads as processors, or more?
> >>> Both ways.  The time difference between the two approaches is 
> >>> negligible.
> >>>
> >>
> >> Are you using ULE?  With my MPI applications, if the number of
> >> launched processes exceeds the number of cpus by 1, ULE falls
> >> through the floor.  I have a nagging feeling that there is a problem 
> >> with cpu affinity.
> >>
> >> http://lists.freebsd.org/pipermail/freebsd-current/2008-July/086917.html
> 
> Let me say a little bit more.
> 
> I have this gut feeling that the problem has a lot to do with cache 
> management.  My program has each thread doing, in effect, huge matrix 
> multiplications, each one working on its own little bit.  If a CPU 
> core switches from one thread to another, it then has to flush its 
> cache out to RAM and read a whole bunch of other data from RAM into cache.

You can try playing with the new cpuset functions in HEAD and 7-STABLE
to lock particular threads on certain CPUs.
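
As a rough sketch of what that pinning could look like from inside each
worker thread, something along these lines should work with
cpuset_setaffinity(2).  The helper names cpu_for_worker() and
pin_current_thread_to_cpu() are made up for illustration, the round-robin
assignment is just one plausible policy, and the non-FreeBSD branch (Linux
sched_setaffinity) is an assumption added only so the file builds elsewhere.

```c
/* Must precede all #includes so glibc exposes the CPU_* macros (Linux only). */
#define _GNU_SOURCE
#include <assert.h>

#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
typedef cpuset_t affinity_mask_t;
#else
/* Assumption: any non-FreeBSD build is Linux with glibc. */
#include <sched.h>
typedef cpu_set_t affinity_mask_t;
#endif

/*
 * Hypothetical helper: map worker number `worker` onto one of `ncpus`
 * CPUs in round-robin order, so workers spread evenly across cores.
 */
int
cpu_for_worker(int worker, int ncpus)
{
    assert(worker >= 0 && ncpus > 0);
    return (worker % ncpus);
}

/*
 * Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure (errno is set by the system call,
 * or left untouched if `cpu` is out of range).
 */
int
pin_current_thread_to_cpu(int cpu)
{
    affinity_mask_t mask;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return (-1);

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

#ifdef __FreeBSD__
    /* With CPU_WHICH_TID, an id of -1 means the calling thread. */
    return (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
        sizeof(mask), &mask));
#else
    /* On Linux, pid 0 means the calling thread. */
    return (sched_setaffinity(0, sizeof(mask), &mask));
#endif
}
```

Each worker would call pin_current_thread_to_cpu(cpu_for_worker(i, ncpus))
once at the top of its thread function, before entering the while (1) loop.
To experiment without touching the code, cpuset(1) can restrict a whole
process to a list of CPUs, e.g. "cpuset -l 0-3 ./prog", though that does not
pin individual threads to individual cores.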

-- 
	Dan Nelson
	dnelson_at_allantgroup.com
Received on Tue Sep 16 2008 - 02:32:30 UTC
