Stephen Montgomery-Smith wrote:
> Steve Kargl wrote:
>> On Mon, Sep 15, 2008 at 07:36:04PM -0500, Stephen Montgomery-Smith wrote:
>>> ... and each thread is a loop of the form
>>>
>>> while (1) {
>>>   wait until told to start;
>>>   do massive amounts of floating point arithmetic (only additions and
>>>   multiplications) on large arrays;
>>>   tell the master process that you are done;
>>> }
>>>
>>>> Do you have about as many threads as processor or more?
>>> Both ways. The time difference between the two approaches is
>>> negligible.
>>>
>>
>> Are you using ULE? With my MPI applications, if the number of
>> launched processes exceeds the number of cpus by 1, ULE falls
>> through the floor. I have a nagging feeling that there is a problem
>> with cpu affinity.
>>
>> http://lists.freebsd.org/pipermail/freebsd-current/2008-July/086917.html
>>

Let me say a little bit more. I have this gut feeling that the problem
has a lot to do with cache management. My program has each thread
doing, in effect, huge matrix multiplications, each one working on its
own little bit. If a CPU core changes from one thread to another, it
then has to flush out the cache to RAM, and read a whole bunch of other
RAM into cache.

I have this sense that Linux and FreeBSD each have something in their
internals that figures this out, and after a while starts adjusting how
long it waits before switching from one process to another. But Linux
seems to learn this faster than FreeBSD.

But this is all pure speculation on my part, because I have very little
idea of how these internals work.

Stephen

Received on Tue Sep 16 2008 - 01:48:56 UTC
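For reference, the worker loop quoted above might look roughly like the
following with POSIX threads. This is only a sketch under assumptions: the
names (struct worker, worker_main, run_rounds, N) are hypothetical, a dot
product stands in for the "massive amounts of floating point arithmetic", and
none of it is taken from the actual program discussed in this thread. It does
not pin threads to CPUs, so where each worker runs is left to the scheduler.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define N 1024                  /* hypothetical per-thread array length */

struct worker {
    pthread_t       tid;
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int go;                     /* set by master: start a round */
    int done;                   /* set by worker: round finished */
    int quit;                   /* set by master: leave the loop */
    double a[N], b[N];          /* this worker's own slice of the data */
    double result;
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    for (;;) {
        /* wait until told to start */
        pthread_mutex_lock(&w->mu);
        while (!w->go && !w->quit)
            pthread_cond_wait(&w->cv, &w->mu);
        if (w->quit) {
            pthread_mutex_unlock(&w->mu);
            return NULL;
        }
        w->go = 0;
        pthread_mutex_unlock(&w->mu);

        /* floating point work: only additions and multiplications */
        double s = 0.0;
        for (size_t i = 0; i < N; i++)
            s += w->a[i] * w->b[i];

        /* tell the master that this round is done */
        pthread_mutex_lock(&w->mu);
        w->result = s;
        w->done = 1;
        pthread_cond_broadcast(&w->cv);
        pthread_mutex_unlock(&w->mu);
    }
}

/* Run `rounds` rounds on `nthreads` workers and return the sum of the
 * workers' results from the last round. Error checks on the pthread
 * calls are omitted to keep the sketch short. */
double run_rounds(int nthreads, int rounds)
{
    assert(nthreads > 0 && rounds > 0);

    struct worker *w = calloc((size_t)nthreads, sizeof *w);
    double total = 0.0;
    if (w == NULL)
        return -1.0;

    for (int t = 0; t < nthreads; t++) {
        pthread_mutex_init(&w[t].mu, NULL);
        pthread_cond_init(&w[t].cv, NULL);
        for (size_t i = 0; i < N; i++) {
            w[t].a[i] = 1.0;
            w[t].b[i] = 2.0;
        }
        pthread_create(&w[t].tid, NULL, worker_main, &w[t]);
    }

    for (int r = 0; r < rounds; r++) {
        total = 0.0;
        for (int t = 0; t < nthreads; t++) {    /* start every worker */
            pthread_mutex_lock(&w[t].mu);
            w[t].go = 1;
            pthread_cond_broadcast(&w[t].cv);
            pthread_mutex_unlock(&w[t].mu);
        }
        for (int t = 0; t < nthreads; t++) {    /* wait for each to finish */
            pthread_mutex_lock(&w[t].mu);
            while (!w[t].done)
                pthread_cond_wait(&w[t].cv, &w[t].mu);
            w[t].done = 0;
            total += w[t].result;
            pthread_mutex_unlock(&w[t].mu);
        }
    }

    for (int t = 0; t < nthreads; t++) {        /* shut the workers down */
        pthread_mutex_lock(&w[t].mu);
        w[t].quit = 1;
        pthread_cond_broadcast(&w[t].cv);
        pthread_mutex_unlock(&w[t].mu);
        pthread_join(w[t].tid, NULL);
        pthread_cond_destroy(&w[t].cv);
        pthread_mutex_destroy(&w[t].mu);
    }
    free(w);
    return total;
}
```

Calling run_rounds(nthreads, rounds) from a small main() and timing it with
nthreads equal to, and then one more than, the number of cores would be one
way to compare schedulers on this pattern. The arrays here are far too small
to stress the cache, so N would need to be much larger for a meaningful test.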