Jeff Roberson <jroberson_at_chesapeake.net> wrote:

> On Fri, 31 Oct 2003, Bruno Van Den Bossche wrote:

[...]

> > I recently had to complete a little piece of software in a course on
> > parallel computing.  I've put it online[1] (we only had to write the
> > pract2.cpp file).  It calculates the inverse of a Vandermonde matrix and
> > allows you to spawn multiple slave-processes who each perform a part of
> > the work.  Everything happens in memory so
> > I've used it lately to test the different changes you made to
> > sched_ule.c and these last fixes do improve the performance on my dual
> > p3 machine a lot.
> >
> > Here are the results of my (very limited tests) :
> >
> > sched4bsd
> > ---
> > dimension  slaves  time
> > 1000       1       90.925408
> > 1000       2       58.897038
> >
> > 200        1       0.735962
> > 200        2       0.676660
> >
> > sched_ule 1.68
> > ---
> > dimension  slaves  time
> > 1000       1       90.951015
> > 1000       2       70.402845
> >
> > 200        1       0.743551
> > 200        2       1.900455
> >
> > sched_ule 1.70
> > ---
> > dimension  slaves  time
> > 1000       1       90.782309
> > 1000       2       57.207351
> >
> > 200        1       0.739998
> > 200        2       0.383545
> >
> >
> > I'm not really sure if this is very relevant to you, but from the
> > end-user point of view (me :-)) this does means something.
> > Thanks!
>
> I welcome the feedback, positive or negative, as it helps me improve
> things.  Thanks for the report!  Could you run this again under 4bsd and
> ULE with the following in your .cshrc:
>
> set time= ( 5 "%Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww %cc/%ww" )
>
> And then time ./testpract 200 2, etc.  This will give me a few hints about
> what's impacting your performance.

The program can run as a slave or as a master.  You run one master and
multiple slaves, and they all work on a piece of shared memory.  So I've
timed the individual processes, since the wrapper script test_pract2
does nothing more than launch a few processes in the background; I don't
think its timing output is very relevant.
Here's the result:

sched_4bsd 1.26

1000 1
master: 49.172u 0.187s 2:21.54 34.8% 15+10182k 0+0io 0pf+0w 5962c/65w
slave : 90.326u 0.250s 1:30.75 99.8% 15+168k 0+0io 0pf+0w 9156c/35w

1000 2
master: 49.113u 0.226s 1:49.94 44.8% 15+10181k 0+0io 0pf+0w 5942c/63w
slave1: 55.211u 0.326s 0:59.11 93.9% 15+166k 0+0io 0pf+0w 11129c/2224w
slave2: 54.897u 0.363s 0:58.62 94.2% 15+167k 0+0io 0pf+0w 7111c/6129w

200 1
master: 0.377u 0.007s 0:02.39 15.4% 15+589k 0+0io 0pf+0w 38c/13w
slave : 0.711u 0.031s 0:00.74 100.0% 15+169k 0+0io 0pf+0w 85c/1w

200 2
master: 0.376u 0.007s 0:02.87 12.8% 16+602k 0+0io 0pf+0w 41c/11w
slave1: 0.388u 0.006s 0:01.03 36.8% 18+201k 0+0io 0pf+0w 1245c/408w
slave2: 0.345u 0.038s 0:00.68 54.4% 34+158k 0+0io 0pf+0w 432c/1215w

sched_ule 1.75

1000 1
master: 49.097u 0.163s 2:21.32 34.8% 15+10186k 0+0io 0pf+0w 6197c/163w
slave : 90.157u 0.398s 1:30.82 99.6% 15+168k 0+0io 0pf+0w 11568c/49w

1000 2
master: 49.132u 0.164s 1:48.15 45.5% 15+10155k 0+0io 0pf+0w 6517c/276w
slave1: 55.634u 0.406s 0:57.52 97.4% 15+169k 0+0io 0pf+0w 12745c/9628w
slave2: 55.416u 0.391s 0:57.13 97.6% 15+168k 0+0io 0pf+0w 12448c/10063w

200 1
master: 0.369u 0.016s 0:02.52 14.6% 15+577k 0+0io 0pf+0w 92c/35w
slave : 0.690u 0.054s 0:00.74 100.0% 15+171k 0+0io 0pf+0w 147c/13w

200 2
master: 0.376u 0.007s 0:02.47 14.9% 15+589k 0+0io 0pf+0w 87c/21w
slave1: 0.331u 0.023s 0:00.70 50.0% 15+173k 0+0io 0pf+0w 466c/2135w
slave2: 0.304u 0.040s 0:00.39 87.1% 15+166k 0+0io 0pf+0w 412c/2119w

> >
> > [1] <http://users.pandora.be/bomberboy/mptest/final.tar.bz2>
> > It can be used by running testpract2 with two arguments, the dimension
> > of the matrix and the number of slaves.  example './testpract2 200 2'
> > will create a matrix with dimension 200 and 2 slaves.

-- 
Bruno

This fortune is inoperative.  Please try another.

Received on Sun Nov 02 2003 - 13:52:14 UTC
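
For anyone who wants to compare the per-process lines above mechanically,
here is a minimal sketch in Python that parses one line produced by the
tcsh `time` format requested in the quoted message
("%Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww %cc/%ww").  In that format, %U and
%S are user and system CPU seconds, %E is elapsed time, %P is the CPU
percentage, %X and %D are average text and data/stack sizes in KB, %I and
%O are input and output operations, %F is major page faults, %W is swaps,
%c is involuntary context switches and %w is voluntary context switches.
The names `parse_time_line`, `elapsed_seconds` and the dictionary keys are
hypothetical, chosen for this illustration only; they are not part of the
test program or of tcsh.

```python
import re

# One line of output from the tcsh "time" format used in this thread:
#   %Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww %cc/%ww
# A leading label such as "slave1:" is ignored because we use search().
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s\s+(?P<elapsed>[\d:.]+)\s+"
    r"(?P<cpu_pct>[\d.]+)%\s+(?P<text_kb>\d+)\+(?P<data_kb>\d+)k\s+"
    r"(?P<inputs>\d+)\+(?P<outputs>\d+)io\s+"
    r"(?P<faults>\d+)pf\+(?P<swaps>\d+)w\s+"
    r"(?P<invol_cs>\d+)c/(?P<vol_cs>\d+)w"
)


def elapsed_seconds(text):
    """Convert an elapsed-time field like '2:21.54' or '1:02:03' to seconds."""
    total = 0.0
    for part in text.split(":"):
        total = total * 60 + float(part)
    return total


def parse_time_line(line):
    """Return the fields of one timing line as a dict of numbers."""
    match = TIME_LINE.search(line)
    if match is None:
        raise ValueError("not a recognised time line: %r" % line)
    fields = match.groupdict()
    result = {
        "user": float(fields["user"]),          # %U, CPU seconds in user mode
        "sys": float(fields["sys"]),            # %S, CPU seconds in kernel mode
        "elapsed": elapsed_seconds(fields["elapsed"]),  # %E, wall clock
        "cpu_pct": float(fields["cpu_pct"]),    # %P
    }
    # The remaining fields are plain integer counters or sizes in KB.
    for key in ("text_kb", "data_kb", "inputs", "outputs",
                "faults", "swaps", "invol_cs", "vol_cs"):
        result[key] = int(fields[key])
    return result
```

Calling `parse_time_line` on each master and slave line gives a dictionary
per process, so the context-switch counters (`invol_cs`, `vol_cs`) and the
elapsed times of the two schedulers can be compared side by side.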