2006/11/11, Suleiman Souhlal <ssouhlal_at_freebsd.org>: > Attilio Rao wrote: > > 2006/10/18, Attilio Rao <attilio_at_freebsd.org>: > > > >> In my P4 branch: //depot/user/attilio/attilio_smpng/... > >> you can find a sx locks rewriting using the optimized semantic of > >> rwlocks; in the end this might result in a valuable performance > >> improvement. > > Excellent work. > Do you have any benchmark results? You might have a hard time convincing > someone to commit this if you don't show that it's worth it, especially > if you consider how much it complicates the code. I'm going to produce them but before I would really add the support for adaptive spinning. jhb_at_ has a patch [1] that might reduce it but we need more benchmarks. I'm trying to provide them too. > > After have received very positive stress-test feedbacks from pho_at_ > > (that I would like to thank for his patience and work) I went ahead > > and rather completed the implementation. Now this is ready to be > > reviewed and possibly committed into the CVS. > > Even if I plan a longer work on this branch (about syncronizing > > primitives), I think it is time for a revision of the work done until > > now from SMPng people (jhb, kmacy, etc.) and possibily an inclusion > > into HEAD (patch actually is 58k...). > > > > Some few hints about the patch: > > - LOCK_DEBUG adds a dependence between sx.h and lock.h (as for rwlocks > > and mutex) and a the new options SXLOCK_NOINLINE is added > > - XFS locking is still disabled in the patch (I hope to do it for > > tomorrow, I'm in GMT+1). > > - Possibly the sleepqueue interface modifies and new flags might be > > documented in the manpages (and NOTES file too, in order to reflect > > SXLOCK_NOINLINE inclusion). > > - It misses still of the adaptive spinning code, but it can be > > inserted after without problems. > > We might want to avoid adaptive spinning, since sx locks may be held for > quite long periods of time. I'm not sure I got it. Adaptive spinning really only acts if the exclusive holder is running on another CPU. BTW, it needs more evaluations. > > You can download the code directly from perforce > > (//depot/user/attilio/attilio_smpng/...) but patches are available > > here: > > http://users.gufi.org/~rookie/works/patches/smpng06112006.diff > > http://users.gufi.org/~rookie/works/patches/_sx.h > > > > I hope you will enjoy it (feedbacks, ideas, comments would be very > > appreciated). > > > > Attilio > > A few comments: > --- //depot/vendor/freebsd/src/sys/i386/include/param.h 2006/01/09 06:10:20 > +++ //depot/user/attilio/attilio_smpng/i386/include/param.h 2006/10/03 > 21:33:06 > _at__at_ -109,6 +109,15 _at__at_ > #endif > > /* > + * Define our own cache alignment mask for syncronizing primitives. > Pentium4 > + * and Xeon want a 128-byte wide aligned syncronizing primitive in order to > + * minimize cache bus traffic on CPUs cache lines movements. > + */ > +#ifndef SYNC_ALIGN > +#define SYNC_ALIGN (128 - 1) > +#endif > + > > Please also add this to amd64, and maybe rename it to UMA_ALIGN_SYNC. > On other architectures, have it be the same as UMA_ALIGN_CACHE. This > way, you don't have to define SYNC_ALIGN in every C file that uses it. > > While there, you might also want to make UMA_ALIGN_CACHE the correct > size for i386/amd64 (64 bytes, on most machines, i believe?). Ideally, > this would be a variable set at boot time, depending on what the CPU > reports. 32 for ia32, 64 for amd64, I guess. BTW, I think you are right, I will modify. > I also feel that the part that makes turnstiles and sleepqueues 128 byte > aligned should be broken up as as a separate patch/commit, as it doesn't > really have much to do with your sx rewriting, as I understand it. Yes, but this branch contains all the modifies I'm doing in order to improve our syncronizing primitives. I'm a little bit scared about patch dimensions... (actually is 60k and it necessarily needs to grow). > --- //depot/vendor/freebsd/src/sys/kern/kern_sx.c 2006/08/15 18:31:36 > +++ //depot/user/attilio/attilio_smpng/kern/kern_sx.c 2006/11/06 02:24:27 > _at__at_ -78,13 +50,8 _at__at_ > sx_init(struct sx *sx, const char *description) > { > > - sx->sx_lock = mtx_pool_find(mtxpool_lockbuilder, sx); > - sx->sx_cnt = 0; > - cv_init(&sx->sx_shrd_cv, description); > - sx->sx_shrd_wcnt = 0; > - cv_init(&sx->sx_excl_cv, description); > - sx->sx_excl_wcnt = 0; > - sx->sx_xholder = NULL; > + sx->sx_lock = SX_UNHELD; > + sx->sx_desc = "sx lock"; > lock_init(&sx->sx_object, &lock_class_sx, description, NULL, > LO_WITNESS | LO_RECURSABLE | LO_SLEEPABLE | LO_UPGRADABLE); > > Shouldn't it be sx->sx_desc = description; ? right, thanks. > Keep up the good work! I hope to see some benchmark results and to see > it committed soon! :-) Thanks a lot for your feedbacks, Attilio -- Peace can only be achieved by understanding - A. EinsteinReceived on Sat Nov 11 2006 - 12:32:24 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:02 UTC