Re: 4.7 vs 5.2.1 SMP/UP bridging performance

From: Scott Long <scottl_at_freebsd.org>
Date: Fri, 07 May 2004 09:02:12 -0600
Robert Watson wrote:
> On Fri, 7 May 2004, Brad Knowles wrote:
> 
> 
>>At 10:55 PM -0400 2004/05/06, Robert Watson wrote:
>>
>>
>>> On occasion, I've had conversations with Peter Wemm about providing HAL
>>> modules with optimized versions of various common routines for specific
>>> hardware platforms.  However, that would require us to make a trade-off
>>> between the performance benefits of inlining and the performance benefits
>>> of a HAL module...
>>
>>	I'm confused.  Couldn't you just do this sort of stuff as
>>conditional macros, which would have both benefits? 
> 
> 
> Well, the goal of introducing HAL modules would be that you don't have to
> recompile the kernel in order to perform local hardware-specific
> optimization of low level routines.  I.e., you could substitute faster
> implementations of zeroing, synchronization, certain math routines, etc
> based on the CPU discovered at run-time.  While you can have switch
> statements, etc, it's faster just to relink the kernel to use the better
> implementation for the available CPU.  However, if you do that, you still
> end up with the function call cost, which might well outweigh the
> benefits of specialization.
> 
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert_at_fledge.watson.org      Senior Research Scientist, McAfee Research
> 

It really depends on how you link the HAL module in.  Calling indirectly
through function pointers is pretty darn slow, and I suspect that the
long pipeline of a P4 makes this even worse.  Switching to a better
instruction might save you 20 cycles, but the indirect call to do it
might cost you 30, and that assumes that the branched-to instruction
stream is still in the L1 cache and that twiddling %esp and %ebp causes
no pipeline stalls of its own.  Even without the indirect jump, all of the
housekeeping that goes into making a function call might drown out most
of the benefits.  The only way that this might make sense is if you move the
abstraction upwards and have it encompass more common code, or do some
sort of self-modifying code scheme early in the boot.  The alternative
might be to have the HAL be a compile-time option like Brad hinted at.
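
To make the trade-off concrete, here is a rough sketch in C of the two
linkage styles being compared.  Every name in it (zero_fn, hal_zero,
hal_init, HAL_ZERO, HAL_FAST_ZERO, zero_generic, zero_tuned) is made up
for illustration and is not an existing FreeBSD interface, and the
"tuned" routine is only a memset() stand-in for a real CPU-specific
implementation.

```c
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none are real kernel interfaces. */
typedef void (*zero_fn)(void *dst, size_t len);

/* Portable fallback: plain byte-at-a-time zeroing. */
static void zero_generic(void *dst, size_t len)
{
    unsigned char *p = dst;

    while (len-- > 0)
        *p++ = 0;
}

/* Stand-in for a routine tuned to one CPU family. */
static void zero_tuned(void *dst, size_t len)
{
    memset(dst, 0, len);
}

/*
 * Run-time HAL: a single pointer chosen once at boot.  Every later
 * call through hal_zero is an indirect call, which is the cost
 * discussed above.
 */
static zero_fn hal_zero = zero_generic;

static void hal_init(int cpu_has_fast_zero)
{
    hal_zero = cpu_has_fast_zero ? zero_tuned : zero_generic;
}

/*
 * Compile-time HAL: the choice is fixed when the kernel is built, so
 * the call is direct and the compiler is free to inline it.
 */
#ifdef HAL_FAST_ZERO
#define HAL_ZERO(dst, len) zero_tuned((dst), (len))
#else
#define HAL_ZERO(dst, len) zero_generic((dst), (len))
#endif
```

A caller would use hal_zero(buf, len) in the first scheme and
HAL_ZERO(buf, len) in the second.  The first adapts to whatever CPU is
found at boot but pays for the indirect call each time; the second
needs a rebuild per target but costs nothing extra at the call site.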

Scott
Received on Fri May 07 2004 - 06:02:39 UTC
