Re: prefer tsc timecounter when it's good

From: Jung-uk Kim <jkim_at_FreeBSD.org>
Date: Wed, 27 Apr 2011 15:10:45 -0400
On Wednesday 27 April 2011 03:07 am, Andriy Gapon wrote:
> on 26/04/2011 20:49 Jung-uk Kim said the following:
> > Can you please test attached patch?  You can get it from here,
> > too:
> >
> > http://people.freebsd.org/~jkim/tsc_smp_test.diff
>
> I am planning on testing the patch, but I am a little bit busy with
> other things at the moment.

No problem.

> The idea looks good to me.  The code is a little bit hard to follow
> :-)

Sorry.  The patch was updated to make it little easier to read:

http://people.freebsd.org/~jkim/tsc_smp_test2.diff

> I would use three separate array instead of a single array with 
> triple size (for clarity).  The arrays would have to be placed
> inside a structure for passing to smp_rendezvous.

Array of pointers...  I tried that earlier but it made the code little 
bit more complex overall, actually. :-/

> Also, perhaps a single rendezvous per iteration would be
> sufficient, so that you get four values to compare per a pair of
> CPUs, instead of current six.  Again to make the code simpler /
> more readable.  That would allow to expand TSC_READ macro as well
> (two copies of the function would take less lines than the macro).

smp_rendezvous() is very expensive, so I wanted to get more samples 
per call.

> BTW, not sure if you actually need 'volatile' inside tsc_read_X.

No, I don't, removed.  FYI, it was part of my early experiments to 
find the cheapest way to work around cache issues, which never 
worked.  Eventually, I ended up with calling smp_cache_flush(). :-(

> > Currently this patch samples 3,000 times to determine if any CPU
> > has out-of-order TSC but it may be too much, especially for large
> > SMP machines.  If it takes too long, I'll lower the number. 
> > Please report if that's the case.
> >
> > Please note this patch also changes HPET, ACPI-fast and TSC
> > qualities. However, TSC on SMP does not change the default
> > quality, i.e., HPET or ACPI timer will be chosen by default,
> > because we cannot be sure if they'll drift later.  If the user is
> > sure that they don't drift AND it is absolutely constant,
> > kern.timecounter.smp_tsc tunable can be used to set better
> > quality.
>
> - You changing the relative priorities of HPET and ACPI-fast.  I
> support this change (some others may not), but please make it as a
> separate commit.

Yes, that is my intention but it's continuation of your original 
patch. :-)

> - Not sure if the quality test code is of much use if a user has to
> set some tunable to actually use it over HPET or ACPI-fast.  I
> thought that the whole point was in automatically choosing the best
> timecounter.

It can be extended to make better selection depending on vendor, 
model, topology, etc. if necessary.

> I would go the opposite way - if automatic selection of TSC causes
> any trouble then provide a way to disable it.

POLA...

Jung-uk Kim
Received on Wed Apr 27 2011 - 17:11:10 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:13 UTC