Re: TSC Timecounter and multi-core/SMP

From: David O'Brien <obrien_at_freebsd.org> Date: Thu, 24 Apr 2008 09:47:34 -0700

On Fri, Apr 18, 2008 at 10:54:53AM -0700, Julian Elischer wrote:
> Poul-Henning Kamp wrote:
>> In message <48080276.3040203_at_elischer.org>, Julian Elischer writes:
>>> You'd think that an invariant sync'd clock (fast to read) of some
>>> type would have been done by someone by now.. The software people
>>> have been asking for this for the last decade at least.
>
>> Actually one of the original design documents for SAGE stressed that
>> such hardware was crucially important "for any system operating
>> in real time", so yes, the HW people have had adequate notice.
>> Poul-Henning
> 
> I'm certain that earlier systems had it as a requirement but I wasn't
> willing to lump the IBM 407 or 1620  in to the same bucket as an SMP
> PC with the ability to change the frequency on each CPU. I remember
> that the MP vaxen and PDPs had good timers.. and I'm certain the MP
> IBMs did too.

You're also speaking of a world where the HW vendor controlled the OS and
could change the OS at any time to match changes in the HW.  That has not
been true of the x86 world for nearly three decades.

> How hard can it be?

Quite hard - especially if you want it fast (on the order of say 10
cycles like TSC is).

> An instruction that gives a 64 bit counter, in some reasonable
> granularity that is run at the same speed for all CPUS in a system
> regardless of the speed each cpu is running..

AMD Greyhound (Family 10h) gives this (well, at 60'ish cycles).  What you
didn't ask for here (but I think you did in another email) is for all the
values to be the same.  That means off-chip signaling.  The HPET was
one attempt at addressing this, but it centralized the counter and thus
the reads of it are serialized and >100 cycles.

It is my understanding that some SPARC systems have the time source you
are looking for.

What you're really asking for is for wall-time to be split out of the
processor cycle counter (which is what TSC really is).  The fault is
that OSes have been abusing the TSC as a wall-time source.
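
To make the distinction concrete, here is a minimal C sketch of what
treating a cycle counter as a wall-time source implies. It is not
FreeBSD's timecounter code; cycles_to_ns() and cycles_elapsed() are
made-up names for illustration, and it assumes a counter that ticks at
the core clock. The conversion has to assume one fixed frequency, so the
computed time drifts as soon as the core runs at any other speed.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Convert a cycle count to nanoseconds, assuming the counter ticks at a
 * fixed, known rate of hz cycles per second (hypothetical helper).
 * Whole seconds and the remainder are handled separately so the
 * remainder multiply cannot overflow 64 bits for hz below ~18 GHz.
 */
static uint64_t
cycles_to_ns(uint64_t cycles, uint64_t hz)
{
	assert(hz > 0 && hz < UINT64_MAX / NS_PER_SEC);
	return (cycles / hz) * NS_PER_SEC + (cycles % hz) * NS_PER_SEC / hz;
}

/*
 * Cycles a counter accumulates over real_ns of real time when it
 * actually ticks at actual_hz (hypothetical helper, used only to model
 * a core whose clock differs from the nominal one).
 */
static uint64_t
cycles_elapsed(uint64_t real_ns, uint64_t actual_hz)
{
	assert(actual_hz > 0 && actual_hz < UINT64_MAX / NS_PER_SEC);
	return (real_ns / NS_PER_SEC) * actual_hz +
	    (real_ns % NS_PER_SEC) * actual_hz / NS_PER_SEC;
}
```

With a nominal 2 GHz, a core that spends one real second throttled to
1 GHz yields cycles_to_ns(cycles_elapsed(1000000000, 1000000000),
2000000000), which is 500000000: half a second has gone missing.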

x86 processor companies would like to make this change (separating the
TSC from the clock time source), but what they always hear is "there is
way too much software written that uses X to change".

> While nsecs would be nice even usecs might do.

Nope, not really.  Every OS I know of that has tried to use the HPET or
ACPI PMtimer instead of TSC cannot stand the latency and thus fights for
ways to go back to the TSC.  [FreeBSD is mostly an exception, but the
fact we're having this discussion... says we're not totally satisfied
with ACPI PMtimer or HPET.]

> They don't even have to be in sync as long as the offset
> between them is constant (though that would be nice).

This is what AMD's Greyhound (family 10h) *finally* gives [AMD users].
I think Intel Core2 does too - but at a price of lower granularity.
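
As a rough illustration of why a constant offset is good enough, here is
a minimal C sketch, assuming counters that all tick at the same rate.
NCPU, cpu_offset[] and normalize_count() are hypothetical names with
made-up values, and how the offsets get measured is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4	/* illustrative CPU count, not from this thread */

/*
 * Hypothetical per-CPU offsets in counter ticks: the value CPU n's
 * counter shows minus the value CPU 0's counter shows at the same
 * instant.  Constant only if all counters run at the same rate.
 */
static const int64_t cpu_offset[NCPU] = { 0, 1500, -320, 47 };

/* Map a raw counter value read on `cpu` onto CPU 0's timeline. */
static uint64_t
normalize_count(uint64_t raw, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	/* Unsigned wraparound makes subtracting a negative offset an add. */
	return raw - (uint64_t)cpu_offset[cpu];
}
```

A thread that reads 11500 on CPU 1 and another that reads 10000 on CPU 0
at the same instant both end up with 10000, so timestamps stay
comparable across a migration.  If the rates differ, no fixed table like
this can work.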

> hardware people don't seem to realise the importance
> of this. and keep throwing it out to gain/save a pin or to save
> some transistors for some other feature.

NO.  The HW people *DO* understand this.  It has nothing to do with
saving a pin or some transistors.  Multi-core and 6MB caches are here
today because we have an abundance of transistors.  AMD Opterons now
have 1207 pins.

A large part of what gets in the way of change is the SW folks, who
would need to change on a dime (and go back and change older SW too).
Would we be willing to go change time sources for FreeBSD 6.4?  Would
Microsoft for w2k3?  Or Red Hat for RHEL4?  (HW vendors want to sell
product, which means selling to customers using *released* software,
not requiring an OS that is 5 years out to take advantage of it.)

-- 
-- David  (obrien_at_FreeBSD.org)