On Wed, 2018-06-27 at 10:05 -0700, Rodney W. Grimes wrote:
> > On Wed, Jun 27, 2018 at 10:36 AM, Jung-uk Kim <jkim_at_freebsd.org> wrote:
> >
> > > On 06/27/2018 03:14, Andriy Gapon wrote:
> > > >
> > > > It seems that TSC calibration in virtual machines sometimes can do more harm
> > > > than good.  Should we default to trusting the information provided by a hypervisor?
> > > >
> > > > Specifically, I am observing a problem on GCE instances where calibrated TSC
> > > > frequency is about 10% lower than advertised frequency.  And apparently the
> > > > advertised frequency is the right one.
> > > >
> > > > I found this thread with similar reports and a variety of workarounds from
> > > > administratively disabling the calibration to switching to a different timecounter:
> > > > https://lists.freebsd.org/pipermail/freebsd-cloud/2017-January/000080.html
> > >
> > > We already do that for VMware hosts since r221214.
> > >
> > > https://svnweb.freebsd.org/changeset/base/221214
> > >
> > > We should do the same for each hypervisor.
> > >
> > > Jung-uk Kim
> >
> > We probably should.  But why does calibration fail in the first place?  If
> > it can fail in a VM, then it can probably fail on bare metal too.  It would
> > be worth investigating.
>
> No, the failure in a VM is unique to a VM, it has to do with the fact
> you have the hypervisor timeslicing a CPU that you believe to be 100%
> dedicated to you.
>
> There are several white papers, including one from VMWare about what
> they have done to help with the time keeping problems.
>
> What is suggested above would be a correct thing to do.
> Bhyve creates these issues as well, and use of certain timers
> in a bhyve guest can cause you nightmares with ntp.

IIRC, bhyve's arithmetic when doing timer emulation leads to roundoff
errors that accumulate to effectively make the emulated timer run
off-frequency.  The HPET timer was trivial to fix by just redefining it
to run at a power-of-2 frequency to eliminate rounding errors.  The
other timers have to run at fixed frequencies, so better arithmetic
will be the way to fix them.  I vaguely remember that being harder to
do than to say because of the way the code is currently structured,
which is why I just did the easy fix to the HPET so that people would
have at least one usable timer that didn't give ntpd fits in guest
OSes.

-- Ian

Received on Wed Jun 27 2018 - 16:49:31 UTC
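
The roundoff effect described in the reply above can be illustrated with a
small sketch. This is not bhyve's code: it assumes, purely for illustration,
that the emulation keeps time in 32.32 binary fixed-point seconds and
truncates the per-tick period to an integer number of those units. The
function names are hypothetical, and the two non-power-of-2 frequencies are
the classic ACPI PM timer and i8254 PIT rates, picked only as examples of
fixed frequencies.

```python
# Illustrative sketch only; not taken from bhyve.
# Assumption: time is a 32.32 fixed-point count of seconds, and the
# per-tick period is truncated to an integer number of those units.

FRAC_BITS = 32
ONE_SECOND = 1 << FRAC_BITS  # one second in fixed-point units


def tick_period_fixed(freq_hz: int) -> int:
    """Per-tick period in fixed-point units, truncated (hypothetical helper)."""
    return ONE_SECOND // freq_hz


def effective_freq(freq_hz: int) -> float:
    """Frequency the emulated timer really runs at after truncation."""
    return ONE_SECOND / tick_period_fixed(freq_hz)


def drift_ppm(freq_hz: int) -> float:
    """Frequency error in parts per million caused by the truncated period."""
    return (effective_freq(freq_hz) - freq_hz) / freq_hz * 1e6


if __name__ == "__main__":
    examples = [
        ("power of 2 (2**24 Hz)", 1 << 24),
        ("3.579545 MHz (fixed)", 3_579_545),
        ("1.193182 MHz (fixed)", 1_193_182),
    ]
    for name, freq in examples:
        print(f"{name:24s} period={tick_period_fixed(freq):5d} "
              f"drift={drift_ppm(freq):8.1f} ppm")
```

Under these assumptions a power-of-2 frequency divides one second exactly, so
the truncated period loses nothing and the drift is zero. The fixed
frequencies leave a remainder on every tick, and that remainder shows up as a
constant frequency offset of the kind that upsets ntpd in a guest.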