Re: ntpd dies nightly on a server with jails

From: Cy Schubert <Cy.Schubert_at_komquats.com>
Date: Fri, 17 Mar 2017 20:52:12 -0700
In message <1489782793.40576.185.camel_at_freebsd.org>, Ian Lepore writes:
> On Fri, 2017-03-17 at 13:26 -0700, Don Lewis wrote:
> > On 17 Mar, O. Hartmann wrote:
> > 
> > > 
> > > Just some strange news:
> > > 
> > > I left the server the whole day with ntpd disabled and I didn't see
> > > the RTC gain even one second, even when stressing the machine.
> > > 
> > > But soon after restarting ntpd, I immediately noticed the clock was
> > > 30 minutes off! This morning, the discrepancy was almost 5 hours - it
> > > looked more like a weird adjustment to a time base other than UTC.
> > > 
> > > Over the weekend I'll leave the server with ntpd disabled and only the
> > > RTC running. I've the strange feeling that something is intentionally
> > > readjusting the ntpd time due to a misconfiguration or a rogue ntp
> > > server in the X.CC.pool.ntp.org
> > ntpd should recognize a single bad server and ignore it in favor of
> > the other servers that are sane.
> > 
> > It sounds like something is going off the rails once ntpd starts calling
> > adjtime().  What is the output of:
> > 	sysctl kern.clockrate
> > 
> > I'd suggest starting ntpd and running "ntpq -c pe" a few times a minute
> > and capturing its output to monitor the status of ntpd as it starts up
> > and to try to capture things going wrong.  You should probably disable
> > iburst in ntp.conf to give more visibility in the early startup.
> > 
> > For the first few minutes ntpd should just be getting reliable timestamp
> > info and won't start trying to adjust the clock until it has captured
> > enough samples and figured out which servers are best.  Then the
> > behaviour of the offset is the thing to watch.  If the initial offset is
> > large enough, ntpd will step the clock once to get it close to zero,
> > otherwise it will just use adjtime() to slowly push the offset towards
> > zero.  I think though that you will see the offset start gyrating madly.
> > 
> > You might want to set /var/db/ntpd.drift to zero beforehand if there is
> > an insane value in there.  If the initial drift value is bogus, ntpd will
> > try to use it, which will push the time offset away from zero so fast
> > that it will decide to keep stepping the clock back to zero before it can
> > capture enough samples from the external servers to determine the true
> > local clock drift rate.
> 
> Do not set ntpd.drift contents to zero.  Delete the file.  There's a
> huge difference between a file that says the clock is perfect and a
> missing file which triggers ntpd to do a 15-minute frequency
> measurement to come up with the initial drift correction.

Yes. And, without debugging output and/or a dump, I don't think we'll be 
any closer to the truth. Until then the best we can do is make educated 
guesses.
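
As a starting point for collecting that output, here is a minimal sketch
that runs a command repeatedly and logs each result under a UTC timestamp,
so that snapshots of "ntpq -c pe" and "sysctl kern.clockrate" can be lined
up against the moment the clock jumps. The helper name capture() and the
log file name in the usage note are my own placeholders, not part of ntpd
or any existing tooling. Note that the timestamps come from the local
clock, which is the very thing under suspicion, so treat them as ordering
hints rather than true times.

```python
import subprocess
import time
from datetime import datetime, timezone


def capture(cmd, samples, interval, out):
    """Run cmd `samples` times, `interval` seconds apart.

    Each run is written to the file-like object `out`, preceded by a
    header line carrying the local clock's idea of UTC and the command.
    `capture` is a hypothetical helper written for this sketch.
    """
    for i in range(samples):
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=30
            )
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Keep logging even if the command is missing or hangs.
            body = f"error running {cmd!r}: {exc}\n"
        out.write(f"=== {stamp} {' '.join(cmd)}\n{body}")
        out.flush()
        if i < samples - 1:
            time.sleep(interval)
```

To use it, open a log file in append mode (for example "ntpq-peers.log")
and call capture(["sysctl", "kern.clockrate"], 1, 0, log) once, followed by
capture(["ntpq", "-c", "pe"], 30, 20, log) right after starting ntpd. That
gives roughly ten minutes of peer status at three samples per minute; the
counts are arbitrary and can be adjusted.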


-- 
Cheers,
Cy Schubert <Cy.Schubert_at_cschubert.com>
FreeBSD UNIX:  <cy_at_FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.
Received on Sat Mar 18 2017 - 02:52:28 UTC
