Re: ntpd segfaults on start

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Mon, 9 Sep 2019 21:44:46 +0300

On Mon, Sep 09, 2019 at 12:13:24PM -0600, Ian Lepore wrote:
> On Mon, 2019-09-09 at 09:30 -0700, Rodney W. Grimes wrote:
> > > On Sat, 2019-09-07 at 09:28 -0700, Cy Schubert wrote:
> > > > In message <20190907161749.GJ2559_at_kib.kiev.ua>, Konstantin
> > > > Belousov writes:
> > > > > On Sat, Sep 07, 2019 at 08:45:21AM -0700, Cy Schubert wrote:
> > > > > > > I've been able to set the memlock rlimit as low as 20 MB. The
> > > > > > > issue is letting it default to 0 which allows ntp to mlockall()
> > > > > > > anything it wants. ntpd on my sandbox is currently using 18 MB.
> > > > > 
> > > > > > Default stack size on amd64 is 512M, and default stack gap
> > > > > > percentage is 3%. This means that the gap can be as large as
> > > > > > ~17MB. If 3MB is enough for the stack of the main thread of
> > > > > > ntpd, then fine.
> > > > 
> > > > The default stack is 200K, which is also tuneable in ntp.conf.
> > > > 
> > > > [...]
> > > 
> > > I haven't seen anyone ask what I consider to be the crucial
> > > question yet:  why are we locking ntpd into memory by default at all?
> > > 
> > > I have seen two rationales for ntpd using mlockall() and
> > > setrlimit(): 
> > > 
> > >    - There are claims that it improves timing performance.
> > > 
> > >    - Because ntpd is a daemon that can run for months at a time,
> > >    setting limits on memory and stack growth can help detect and
> > >    mitigate against memory leak problems in the daemon.
> > 
> > Doesn't locking this memory down also protect ntpd from OOM kills?
> > If so that is a MUST preserve functionality, as IMHO killing ntpd
> > on a box that has it configured is a total no win situation.
> > 
> 
> Does it have that effect?  I don't know.  But I would argue that that's
> a separate issue, and we should make that happen by adding
> ntpd_oomprotect=YES to /etc/defaults/rc.conf

Wiring process memory has no effect on OOM selection. Moreover, because
every page that could potentially be allocated is allocated for real
after mlockall(), the size of the vmspace, as accounted by OOM, is the
largest it will be over the whole lifetime of the process.

On the other hand, code execution times are not predictable if the
process's pages can be paged out. Under severe load the next instruction
might take several seconds or even minutes to start, which is quite
unlike scheduler delays. That introduces jitter into the local time
measurements and into their use in userspace. Wouldn't this affect
the accuracy?
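
To make the two points above concrete, here is a rough sketch in C. It is
not taken from ntpd; the function names lock_process_memory() and
max_clock_gap_ns() are made up for illustration. The first caps
RLIMIT_MEMLOCK and then wires the process with mlockall(), roughly the
combination discussed in this thread. The second reports the largest gap
seen between back-to-back clock reads, which is one crude way to observe
the kind of jitter I mean.

```c
/* Harmless where unused; exposes MCL_* and clock_gettime() on glibc
 * when building with a strict -std= option. */
#define _DEFAULT_SOURCE

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: set the soft RLIMIT_MEMLOCK to limit_bytes
 * (clamped to the hard limit), then wire all current and future pages.
 * Returns 0 on success, or the errno of the call that failed.
 */
int
lock_process_memory(rlim_t limit_bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return (errno);
	/* Never ask for more than the hard limit allows. */
	if (rl.rlim_max != RLIM_INFINITY && limit_bytes > rl.rlim_max)
		limit_bytes = rl.rlim_max;
	rl.rlim_cur = limit_bytes;
	if (setrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return (errno);
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return (errno);
	return (0);
}

/*
 * Hypothetical helper: read the monotonic clock `samples` times in a
 * tight loop and return the largest gap between consecutive reads, in
 * nanoseconds, or -1 if the clock cannot be read.
 */
int64_t
max_clock_gap_ns(int samples)
{
	struct timespec prev, now;
	int64_t gap, worst = 0;

	if (clock_gettime(CLOCK_MONOTONIC, &prev) != 0)
		return (-1);
	for (int i = 0; i < samples; i++) {
		if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
			return (-1);
		gap = (int64_t)(now.tv_sec - prev.tv_sec) * 1000000000LL +
		    (int64_t)(now.tv_nsec - prev.tv_nsec);
		if (gap > worst)
			worst = gap;
		prev = now;
	}
	return (worst);
}
```

One would call lock_process_memory(20u * 1024 * 1024) early at startup
and compare max_clock_gap_ns() with and without wiring while the machine
is paging heavily. Note that the gaps it reports also include ordinary
scheduler preemption, so it only gives an upper bound on the stall.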

> 
> Right now only syslogd has oomprotect set to YES by default.  Maybe
> that's a good choice -- once we start declaring one daemon to be more
> important than others, you'll discover there's a whole back lot full of
> bikesheds that need painting.
> 
> So maybe we should just document ntpd_oomprotect=YES in some more
> prominent way.  If we were to add a comment block to ntp.conf
> describing rlimit, that might be a good place to mention setting
> ntpd_oomprotect in rc.conf.
> 
> -- Ian
> 
Received on Mon Sep 09 2019 - 16:45:00 UTC
