Re: NIS exhausts system resources

From: Dan Pelleg <daniel+bsd_at_pelleg.org>
Date: 08 Apr 2003 21:35:19 -0400

Terry Lambert <tlambert2_at_mindspring.com> writes:

> Dan Pelleg wrote:
> > This sounds plausible, thanks.
> > 
> > However, in the past, when configuring NIS for the first time, I did see
> > the machine go unresponsive on me as soon as I did a "domainname foo". This
> > was probably at a time when the server wasn't set up correctly. But why
> > would that immediately cause high resource usage? There shouldn't be any
> > name or user lookups taking place. This is a very quiet machine - I don't
> > think it was running more than very lightly loaded postfix and sshd at the
> > time, and I was the only user logging in. Name lookups, by the way, should
> > not go out anyway - the machine is running its own named.
> 
> On the contrary; you have to know how "domainname foo" will affect
> the system operation.
> 
> The domainname(1) command calls the setdomainname(3) library function,
> which is implemented in /usr/src/lib/libc/gen/setdomainname.c;
> this in turn sysctl's it down to the MIB entry "kern.domainname".
> 
> None of this causes ypbind to rebind to the new domain name.
> 
> As a result, all future requests return an error, because it is
> asking for data which the server does not have, and which the
> client, being in the wrong NIS domain, is not permitted to request.
> 
> If you are going to use domainname(1), use it as part of an rc
> script, before ypbind is started.
> 
> Alternately, kill ypbind, use domainname(1), and then restart
> ypbind.
> 
> Minimally, kick ypbind in the head with a HUP; if you have
> specified a domainname on the command line when starting ypbind
> (e.g. using "-S"), then you will have to kill and restart it.
> 
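
The ordering described above (stop ypbind, change the domain, start ypbind again) could be sketched as a small sh function. The name set_nis_domain and the RUN variable are made up for illustration, and the sketch assumes ypbind is managed through /etc/rc.d/ypbind with the NIS client enabled in rc.conf.

```sh
#!/bin/sh
# Sketch only: change the NIS domain while ypbind is stopped.
# RUN is a hypothetical hook: set RUN=echo to print the commands
# instead of executing them.
RUN=${RUN:-}

set_nis_domain() {
    new_domain=$1
    if [ -z "$new_domain" ]; then
        echo "usage: set_nis_domain <domain>" >&2
        return 1
    fi
    $RUN /etc/rc.d/ypbind stop      # stop ypbind so it cannot keep the old binding
    $RUN domainname "$new_domain"   # sets kern.domainname via setdomainname(3)
    $RUN /etc/rc.d/ypbind start     # ypbind now binds using the new domain
}
```

Running it with RUN=echo first (for example `RUN=echo; set_nis_domain foo`) prints the three commands without touching the system, which is a cheap way to check the order before doing it as root.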

Ok, I re-ran the test for more conclusive results. I now do:
 - domainname foo (no server for foo exists)
 - /etc/rc.d/ypbind restart
 - ypcat passwd

It will quickly make the system unresponsive. top(1) shows lots of rpcbind
processes, in the kqread and nanslp states. Shortly afterwards I'm running
out of swap (along with the other symptoms I reported).
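
To put numbers on the rpcbind pile-up instead of eyeballing top(1), something along these lines could log the process count once a second. This is a rough sketch: count_procs and watch_rpcbind are hypothetical helper names, and it assumes ps(1) accepts "-ax -o comm" on the system in question.

```sh
#!/bin/sh
# Count processes whose command name is exactly $1.
# Reads one command name per line on stdin (e.g. from "ps -ax -o comm").
count_procs() {
    awk -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}

# Hypothetical monitoring loop: print a timestamp and the rpcbind count
# every second until interrupted.
watch_rpcbind() {
    while :; do
        printf '%s rpcbind=%s\n' "$(date +%H:%M:%S)" \
            "$(ps -ax -o comm | count_procs rpcbind)"
        sleep 1
    done
}
```

Starting watch_rpcbind on a second console before running the ypcat step would show how fast the count grows before the machine stops responding.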

I can reproduce this, with maxfiles at either 1084 (the default) or 10000
set at the bootloader.

As another datapoint, a similarly configured machine, with a 500MHz
processor and 384MB or 128MB of memory, passes the test above without any
ill effects. I don't have bigger memory sticks for the "small" machine, nor
do I have smaller ones for the 500MHz one. Blame Moore's law.

As I said, setting domainname to some unserved "foo" in rc.conf
will also show similar behaviour, and the system will not boot
(either that or my patience runs out before resetting it).

> It is not in the /usr/src/sys/i386/conf/ directory, it's in the
> global one in /usr/src/sys/conf/.
> 
> Yes, the default for MAXUSERS is 0, which is supposed to mean
> "autotune everything correctly".
> 
> The default ends up setting this value, a tunable int, to 32.  See
> the sources in: /usr/src/sys/kern/subr_param.c.
> 
> A better number to raise might be MAXFILES; it's set based on the
> number of physical pages in the system, divided by 12.
> 
> On your system, it's probably 1800 or so; to find out, type:
> 
> 	 sysctl kern.maxfiles
> 
> Note that this thing pretends you can set it via a sysctl; you
> *CANNOT*; there are statically sized tables that do not grow,
> and your number of network, etc., connections will not be increased
> by you setting this via sysctl -- SO DON'T DO THAT.
> 
> If you know how to use the boot loader rc file, then you can set
> "kern.maxfiles=10000" in the loader.  If you don't, then use
> "options MAXFILES=10000" and recompile your kernel, instead.
> 
> The reason I suggested MAXUSERS first is that there are other
> things that it tunes, and you might bump your head on those
> things, too, later (i.e. preventative maintenance).
> 
> -- Terry
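
For the loader route, the setting ends up as a name="value" line in the loader configuration. A minimal sketch of a helper that writes such a line without duplicating it follows; set_loader_tunable is a hypothetical name, and /boot/loader.conf as the target file is an assumption about the usual FreeBSD layout.

```sh
#!/bin/sh
# set_loader_tunable FILE NAME VALUE
# Sketch only: drop any existing NAME= line from FILE, then append
# NAME="VALUE". NAME is used as a grep pattern, so "." matches any character.
set_loader_tunable() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    # Keep every line that does not already assign NAME.
    grep -v "^${name}=" "$file" > "$tmp" 2>/dev/null || :
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

As root, `set_loader_tunable /boot/loader.conf kern.maxfiles 10000` would then take effect at the next boot, and `sysctl kern.maxfiles` afterwards shows whether the loader value was picked up.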

I'm ready to dismiss this as a bug that's only triggered on slow and
small-memory systems. I'll dig around for tunables that will still let me
survive it. Seeing as kern.maxfiles=10000 doesn't work, what would you
suggest for MAXUSERS for a machine with 64MB of RAM?

-- 

  Dan Pelleg
Received on Tue Apr 08 2003 - 16:35:47 UTC
