Nagios SIGSEGV on FreeBSD 8

From: Scott Lambert <lambert_at_lambertfam.org>
Date: Tue, 22 Sep 2009 16:29:05 -0500

I've posted this to FreeBSD-ports and Nagios-Users without a nibble.

I've been running a FreeBSD 8-BETA2 server for DNS on a network I
recently took over.  No problems.  We needed to get Nagios running on
that network to watch all the hosts in RFC 1918 space.  Taking the easy
route, I just installed the Nagios 3.0.6 port on this 8-BETA2 box.

Nagios runs great until an acknowledged down host (with acknowledgment
comment) comes back up.  Nagios exits on a SIGSEGV.  It seems to only
happen when we have retention data (retention.dat) showing the host
down.  If we just restart Nagios without removing the retention.dat
file, it exits on SIGSEGV the next time it tries to mark the host up.  I
upgraded to the nagios-devel (Nagios 3.1.2) port and we have the same
problem.

If the host comes back up before Nagios writes the retention data file
showing the host down, we don't seem to see the problem.  The retention
data file keeps state when Nagios is restarted.  There is the option of
checkpointing the file so that if Nagios does not get to write the new
version on exit, the data in the file won't be weeks old when Nagios is
restarted.  We are using the checkpoint option with one hour intervals.

Have there been threading, or other, changes in 8 vs 7 which could cause
something like this?  Unfortunately, I don't have ANY other *nix boxes
on this network.  My other Nagios box is still running FreeBSD 4 and
Nagios 2.  That combination has been dead reliable.

This is all I have figured out with gdb:

sudo gdb -c /var/coredumps/nagios-52050.core /usr/local/bin/nagios
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `nagios'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x0807fe8b in get_next_comment_by_host ()
[New Thread 28326280 (LWP 100051)]
[New Thread 28301140 (LWP 100222)]
(gdb) bt
#0  0x0807fe8b in get_next_comment_by_host ()
#1  0x08080940 in delete_host_acknowledgement_comments ()
#2  0x28331180 in ?? ()
#3  0x4aaac053 in ?? ()
#4  0x080cc394 in __JCR_LIST__ ()
#5  0x28342f00 in ?? ()
#6  0x00000000 in ?? ()
#7  0xbfbfe858 in ?? ()
#8  0x08071c15 in handle_host_state ()
Previous frame inner to this frame (corrupt stack?)

Here is the code for get_next_comment_by_host:

comment *get_next_comment_by_host(char *host_name, comment *start){
	comment *temp_comment=NULL;

	if(host_name==NULL || comment_hashlist==NULL)
		return NULL;

	if(start==NULL)
		temp_comment=comment_hashlist[hashfunc(host_name,NULL,COMMENT_HASHSLOTS)];
	else
		temp_comment=start->nexthash;

	for(;temp_comment && compare_hashdata(temp_comment->host_name,NULL,host_name,NULL)<0;temp_comment=temp_comment->nexthash);

	if(temp_comment && compare_hashdata(temp_comment->host_name,NULL,host_name,NULL)==0)
		return temp_comment;

	return NULL;
	}
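
For anyone who wants to experiment with this lookup outside of Nagios,
the following is a minimal standalone sketch of the same walk, not the
Nagios source.  Assumptions: demo_comment and demo_next_by_host are
hypothetical names, the chain is passed in directly instead of being
picked from comment_hashlist via hashfunc(), and plain strcmp() stands
in for compare_hashdata().  The one thing it illustrates is that start
is dereferenced to find the next entry, so a caller has to hand in a
pointer to a comment that is still allocated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the Nagios comment struct: only the two
   fields the lookup touches. */
typedef struct demo_comment {
	const char *host_name;
	struct demo_comment *nexthash;
} demo_comment;

/* Same shape as get_next_comment_by_host(), but over a single chain
   that is passed in explicitly and kept sorted by host_name. */
demo_comment *demo_next_by_host(demo_comment *chain, const char *host_name, demo_comment *start){
	demo_comment *c=NULL;

	if(host_name==NULL || chain==NULL)
		return NULL;

	/* start is dereferenced here, so it must still point at a live node */
	c=(start==NULL)?chain:start->nexthash;

	/* skip entries that sort before host_name */
	while(c && strcmp(c->host_name,host_name)<0)
		c=c->nexthash;

	/* the chain is sorted, so the first non-smaller entry either matches or nothing does */
	if(c && strcmp(c->host_name,host_name)==0)
		return c;

	return NULL;
	}
```

With a chain sorted as db1, web1, web1, calling it with start=NULL and
"web1" returns the first web1 entry, and passing that entry back as
start returns the second.  Whether a stale start pointer is what
actually happens in the crash above is not something the backtrace
confirms.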

I haven't found any reports of similar issues on the Nagios list or
elsewhere on Google.  

I may just be the first sucker to try to run Nagios on FreeBSD 8. :-)

Thanks,

-- 
Scott Lambert                    KC5MLE                       Unix SysAdmin
lambert_at_lambertfam.org
Received on Tue Sep 22 2009 - 19:39:21 UTC
