Tim Kientzle wrote:
> Scott Lambert wrote:
> > I've posted this to FreeBSD-ports and Nagios-Users without a nibble.
> >
> > [New Thread 28326280 (LWP 100051)]
> > [New Thread 28301140 (LWP 100222)]
> > (gdb) bt
> > #0 0x0807fe8b in get_next_comment_by_host ()
> > #1 0x08080940 in delete_host_acknowledgement_comments ()
> > #2 0x28331180 in ?? ()
> > #3 0x4aaac053 in ?? ()
> > #4 0x080cc394 in __JCR_LIST__ ()
>
> Build with debug symbols and try again; maybe you can get
> more detail. Also, check a couple of core dumps to
> see if it's crashing in the same place; that might
> also give a clue.
>
> Do the "New Thread" messages mean that Nagios is running
> multiple threads? If so, I wonder what the other
> thread is doing?

We've been trying to combat a performance issue in Nagios. One thread
handles incoming events (nsca etc) and data from the nagios.cmd pipe
file and writes files for processing in /var/spool/nagios/checkresults.
The other thread processes these files and updates the host state and
other data.

The threading broke profiling (I think) because when Nagios was compiled
with -pg it did no more than read its configuration. But this alone was
a pointer to the area where Nagios is poorly optimised: string
processing. Reading our configuration resulted in 65000000 calls to
strcmp. 65 million!!

We're battling to keep up with passive events from about 5000 hosts
every few minutes. The nagios.cmd thread struggles to keep up reading
from the fifo when there are about 4000 writers. And the worker thread
struggles to parse the checkresults files. They're big, but not *that*
big, maybe 80k to 120k lines, which it takes about 7 minutes to parse.

We also had to up kern.maxusers="1024" and kern.ipc.nmbclusters="131072"
to prevent the system starving network resources.

Ian

--
Ian Freislich

Received on Thu Oct 08 2009 - 11:08:39 UTC
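
To illustrate how a configuration load can reach tens of millions of string comparisons, here is a minimal sketch. It assumes, without having checked the Nagios source, that object names are resolved by scanning a list from the start; the host names, the one-reference-per-host workload, and both function names are hypothetical. Each equality test stands in for one strcmp() call, and the dict-based count is only an approximation of what a hash lookup costs.

```python
def count_linear_comparisons(names, lookups):
    """Resolve each lookup by scanning the list, counting string comparisons."""
    comparisons = 0
    for wanted in lookups:
        for name in names:
            comparisons += 1  # stands in for one strcmp() call
            if name == wanted:
                break
    return comparisons


def count_indexed_comparisons(names, lookups):
    """Resolve each lookup through a dict index.

    Counts one comparison per successful lookup, which approximates the
    single equality check a hash table performs after hashing the key.
    """
    index = {name: position for position, name in enumerate(names)}
    comparisons = 0
    for wanted in lookups:
        if wanted in index:
            comparisons += 1
    return comparisons


# Hypothetical workload: 5000 hosts, each referenced once during loading.
hosts = [f"host{i:04d}" for i in range(5000)]
linear = count_linear_comparisons(hosts, hosts)
indexed = count_indexed_comparisons(hosts, hosts)
print(linear, indexed)  # 12502500 5000
```

Under these assumptions a single pass over 5000 names already costs about 12.5 million comparisons, so a handful of such passes would be enough to land in the range reported above.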