What's New in Version 2.0
Important: Make sure you read through the documentation (especially the FAQs) before sending a question to the mailing lists.
Change Log
The change log for Nagios can be found online at http://www.nagios.org/changelog.php or in the Changelog file in the root directory of the source code distribution.
Known Issues
There is a known issue that can affect Nagios 2.0 on FreeBSD systems. Hopefully this problem can be fixed in a 2.x release...
- FreeBSD and threads. On FreeBSD there's a native user-level implementation of threads called 'pthread' and there's also an optional 'linuxthreads' port (from the ports collection) that uses kernel hooks. Some folks from Yahoo! have reported that using the pthread library causes Nagios to pause under heavy I/O load, causing some service check results to be lost. Switching to linuxthreads seems to help, but does not fix the problem. The lock happens in liblthread's __pthread_acquire(), which can never acquire the spinlock. It happens when the main thread forks to execute an active check. On the second fork, the grandchild process is created but never returns from liblthread's fork wrapper, because it's stuck in __pthread_acquire(). Maybe some FreeBSD users can help out with this problem.
Changes and New Features
- Macro Changes - Macros have undergone a major overhaul. You will have to update most of your command definitions to match the new macros. Most macros are now available as environment variables. Also, "on-demand" host and service macros have been added. See the documentation on macros for more information.
- Hostgroup Changes
- Hostgroup escalations removed - Hostgroup escalations have been removed. Their functionality can be duplicated by using the hostgroup_name directive in host escalation definitions.
- Member directive changes - Hostgroup definitions can now contain multiple members directives, which should make editing the config files easier when you have a lot of member hosts. Alternatively, you may use the hostgroups directive in host definitions to specify what hostgroup(s) a particular host is a member of.
- Contact group changes - The contact_groups directive has been moved from hostgroup definitions to host definitions. This was done in order to maintain consistency with the way service contacts are specified. Make sure to update your config files!
- Authorization changes - Authorization for access to hostgroups in the CGIs has been changed. You must now be authorized for all hosts that are members of the hostgroup in order to be authorized for the hostgroup.
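The new hostgroup authorization rule boils down to an "all members" test. Here is a minimal Python sketch of that rule. The function name and data shapes are hypothetical and are not taken from the Nagios CGI source; it only illustrates the logic described above.

```python
# Hypothetical sketch of the hostgroup authorization rule.
# Names and data shapes are illustrative, not the actual CGI code.

def authorized_for_hostgroup(authorized_hosts, hostgroup_members):
    """Return True only if the user is authorized for every member host."""
    # Note: all() is True for an empty member list; how an empty
    # hostgroup should be treated is not covered by this sketch.
    return all(host in authorized_hosts for host in hostgroup_members)


if __name__ == "__main__":
    members = ["web1", "web2", "db1"]
    print(authorized_for_hostgroup({"web1", "web2", "db1"}, members))  # True
    print(authorized_for_hostgroup({"web1", "web2"}, members))         # False
```

In other words, being a contact for only some of the member hosts is no longer enough to see the hostgroup in the CGIs.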
- Host Changes
- Host freshness checking - Freshness checking has been added for host checks. This is controlled by the check_host_freshness option, along with the check_freshness directive in host definitions.
- OCHP Command - Host checks can now be obsessed over, just as services can be. The OCHP command is run for all hosts that have the obsess_over_host directive enabled in their host definition.
- Host Check Changes
- Regularly scheduled checks - You can now schedule regular checks of hosts by using the check_interval directive in host definitions. NOTE: Listen up! You should use regularly scheduled host checks rather sparingly. They are not necessary for normal operation (on-demand checks are already performed when necessary) and can negatively affect performance if used improperly. You've been warned.
- Passive host checks - Passive host checks are now supported if you've enabled them with the accept_passive_host_checks option in the main config file and the passive_checks_enabled directive in the host definition. Passive host checks can make setting up redundant or distributed monitoring environments easier. NOTE: There are some problems with passive host checks that you should be aware of. Read the documentation on passive checks for more information.
- Retention Changes
- Retention of scheduling information - Host and service check scheduling information (next check times) can now be retained across program restarts using the use_retained_scheduling_info directive.
- Smarter retention - Values of various host and service directives that can be retained across program restarts are now only retained if they are changed during runtime by an external command. This should make things less confusing for people who modify host and service directive values in their config files and then restart Nagios, expecting to see some changes.
- More stuff retained - More information is now retained across program restarts, including flap detection history. Hoorah!
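The "smarter retention" rule can be pictured as a simple choice made for each directive at restart. The Python sketch below is only an illustration under stated assumptions: the function, the dictionaries and the idea of a set of runtime-modified directive names are hypothetical, not the daemon's actual data structures.

```python
# Hypothetical sketch of the "smarter retention" rule.
# config:   directive values read from the object config files
# retained: directive values saved in the retention file
# modified: names of directives changed at runtime by an external command

def values_after_restart(config, retained, modified):
    """Use a retained value only if that directive was changed at runtime."""
    result = {}
    for name, config_value in config.items():
        if name in modified and name in retained:
            result[name] = retained[name]   # runtime change survives the restart
        else:
            result[name] = config_value     # config file edit takes effect
    return result


if __name__ == "__main__":
    config = {"notifications_enabled": 0, "max_check_attempts": 3}
    retained = {"notifications_enabled": 1, "max_check_attempts": 5}
    modified = {"max_check_attempts"}
    print(values_after_restart(config, retained, modified))
    # {'notifications_enabled': 0, 'max_check_attempts': 5}
```

In this example the edit to notifications_enabled in the config file takes effect after the restart, while the runtime change to max_check_attempts is kept.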
- Extended Info Changes
- New location - Extended host info and service info definitions are now stored in object config files along with host definitions, etc. As a result, extended info definitions are now parsed and validated by the Nagios daemon before startup.
- New directives - Extended host info and service info definitions now have two new directives: notes and action_url.
- Embedded Perl Changes
- p1.pl location - You can now specify the location of the embedded Perl "helper" file (p1.pl) using the p1_file directive.
- Notification Changes
- Flapping notifications - Notifications are now sent out when flapping starts and stops for hosts and services. This feature can be controlled using the f option in the notification_options for contacts, hosts and services.
- Better logic - Notification logic has been improved a bit. This should prevent recovery notifications from being sent out when no problem notification was sent out to begin with.
- Service notifications - Before service notifications are sent out, notification dependencies for the host are now checked. If host notifications are not deemed to be viable, notifications for the service will not be sent out either.
- Escalation options - Time period and state options have been added to host and service escalations. This gives you more control in determining when escalations can be used. See the documentation on escalations for more information.
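Two of the notification changes above (no recovery notification without a prior problem notification, and no service notification when host notifications are not viable) can be sketched as a small predicate. This is a deliberately simplified Python illustration with hypothetical parameter names. The real notification logic also considers time periods, notification options, downtime and more, none of which is modeled here.

```python
# Hypothetical, simplified sketch of two notification rules.
# Parameter names are illustrative, not taken from the Nagios source.

def should_send_service_notification(is_recovery,
                                     problem_notification_sent,
                                     host_notifications_viable):
    """Apply the two rules described above, and nothing else."""
    # Rule 1: if notifications for the host are not viable,
    # suppress the service notification as well.
    if not host_notifications_viable:
        return False
    # Rule 2: a recovery notification only makes sense if a problem
    # notification was sent out to begin with.
    if is_recovery and not problem_notification_sent:
        return False
    return True


if __name__ == "__main__":
    print(should_send_service_notification(True, False, True))   # False
    print(should_send_service_notification(True, True, True))    # True
    print(should_send_service_notification(False, False, False)) # False
```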
- Service Groups Added - Service groups have now been added. They allow you to group services together for display purposes in the CGIs and can be referenced in service dependency and service escalation definitions to make configuration a bit easier.
- Triggered Downtime Added - Support for what's called "triggered" downtime has been added for host and service downtime. Triggered downtime allows you to define downtime that should start at the same time another downtime starts (very useful for scheduling downtime for child hosts when the parent host is scheduled for flexible downtime). See the documentation on downtime for more information.
- New Stats Utility - A new utility called 'nagiostats' is now included in the Nagios distribution. It's a command-line utility that allows you to view current statistics for a running Nagios process. It can also produce data compatible with MRTG, so you can graph statistical information. See the documentation on the nagiostats utility for more information.
- Adaptive Monitoring - You can now change certain attributes relating to host and service checks (check command, check interval, max check attempts, etc.) during runtime by submitting the appropriate external commands. This kind of adaptive monitoring will probably not be of much use to the majority of users out there, but it does provide a way of doing some neat stuff. See the documentation on adaptive monitoring for more information.
- Performance Data Changes - The methods for processing performance data have changed slightly. You can now process performance data by executing external commands and/or writing to files without recompiling Nagios. Read the documentation on performance data for more information.
- Native DB Support Dropped - Native support for storing various types of data (status, retention, comment, downtime, etc.) in MySQL and PostgreSQL has been dropped. Stop whining. I expect someone will develop an alternative using the new event broker sometime in the near future. Besides, DB support was not well implemented and dropping native DB support will make things easier for newbies to understand (one less thing to figure out).
- Event Broker API - An API has been created to allow individual developers to create addons that integrate with the core Nagios daemon. Documentation on the event broker API will be created as the 2.x code matures and will be available on the Nagios website.
- Misc Changes
- All commands can contain arguments - All command types (host checks, notifications, performance data processors, event handlers, etc.) can contain arguments (separated from the command name by ! characters). Arguments are substituted in the command line using $ARGx$ macros.
- Config directory recursion - Nagios now recursively processes all config files found in subdirectories of the directories specified by the cfg_dir directive.
- Old config file support dropped - Support for older (non-template) style object and extended info config files has been dropped.
- Faster searches - Objects are now stored in a chained hash in order to speed searches. This should greatly improve the performance of the CGIs.
- Worker threads - A few worker threads have been added in order to artificially buffer data for the external command file and the internal pipe used to process service check results. This should substantially increase performance in larger setups.
- Logging changes - Initial host and service states are now logged a bit differently. Also, the initial states of all hosts and services are logged immediately after all log rotations. This should help with all those "undetermined time" problems in the availability and trends CGIs.
- Cached object config file - An object cache file is now created by Nagios at startup. It should help speed up the CGIs a bit and allow you to edit your object config files while Nagios is running without affecting the CGI output.
- Initial check limits - You can now specify timeframes in which the initial checks of all hosts and services should be performed after Nagios starts. These timeframes are controlled by the max_host_check_spread and max_service_check_spread variables.
- "Sticky" acknowledgements - You can now designate host and service acknowledgements as being "sticky" or not. Sticky acknowledgements suppress notifications until a host or service fully recovers to an UP or OK state. Non-sticky acknowledgements only suppress notifications until a host or service changes state.
- Changes in checking clusters - The way you monitor service and host "clusters" has now changed and is more reliable than before. This is due to the incorporation of on-demand macros and a new plugin (check_cluster2). See the documentation on checking clusters for more information.
- Regular expression matching - Regular expression matching of various object directives can be enabled using the use_regexp_matching and use_true_regexp_matching variables. Information on how and where regular expression matching can be used can be found in the template tips and tricks documentation.
- Service pseudo-states - Support for some redundant service pseudo-states has been removed from the status CGI. This will affect any hardcoded URLs which use the servicestatustypes=X parameter for the CGI. Check include/statusdata.h for the new list of service states that you can use.
- Freshness check changes - The freshness check logic has been changed slightly. Freshness checks will not occur if the current time is not valid for the host or service check_timeperiod. Also, freshness checks will no longer occur if both the host or service check_interval and freshness_threshold variables are set to zero (0).
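To make the argument handling from the "All commands can contain arguments" item above a bit more concrete, here is a minimal Python sketch of how a command reference such as check_ping!100.0,20%!500.0,60% could be split on ! characters and substituted into a command line containing $ARG1$, $ARG2$, and so on. The function name is hypothetical and the sketch is not the daemon's implementation: it ignores all other macros and any escaping rules, and the address used in the example is a placeholder.

```python
import re

# Hypothetical sketch of $ARGn$ substitution; not the Nagios implementation.

def expand_args(check_command, command_line):
    """Split 'name!arg1!arg2' and substitute $ARGn$ macros in command_line."""
    name, *args = check_command.split("!")

    def replace(match):
        index = int(match.group(1))
        # In this sketch, arguments that were not supplied expand to "".
        return args[index - 1] if 1 <= index <= len(args) else ""

    return name, re.sub(r"\$ARG(\d+)\$", replace, command_line)


if __name__ == "__main__":
    name, line = expand_args(
        "check_ping!100.0,20%!500.0,60%",
        "check_ping -H 192.0.2.10 -w $ARG1$ -c $ARG2$",
    )
    print(name)  # check_ping
    print(line)  # check_ping -H 192.0.2.10 -w 100.0,20% -c 500.0,60%
```

Because the same mechanism now applies to every command type, a notification command or event handler can be parameterized in the same way as a check command.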