State Types


Introduction

The current state of services and hosts is determined by two components: the status of the service or host (e.g. OK, WARNING, UP, DOWN) and the type of state it is in. There are two state types in Nagios - "soft" states and "hard" states. State types are a crucial part of Nagios' monitoring logic. They are used to determine when event handlers are executed and when notifications are sent out.

Service and Host Check Retries

In order to prevent false alarms, Nagios allows you to define how many times a service or host check will be retried before the service or host is considered to have a real problem. The maximum number of retries is controlled by the <max_check_attempts> option in the service and host definitions, respectively. The attempt that a service or host check is currently on determines what type of state it is in. There are a few exceptions to this in the service monitoring logic, but we'll ignore those for now. Let's take a look at the different service state types...
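The retry rule described above can be sketched in a few lines of Python. This is only an illustration, not Nagios code: the function names are hypothetical, the sketch assumes every check in the run returns a non-OK result, and it ignores the exceptions in the service monitoring logic mentioned above.

```python
def state_type_for_attempt(current_attempt: int, max_check_attempts: int) -> str:
    """Return "SOFT" while retries remain, "HARD" once they are used up.

    Hypothetical helper: assumes the check result is non-OK and that
    attempts are counted from 1.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return "HARD" if current_attempt >= max_check_attempts else "SOFT"


def simulate_failures(max_check_attempts: int, failures: int) -> list:
    """State type after each of `failures` consecutive non-OK results."""
    types = []
    for n in range(1, failures + 1):
        # The attempt counter stops growing once the maximum is reached.
        attempt = min(n, max_check_attempts)
        types.append(state_type_for_attempt(attempt, max_check_attempts))
    return types


if __name__ == "__main__":
    # With three attempts allowed, the first two failures are soft.
    print(simulate_failures(max_check_attempts=3, failures=4))
```

Note that with <max_check_attempts> set to 1 there are no retries in this sketch, so the first non-OK result is already treated as a hard state.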

Soft States

Soft states occur for services and hosts in the following situations...

Soft State Events

What happens when a service or host is in a soft error state or experiences a soft recovery?

As can be seen, the only important thing that really happens during a soft state is the execution of event handlers. Using event handlers can be particularly useful if you want to try to proactively fix a problem before it turns into a hard state. More information on event handlers can be found in the event handler documentation.

Hard States

Hard states occur for services in the following situations (hard host states are discussed later)...

Hard states occur for hosts in the following situations...

Hard State Changes

Before I discuss what happens when a host or service is in a hard state, you need to know about hard state changes. Hard state changes occur when a service or host...

Hard State Events

What happens when a service or host is in a hard error state or experiences a hard recovery? Well, that depends on whether or not a hard state change (as described above) has occurred.

If a hard state change has occurred and the service or host is in a non-OK state, the following things will occur...

If a hard state change has occurred and the service or host is in an OK state, the following things will occur...

If a hard state change has NOT occurred and the service or host is in a non-OK state, the following things will occur...

If a hard state change has NOT occurred and the service or host is in an OK state, nothing happens. This is because the service or host was also in an OK state the last time it was checked.
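The four cases above can be summarized as a small decision function. The Python sketch below is illustrative only: the function name and the returned labels are hypothetical, it treats OK and UP as the OK states, and it assumes a hard state change simply means that the current hard state differs from the previous hard state. It only classifies the case and does not model what Nagios does in each one.

```python
# Statuses treated as "OK" in this sketch (assumption: OK for services, UP for hosts).
OK_STATES = {"OK", "UP"}


def hard_state_case(previous_hard_state: str, current_hard_state: str) -> str:
    """Classify a hard check result into one of the four cases.

    Hypothetical helper: the returned labels are made up for illustration.
    """
    changed = previous_hard_state != current_hard_state
    is_ok = current_hard_state in OK_STATES

    if changed and not is_ok:
        return "hard state change, non-OK"
    if changed and is_ok:
        return "hard state change, OK"
    if not is_ok:
        return "no hard state change, non-OK"
    # Still OK, and OK the last time it was checked: nothing happens.
    return "no hard state change, OK (nothing happens)"


if __name__ == "__main__":
    for prev, cur in [("OK", "CRITICAL"), ("CRITICAL", "OK"),
                      ("CRITICAL", "CRITICAL"), ("OK", "OK")]:
        print(prev, "->", cur, ":", hard_state_case(prev, cur))
```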