How do notifications work?

About

The OP5 Monitor user manual describes the basics of notifications. This article takes a closer look at how they really work.

When do notifications occur and who gets notified?

The decision to send out notifications is made in the service check and host check logic. Host and service notifications occur in the following instances:

  • When a hard state change occurs. More information on state types and hard state changes can be found here.
  • When a host or service remains in a hard non-OK state and the time specified by the configuration setting notification_interval in the host or service configuration has passed since the last notification was sent out (for that specified host or service).
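The two conditions above can be sketched in Python. This is only an illustration, not OP5 Monitor's actual code: the ObjectState class and its fields are hypothetical, the interval is assumed to be in minutes, and treating an interval of 0 as "never re-notify" is an assumption based on Naemon's documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional

OK_STATES = {"OK", "UP"}  # service OK and host UP


@dataclass
class ObjectState:
    """Hypothetical snapshot of a host or service (not an OP5 Monitor type)."""
    state: str                          # e.g. "OK", "CRITICAL", "UP", "DOWN"
    state_type: str                     # "SOFT" or "HARD"
    last_hard_state: str                # hard state before the latest check
    last_notification: Optional[float]  # epoch seconds, None if never notified
    notification_interval: float        # assumed to be in minutes


def should_notify(obj: ObjectState, now: float) -> bool:
    """Return True if one of the two conditions described above holds."""
    if obj.state_type != "HARD":
        return False
    # Condition 1: a hard state change has occurred.
    if obj.state != obj.last_hard_state:
        return True
    # Condition 2: still in a hard non-OK state and the interval has passed.
    if obj.state in OK_STATES or obj.last_notification is None:
        return False
    if obj.notification_interval <= 0:
        return False  # assumption: 0 means "do not re-notify"
    elapsed_minutes = (now - obj.last_notification) / 60
    return elapsed_minutes >= obj.notification_interval
```

Note that this only decides whether a notification is attempted. The notification filters described later can still stop it.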

Each host and service definition has a contact_groups option that specifies what contact groups receive notifications for that particular host or service. Contact groups can contain one or more individual contacts.

When OP5 Monitor sends out a host or service notification, it notifies each contact that is a member of any contact group specified in the contact_groups option of the host or service definition. Because a contact may be a member of more than one contact group, OP5 Monitor removes duplicate contacts before it does anything else.
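The duplicate removal can be pictured with a small Python sketch. The group and contact names are made up for illustration, and the function is not part of OP5 Monitor.

```python
from typing import Dict, List


def contacts_to_notify(contact_groups: List[str],
                       group_members: Dict[str, List[str]]) -> List[str]:
    """Collect every contact exactly once, keeping first-seen order."""
    seen = set()
    result = []
    for group in contact_groups:
        for contact in group_members.get(group, []):
            if contact not in seen:
                seen.add(contact)
                result.append(contact)
    return result


# Hypothetical contact groups: "bob" is a member of both.
members = {"linux-admins": ["alice", "bob"], "oncall": ["bob", "carol"]}
print(contacts_to_notify(["linux-admins", "oncall"], members))
# prints ['alice', 'bob', 'carol']
```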

By default, if a host has the option contact_groups configured in the host configuration, those contact groups receive notifications for the host and for the services on the host. There are exceptions to this default behaviour:

  • If a service on a host has the option contact_groups set to a different contact group than the one on the host, the contact group on the host receives notifications for the host and for all its other services, but not for the service that has its own contact group defined.
  • If a service on a hostgroup has the option contact_groups set, only that specific contact group will receive the notification.
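A minimal sketch of the rule above, assuming that a service's own contact_groups setting simply takes precedence over the one inherited from the host. The function name and the group names are hypothetical.

```python
from typing import List, Optional


def effective_contact_groups(host_groups: List[str],
                             service_groups: Optional[List[str]] = None) -> List[str]:
    """Return the contact groups notified for a service on a host."""
    if service_groups:
        # The service defines its own contact groups, so the host's are not used.
        return list(service_groups)
    # Otherwise the service falls back to the host's contact groups.
    return list(host_groups)


print(effective_contact_groups(["server-team"]))
# prints ['server-team']
print(effective_contact_groups(["server-team"], ["db-team"]))
# prints ['db-team']
```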

Click here to view additional information on implied inheritance.

Notification filters

Before a notification is sent, it has to pass a number of filters. Only when it has passed all of them does OP5 Monitor actually send it.

Filter: Description
Program-wide: Tells OP5 Monitor whether notifications are turned on or off on a program-wide basis. Program-wide notification settings are managed in Manage -> Process information.
Service and host filters:
  • Is the host or service in scheduled downtime?
  • Is the host or service in a flapping state?
  • Do the host or service notification options say that this type of notification is supposed to be sent?
  • Are we in the right time period for notifications at the moment?
  • Have we already sent a notification about this alert? Has the host or service remained in the same non-OK state that it was in when the last notification went out?
Contact filters:
  • Do the contact notification options say that this type of notification is supposed to be sent?
  • Are we in the right time period for notifications at the moment, according to the notification time period set on the contact?
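The filters can be thought of as a chain in which the first failing filter stops the notification. The Python sketch below is a simplification under that assumption. The context keys are hypothetical, and the real logic in Naemon has more special cases (for example, re-notification after notification_interval).

```python
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, bool]
Filter = Tuple[str, Callable[[Context], bool]]

# Ordered as in the list above: program-wide, then host/service, then contact.
FILTERS: List[Filter] = [
    ("program-wide notifications enabled", lambda c: c["notifications_enabled"]),
    ("not in scheduled downtime", lambda c: not c["in_downtime"]),
    ("not flapping", lambda c: not c["is_flapping"]),
    ("object options allow this type", lambda c: c["object_options_allow"]),
    ("inside object notification period", lambda c: c["in_object_period"]),
    ("not already notified for this state", lambda c: not c["already_notified"]),
    ("contact options allow this type", lambda c: c["contact_options_allow"]),
    ("inside contact notification period", lambda c: c["in_contact_period"]),
]


def first_blocking_filter(ctx: Context) -> Optional[str]:
    """Return the name of the first filter that fails, or None if all pass."""
    for name, passes in FILTERS:
        if not passes(ctx):
            return name
    return None
```

A return value of None means the notification would be sent. Any other value names the filter that stopped it, which can help when working out why an expected notification never arrived.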

Notification commands

The notification commands are defined in one of the following two files:

  • checkcommands.cfg
  • misccommands.cfg

The commands are divided into:

  • host notification commands: used for the host check, which by default is ping.
  • service notification commands: used for all other service checks running against the host.

The notification commands then call scripts in the same way as normal check commands do.
All default scripts shipped with OP5 Monitor are located in /opt/monitor/op5/notify.

From the host machine's page in the OP5 Monitor web console:

  • To disable host notifications, toggle the Notifications switch.
  • To disable service notifications, click the Options drop-down menu in the upper right, select Service Operations > Disable notifications for all services.

Notification macros

Many of the arguments sent to the notification commands are macros. Macros are a kind of variable that, in most cases, holds a program-wide value. You can read more about macros in the Naemon manual.

One of the most important macros used with notifications is $NOTIFICATIONTYPE$. This macro tells you what type of notification is about to be sent.

The $NOTIFICATIONTYPE$ macro can have one of the following values:

Notification type: Description
PROBLEM: A service or host has just entered (or is still in) a problem state.
RECOVERY: A service or host has recovered from a problem state.
ACKNOWLEDGEMENT: A service or host in a problem state has been acknowledged by a user.
FLAPPINGSTART: The host or service has entered a flapping state.
FLAPPINGSTOP: The host or service has left a flapping state.
FLAPPINGDISABLED: Flapping detection for the host or service has been disabled, so it has left the flapping state.
DOWNTIMESTART: The host or service has entered a scheduled downtime.
DOWNTIMESTOP: The host or service has left a scheduled downtime.
DOWNTIMECANCELLED: The scheduled downtime for a host or service has been cancelled.

The list of macros described in the Naemon manual is useful when you are working with new notification commands and scripts. For more information, see the list of macros here.
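A custom notification script typically branches on the value of $NOTIFICATIONTYPE$. The Python sketch below shows one way to do that. It assumes the value is passed to the script as a plain string, and the subject format is made up for illustration rather than taken from the scripts shipped with OP5 Monitor.

```python
from typing import Optional

# Short descriptions for the notification types listed above.
DESCRIPTIONS = {
    "PROBLEM": "entered or is still in a problem state",
    "RECOVERY": "recovered from a problem state",
    "ACKNOWLEDGEMENT": "problem acknowledged by a user",
    "FLAPPINGSTART": "entered a flapping state",
    "FLAPPINGSTOP": "left a flapping state",
    "FLAPPINGDISABLED": "flapping detection disabled",
    "DOWNTIMESTART": "entered scheduled downtime",
    "DOWNTIMESTOP": "left scheduled downtime",
    "DOWNTIMECANCELLED": "scheduled downtime cancelled",
}


def describe(notification_type: str) -> str:
    """Map a $NOTIFICATIONTYPE$ value to a short description."""
    return DESCRIPTIONS.get(notification_type.upper(), "unknown notification type")


def subject(notification_type: str, host: str, service: Optional[str] = None) -> str:
    """Build a hypothetical subject line for a host or service notification."""
    target = f"{service} on {host}" if service else host
    return f"[{notification_type}] {target}: {describe(notification_type)}"


print(subject("RECOVERY", "web01", "HTTP"))
# prints [RECOVERY] HTTP on web01: recovered from a problem state
```

Falling back to a generic description for unknown values keeps the script from failing on notification types that are not in the list above.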

URL in notification email

One part of the notification email is a link back to the OP5 Monitor server that sent the notification. The hostname in this link can be configured to something other than the sending server's hostname.

This can be useful if OP5 Monitor is configured for Load-Balanced Monitoring or Distributed Monitoring so notifications can be sent out from different sources, but one of the peers is the preferred system for configuration and for viewing data. This is the recommended way to use load balanced systems.

If this setting is configured, the link back to OP5 Monitor in the notification email can be set to always point to one of the load balanced or distributed systems.

To configure the URL back to OP5 Monitor, create the file /etc/op5/notify.yml and set the following option to a hostname other than the system's own hostname:

hostname: master1.op5.com

The below example is a notification sent from master2.op5.com in a load balanced configuration:

op5 Monitor

Service CUSTOM detected 2016-07-03 16:36:11.
'Certificate Expiration Check' on host 'master01' has passed the CRITICAL threshold.
https://master01.op5.com/monitor/index.php/status/service/master01

Additional info;
CRITICAL - File /opt/plugins/custom/certificate-expire is 21770079 seconds old

Host:    master01
Address: 172.27.0.12
Alias:   OP5 Monitor Server
Status:  UP
Comment: /etc/op5/notify.yml configured on master2

The link in the notification email will take you to master1 to view the problem in more detail.
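The effect of the hostname option can be sketched in Python as follows. This is not how the notify script is implemented. It assumes a single top-level "hostname:" line in the file, falls back to the local hostname when the option is missing, and borrows the URL path from the example email above.

```python
import socket
from typing import Optional


def read_hostname_option(text: str) -> Optional[str]:
    """Extract the value of a top-level 'hostname:' line, if present."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "hostname" and value.strip():
            return value.strip()
    return None


def link_host(config_text: Optional[str]) -> str:
    """Use the configured hostname, else the sending server's own name."""
    configured = read_hostname_option(config_text) if config_text else None
    return configured or socket.getfqdn()


def service_link(config_text: Optional[str], host_name: str) -> str:
    """Build a link shaped like the one in the example email."""
    return f"https://{link_host(config_text)}/monitor/index.php/status/service/{host_name}"


print(service_link("hostname: master1.op5.com\n", "master01"))
# prints https://master1.op5.com/monitor/index.php/status/service/master01
```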

Changing "from" in notification e-mail

By default, notifications are sent from the e-mail address "op5monitor" without any domain. The MTA adds the local domain name, which by default is "@localhost.localdomain".

To change the e-mail address that notifications are sent from, use the --from argument for the notification command, or reconfigure your MTA and hostname in OP5 Monitor to send the message from the correct domain.

To change the sender e-mail address in the notification command from op5monitor@localhost.localdomain to op5notification@mycompany.com:

  1. Navigate to the check_command configuration under Manage > Configuration > Check Commands.
  2. Enter the host notification command "host-notify" in the search box.
  3. Edit the command_line for the notification command and add "--from op5notification@mycompany.com", without the quotation marks.

Example:

command_name=host-notify

command_line=$USER3$/notify/notify --from op5notification@mycompany.com -c "$CONTACTNAME$" -h "$HOSTNAME$" -f "$NOTIFICATIONTYPE$" -m "$CONTACTEMAIL$" -p "$CONTACTPAGER$" "HOSTALIAS=$HOSTALIAS$" "HOSTADDRESS=$HOSTADDRESS$" "HOSTSTATE=$HOSTSTATE$" "HOSTSTATEID=$HOSTSTATEID$" "HOSTSTATETYPE=$HOSTSTATETYPE$" "HOSTATTEMPT=$HOSTATTEMPT$" "HOSTLATENCY=$HOSTLATENCY$" "HOSTEXECUTIONTIME=$HOSTEXECUTIONTIME$" "HOSTDURATION=$HOSTDURATION$" "HOSTDURATIONSEC=$HOSTDURATIONSEC$" "HOSTDOWNTIME=$HOSTDOWNTIME$" "HOSTPERCENTCHANGE=$HOSTPERCENTCHANGE$" "HOSTGROUPNAME=$HOSTGROUPNAME$" "HOSTGROUPALIAS=$HOSTGROUPALIAS$" "LASTHOSTCHECK=$LASTHOSTCHECK$" "LASTHOSTSTATECHANGE=$LASTHOSTSTATECHANGE$" "LASTHOSTUP=$LASTHOSTUP$" "LASTHOSTDOWN=$LASTHOSTDOWN$" "LASTHOSTUNREACHABLE=$LASTHOSTUNREACHABLE$" "HOSTOUTPUT=$HOSTOUTPUT$" "HOSTPERFDATA=$HOSTPERFDATA$" "HOSTACKAUTHOR=$HOSTACKAUTHOR$" "HOSTACKCOMMENT=$HOSTACKCOMMENT$" "NOTIFICATIONNUMBER=$NOTIFICATIONNUMBER$" "CONTACTALIAS=$CONTACTALIAS$" "DATETIME=$DATETIME$" "SHORTDATETIME=$SHORTDATETIME$" "DATE=$DATE$" "TIME=$TIME$" "TIMET=$TIMET$" "HOSTACTIONURL=$HOSTACTIONURL$" "HOSTNOTESURL=$HOSTNOTESURL$" "ADMINPAGER=$ADMINPAGER$" "ADMINEMAIL=$ADMINEMAIL$" "NOTIFICATIONCOMMENT=$NOTIFICATIONCOMMENT$"

To change this for the service notifications, you need to repeat the steps above on the command "service-notify" as well.
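Because the same change is applied to both commands, the edit itself can be illustrated with a small Python helper. This is only a sketch of the string manipulation. It does not touch the OP5 Monitor configuration, and the function name is hypothetical.

```python
def add_from_argument(command_line: str, sender: str) -> str:
    """Insert '--from <sender>' directly after the notify executable."""
    if " --from " in f" {command_line} ":
        return command_line  # already set, leave the command untouched
    executable, sep, rest = command_line.partition(" ")
    return f"{executable} --from {sender}{sep}{rest}"


original = '$USER3$/notify/notify -c "$CONTACTNAME$" -h "$HOSTNAME$"'
print(add_from_argument(original, "op5notification@mycompany.com"))
# prints $USER3$/notify/notify --from op5notification@mycompany.com -c "$CONTACTNAME$" -h "$HOSTNAME$"
```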

Additional Resources

Notifications in OP5 Monitor follow an extensive rule set that is inherited from the core daemon, Naemon. More documentation can be found in the notification documentation for Naemon.