Key concepts of OP5 Monitor

Monitoring objects

OP5 Monitor can monitor any physical or virtual entity in a network. Monitoring is based on the following conceptual objects.

Hosts and host groups

Hosts are the central object in the monitoring logic. Hosts have the following characteristics:

  • Hosts are any physical or virtual devices in a network, such as servers, workstations, routers, switches and printers.
  • Hosts have an IP address.
  • Hosts are normally associated with one or more services, either directly or inherited from host groups.
  • Hosts can have parent–child relationships with other hosts, often representing real-world network connections, which are evaluated by OP5 Monitor in its network reachability logic.

A host is normally placed in one or more host groups together with other hosts. For example, you can use host groups to:

  • Group hosts from the same geographic area.
  • Group hosts of the same type.
  • Group hosts for a particular service.
  • Place a customer's host in a host group of its own.

Services and service groups

A service is something that can be measured on a host, such as system load, disk usage, database connection times, or the number of logged-in users. Services have the following characteristics:

  • Services must be connected to hosts.
  • Services can perform checks using various means, such as TCP, agents, and SNMP.
  • Services use a check command to communicate with plugins that fetch data.

Service groups are most useful when you group services according to what they provide for your customers. For example, if you provide an email service to your customers, the email service needs the following components to be working as expected:

  • DNS
  • MTA
  • IMAP or POP server
  • Webmail
  • Storage

Each of the components has essential services that must be working, otherwise your customer will not be able to use your email service. You can place all the essential services in one service group so that you can easily spot alerts and notifications that put the email service at risk.
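As a rough illustration of why grouping helps, the following sketch reduces the states of all services in a group to a single worst state. It is not OP5 Monitor code: the function name, the sample service names, and the severity ordering are assumptions made for this example.

```python
from typing import Dict

# Assumed severity ordering for illustration only.
SEVERITY = {"OK": 0, "WARNING": 1, "UNKNOWN": 2, "CRITICAL": 3}


def group_status(service_states: Dict[str, str]) -> str:
    """Return the worst state among the services in a group.

    An empty group is reported as OK.
    """
    return max(service_states.values(), key=SEVERITY.__getitem__, default="OK")


# Hypothetical email service group with one failing component.
email_group = {"DNS": "OK", "MTA": "WARNING", "IMAP": "CRITICAL", "Webmail": "OK"}
print(group_status(email_group))
```

With a summary like this, a single problem in any essential component is visible at the level of the whole email service.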

Parenting

The hierarchy of the objects being monitored is important for OP5 Monitor when diagnosing problems in a network. When a parent host is down, for example, all of its children become unreachable. In the hierarchy in this diagram, host fw-01 is a parent to other hosts in the network:

If fw-01 goes down, then the other hosts become unreachable. You can configure OP5 Monitor not to send notifications for a host which is unreachable. For more information on how to configure parenting, see Configure a host or service in Manage hosts and services.
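The distinction between a host that is down and one that is merely unreachable can be sketched as follows. This is a simplified model, not the OP5 Monitor implementation: apart from fw-01, the host names are hypothetical, and the rule used (a non-responding host is down if at least one parent is up, otherwise unreachable) is the conventional Nagios-style reachability rule, assumed here for illustration.

```python
from typing import Dict, List, Set

# Hypothetical topology: each host maps to its list of parent hosts.
PARENTS: Dict[str, List[str]] = {
    "fw-01": [],
    "switch-01": ["fw-01"],
    "web-01": ["switch-01"],
    "db-01": ["switch-01"],
}


def classify(host: str, responding: Set[str],
             parents: Dict[str, List[str]] = PARENTS) -> str:
    """Return UP, DOWN, or UNREACHABLE for a host."""
    if host in responding:
        return "UP"
    host_parents = parents[host]
    # A host without parents is checked directly, so a failure means DOWN.
    if not host_parents:
        return "DOWN"
    # If any parent is UP, the path to the host is intact: the host itself is DOWN.
    if any(classify(p, responding, parents) == "UP" for p in host_parents):
        return "DOWN"
    # Every parent is DOWN or UNREACHABLE, so the host cannot be reached.
    return "UNREACHABLE"


# fw-01 fails and nothing behind it responds.
for name in PARENTS:
    print(name, classify(name, responding=set()))
```

In this model only fw-01 is reported as down; the hosts behind it are unreachable, which is the case where notifications can be suppressed.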

Checks, alerts, and notifications

OP5 Monitor runs host checks according to a predefined check interval, or on demand. By default, the check interval is five minutes.

OP5 Monitor generates an alert every time a service or host changes state, such as when an unreachable host comes back online, or when a service that was working as expected starts showing a warning.

Monitoring states

When OP5 Monitor encounters a problem state, it is classified as a soft problem until the number of checks reaches the configured threshold, max_check_attempts. When the threshold is reached, the problem is reclassified as hard, and OP5 Monitor sends out a notification about the problem. OP5 Monitor does not send notifications for soft problems.
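The soft-to-hard transition can be sketched as a small state tracker. This is an illustrative model only: the class name, the default of three attempts, and the decision not to model recovery or repeat notifications are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CheckState:
    max_check_attempts: int = 3  # assumed value for illustration
    attempt: int = 0
    state_type: str = "HARD"     # an object that is OK is in a hard OK state
    status: str = "OK"

    def record(self, status: str) -> bool:
        """Record a check result; return True if a problem notification is due."""
        if status == "OK":
            # Any OK result clears the problem and resets the attempt counter.
            self.status, self.state_type, self.attempt = "OK", "HARD", 0
            return False
        if self.state_type == "HARD" and self.status != "OK":
            # Already a hard problem; repeat notifications are not modelled.
            return False
        self.status = status
        self.attempt += 1
        if self.attempt >= self.max_check_attempts:
            self.state_type = "HARD"
            return True   # the threshold is reached: notify once
        self.state_type = "SOFT"
        return False      # soft problems never notify


state = CheckState()
for result in ["CRITICAL", "CRITICAL", "CRITICAL"]:
    notify = state.record(result)
    print(state.attempt, state.state_type, notify)
```

In this sketch the first two failed checks leave the object in a soft state, and only the third failed check makes the problem hard and triggers a notification.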

On-demand checks

OP5 Monitor performs on-demand checks in certain specific scenarios. For example:

  • When a service associated with the host changes state.
  • As needed as part of the host reachability logic.
  • As needed for predictive host dependency checks.

OP5 Monitor primarily runs on-demand checks when a service changes state. This is needed to ensure that the host goes into a hard critical state before its associated services when there is a full server outage. The reason for this is that if the host is in a hard critical state before any of its services, then OP5 Monitor only sends one host notification. Without the on-demand checks, a situation may occur where multiple services end up in a hard critical state before the host, thus generating a notification storm, one notification for each service.
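The effect of on-demand host checks on the number of notifications can be illustrated with a toy model. It deliberately ignores soft states and retries, the function and service names are hypothetical, and the final host notification in the case without on-demand checks is an assumption about the scheduled host check eventually catching up.

```python
from typing import List


def outage_notifications(services: List[str], on_demand_host_check: bool) -> List[str]:
    """Return the notifications sent during a full outage of one host."""
    notifications: List[str] = []
    host_hard_down = False
    for service in services:
        # The service changes state. With on-demand checks, this triggers an
        # immediate host check, which finds the host down first.
        if on_demand_host_check and not host_hard_down:
            host_hard_down = True
            notifications.append("HOST DOWN")
        # Service notifications are only sent while the host is not known to be down.
        if not host_hard_down:
            notifications.append(f"SERVICE CRITICAL: {service}")
    if not host_hard_down:
        # Assumption: the regular scheduled host check notices the outage later.
        notifications.append("HOST DOWN")
    return notifications


services = ["http", "smtp", "imap"]
print(outage_notifications(services, on_demand_host_check=True))
print(outage_notifications(services, on_demand_host_check=False))
```

With on-demand checks the model produces a single host notification; without them it produces one notification per service before the host problem is reported.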

Notifications

Users who are defined as contacts and linked to the relevant services and hosts receive notification messages according to the configuration you define in OP5 Monitor. A notification is always associated with an alert, but not all alerts result in a notification. OP5 Monitor can send notifications to contacts by email or SMS. With some additional configuration, it can also send notifications to other destinations such as databases and ticketing systems.

For more information on how OP5 Monitor sends out notifications for host and service problems, see Manage notifications.

Monitoring agents

Checks are performed by agents running scripts at regular intervals on the hosts and reporting back to an OP5 Monitor plugin. OP5 Monitor displays the results of the checks in one unified view, irrespective of the agent, host, and service type.

Note: You can also set up OP5 Monitor to run agentless checks or use the OP5 Monitor API.

You can use the following agents with OP5 Monitor:

  • SNMPv3 for Unix
  • NRPE for Unix
  • NSClient++ for Windows

SNMPv3

SNMPv3 offers secure authentication and encryption. The net-snmpd agent is available in the default package repository for most Unix operating systems. SNMPv3 also supports running existing NRPE plugins.

To use SNMPv3, you need to configure SNMP daemons on the hosts. For more information, see Set up a Unix SNMP agent in Additional server and software setup.

NRPE

NRPE (Nagios Remote Plugin Executor) is a Unix client for executing plugins on remote hosts. As part of Naemon's backward compatibility with Nagios plugins, OP5 Monitor works with NRPE.

NRPE is used in combination with a set of local plugins. You can use any of the plugins on the OP5 Monitor server.
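To give an idea of what a local plugin does, the following is a minimal sketch of a disk usage check written in the style of a Nagios-compatible plugin: it prints one line of status text and reports the result as an exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The function names, the output wording, and the 80% and 90% thresholds are assumptions for this example; it is not one of the plugins shipped with OP5 Monitor.

```python
import shutil

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_percent: float, warn: float = 80.0, crit: float = 90.0):
    """Map a usage percentage to an exit code and a status line."""
    if used_percent >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"DISK WARNING - {used_percent:.1f}% used"
    return OK, f"DISK OK - {used_percent:.1f}% used"


def main(path: str = "/") -> int:
    """Check one file system and return the plugin exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    if usage.total == 0:
        print("DISK UNKNOWN - file system reports zero size")
        return UNKNOWN
    code, message = evaluate(100.0 * usage.used / usage.total)
    print(message)
    return code
```

When run as a standalone script, the return value of main() would be passed to sys.exit() so that the calling agent receives the state as the process exit code.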

NRPE does not offer as much security as SNMPv3, but you can tighten security using SSL.

For more information about installing NRPE for monitoring Unix hosts, see Set up a Unix NRPE agent in Additional server and software setup.

NSClient++

NSClient++ integrates well with Windows Servers, copying its commands into the registry and simplifying authentication. OP5 Monitor handles NSClient++ communication through the check_nrpe plugin.

To use NSClient++ to monitor Windows servers, you need to install it on the hosts. For more information about installing NSClient++ for monitoring Windows hosts, see Set up a Windows agent in Additional server and software setup.

Agentless monitoring

You can use the following components for agentless monitoring:

  • WMI — for Windows servers.
  • check_by_ssh — for Linux servers.
  • SNMPv3 — for XEN and KVM servers, as well as SNMP-capable networking equipment.

Comparison of agent and agentless monitoring

The following comparison table summarises the different aspects to consider when choosing an agent or agentless approach to monitoring in OP5 Monitor.

                                    | NSClient++ | SNMP                  | NRPE       | SSH                    | WMI
------------------------------------+------------+-----------------------+------------+------------------------+----------------------
Unix                                | Partly     | Yes                   | Yes        | Yes                    | No
Windows                             | Yes        | No **                 | No         | No                     | Yes
Network equipment                   | No         | Yes                   | No         | No                     | No
Can run standard monitoring plugins | No         | Yes                   | Yes        | Yes                    | No
Authentication of client            | IP address | Username and password | IP address | Public or private keys | Username and password
Encryption                          | Good       | Good                  | Bad        | Good                   | Good
Performance                         | Good       | Good                  | Good       | Good *                 | Low
Central threshold management        | Yes        | Yes ****              | No ***     | Yes                    | Yes
Local threshold management          | Yes        | No                    | Yes        | Yes                    | No
Custom plugins or commands          | Yes        | Yes                   | Yes        | Yes                    | Yes

* Very fast using SSH multiplexing.
** SNMP on Windows is deprecated and not recommended by Microsoft.
*** Thresholds can be managed centrally, but not securely.
**** Using pass_persist.