Hosts

Overview

In Opsview Cloud, a Host is an autonomous computing device, such as a server, virtual server, collector server, database server, workstation, PC, network device, storage device, sensor, tablet, or mobile device.

Hosts are effectively logical end-points: if you wish to monitor an Oracle database on a Host, you add the Host. Similarly, if you wish to monitor a VMware vSphere server running 64 guests, you can add that server as a single Host, or you can add each guest individually as a Host, which allows monitoring and alerting on per-guest metrics such as CPU usage.

The creation, modification and deletion of Hosts is done via the Hosts page from the Configuration menu.

You can also choose to add Hosts via an Autodiscovery automated scan (see Autodiscovery) or an AutoMonitor scan (see AutoMonitor).

The Host settings section comprises a sortable, filterable grid view containing all of the Hosts within the Opsview Cloud system. Each column header can be filtered on relevant information; for example, a User can filter the list to show only Hosts that have a given Host Template applied, or only Hosts that are members of a given Host Group. When a filter is in place, the column header changes color:

Hosts window

In the top left, there are six buttons:

Hosts toolbar

Clear filters: This button will clear all filters applied via the column headers. Note that this will only be seen when a filter is applied.
Add new: This button loads a modal window which allows you to add a new Host.
Edit or Bulk Edit: This button loads a modal window which allows you to change the same settings for one or more Hosts, depending on how many are selected with check boxes.
Delete: This button allows you to delete one or more Hosts selected with check boxes.
Export: This button allows you to download the list of Hosts and their relevant data (Host Group, Host Templates, etc.) in the chosen format.
Refresh icon: Reload the data within this page.

In the bottom left, a set of controls enables you to move quickly between the paginated list when a large number of Hosts are in the system.

Hosts pagination

A drop-down menu allows you to configure the number of Hosts visible on the page. There are options for ‘5, 10, 25, 50, 100 and 250’ Hosts per page. There is also a string of text highlighting the limit of Hosts that can be added to this system, based on your licensing plan.

The Host list page by default will list all Hosts within Opsview Cloud that your role can see.

However, when one or more Hosts have been modified and are in a pending state, that is, when modifications have been made but Apply Changes hasn’t been performed yet, the Host list page will have a new section added at the top which displays all modified Hosts.

Hosts in pending state

Specific hosts may be marked as special by the addition of an icon, which means some actions may be limited on these hosts. These icons are as follows:

Icon	Server Type
	Orchestrator
	Collector
	Flow Source

The Apply Changes button will open the Apply Changes modal window where you can submit your modifications and effectively put them into production.

Working with Hosts

Single Hosts

You can edit a single Host by either double clicking on the row or clicking on the contextual menu button in the row and selecting the appropriate action.

Multiple Hosts

You can also select a number of hosts to work with at the same time via the checkboxes; this is known as Bulk Edit. The following behavior is possible:

Click on a checkbox to select.
Click on a selected checkbox to deselect.
Click on an unselected checkbox to add to selection.
Shift-click to select a range from the last focused row to the current row, to add to selection.
Control-click to clear selection and choose this row only.

To use the keyboard, use the mouse to select a row in the grid then:

Press up/down to highlight rows.
Press space to select a new row.
Press space to deselect a row if it is already selected.
Press shift-up or shift-down to select a range of rows to add to selection.
Press control-space to clear selection and choose this row only.

Note
The All Hosts grid (bottom half of the page) and the Modified Hosts grid (top half of the page) will keep in sync with the selection of hosts.

As selection changes, the total number of hosts will be displayed in the grid header:

Number of hosts selected

The buttons in the top right will update based on the selection:

No selection:

No selection

Single selection:

Single selection

Multiple selection:

Multiple selection

The Select All checkbox in the header can be used to select or deselect all hosts matching the current filtering. There will be a slight delay for the checkbox header to update as a backend call will be needed to get a list of all appropriate hosts.

Adding or Editing a Single Host

When a Host is edited or the Add New button is clicked, a modal window will appear. A similar pre-filled window will be shown if you click on the host contextual menu and select Edit.

The modal window is split into five tabs (the additional NetAudit tab is displayed if you have a subscription for the Network Analyzer feature):

Host
Notifications
Service Checks
SNMP
Variables
NetAudit

Host tab

New Host tab

The Host tab is the main configuration window when adding or configuring a Host. It is split into two sections, Basic and Advanced. The Basic section contains the main settings a Host needs to have configured. Items denoted with a red asterisk (*) are mandatory fields.

In the Basic section, the following options need to be configured:

Primary Hostname/IP

This field holds the network address of the Host: either an IP address or a domain name resolvable by Opsview. The network address entered here is used by the system macro $HOSTADDRESS$, which is used throughout the entire Opsview system.

Host Name

This is the user-friendly name of the Host and is displayed in Opsview Cloud. If your network address is 192.168.123.123, you may want to give the Host an alternative name, for example Router. This field must be unique in the system.

Host Group

A Host Group is a container for one or more Hosts. In this drop-down list, all available Host Groups will be listed. Host Groups containing other Host Groups won’t be listed - Host Groups can only contain either all Hosts, or all Host Groups but not a mixture. See Host Group documentation for more details.

Monitored By

This option is only visible when you have Collector Clusters set up for Distributed Monitoring and defines which Cluster will do the actual monitoring of the Host. This provides the ability to distribute the monitoring load across more servers.

The list of monitoring clusters will only show visible clusters (see Role Configuration), listed in the following order:

Master Monitoring Server (the central orchestrator)
The remaining clusters, in alphabetical order

This field may be entirely hidden if:

this host is the central orchestrator host.
this host is used as a collector.
this host is used as a flow collector.
there are no other monitoring clusters.
the user does not have permission to see this monitoring cluster.

Host Templates

Host Templates are a group of Service Checks that can be applied to a given Host. Host Templates provide the ability to monitor certain technologies; for example, if the Host you are adding is an Oracle database, apply the ‘Database - Oracle RDBMS’ Host template by selecting it in the left-hand column and clicking the right arrow:

Host Templates

In the Advanced section there is a range of optional settings that can be configured for the Host:

Advanced section

Other Hostnames/IPs

This is a comma-separated list of other network addresses relating to the Host. For example, if the Host has two IP addresses, you may enter the first IP address in the Basic > Primary Hostname/IP field, and the second IP address in this field.

The Primary Hostname/IP is addressed using the $HOSTADDRESS$ macro, whereas the comma-separated values entered in this field are addressed as $ADDRESS1$, $ADDRESS2$ and so forth. To use one of these values in a Service Check instead of the Primary Hostname/IP, replace $HOSTADDRESS$ with, for example, $ADDRESS1$.

If other addresses are not specified in this field yet the $ADDRESSx$ macro is used, Opsview will default the value to the Primary Hostname/IP instead. This field is also used for relating these IP addresses to this Host for the purpose of SNMP trap processing.
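The substitution and fallback rules above can be illustrated with a short Python sketch. This is not Opsview's implementation: the function name `resolve_address_macros` and its parameters are hypothetical, and the code only models the behavior stated in this section.

```python
import re


def resolve_address_macros(args: str, primary: str, other_addresses: str = "") -> str:
    """Expand $HOSTADDRESS$ and $ADDRESSn$ in a Service Check argument string.

    `other_addresses` mirrors the comma-separated "Other Hostnames/IPs" field.
    """
    # Split the comma-separated field and ignore empty entries.
    others = [a.strip() for a in other_addresses.split(",") if a.strip()]

    def replace_numbered(match: "re.Match") -> str:
        index = int(match.group(1))  # $ADDRESS1$ refers to the first extra address
        if 1 <= index <= len(others):
            return others[index - 1]
        # No such entry: fall back to the Primary Hostname/IP.
        return primary

    expanded = args.replace("$HOSTADDRESS$", primary)
    return re.sub(r"\$ADDRESS(\d+)\$", replace_numbered, expanded)


print(resolve_address_macros("-H $ADDRESS1$", "10.0.0.1", "10.0.0.2, 10.0.0.3"))
print(resolve_address_macros("-H $ADDRESS3$", "10.0.0.1", "10.0.0.2"))
```

The first call uses the first extra address; the second falls back to the primary address because only one extra address is configured.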

Description

This is a free text entry field. It is purely for describing the Host and is not used elsewhere within Opsview Cloud.

Host Check Command

The Host Check Command is used to determine the Host status which can be one of three statuses:

UP - Responding to the Host Check Command
DOWN - No response
UNREACHABLE - Host has a parent relationship configured and the parent is in a DOWN state

By default, the Host Check Command is ping, therefore if ICMP traffic is blocked between the Host and Opsview Cloud you should change the Host Check Command to one that is allowed to traverse the network, inbound to the Host.

Icon

The icon is used within the Navigator in the Monitoring menu to identify the Host, along with being visible in the Host list page. Opsview ships with a series of default icons that can be chosen via the drop-down box.

To upload your own icon use the hosticon_admin script via the command line. This script is located within /opt/opsview/coreutils/bin/. As the root user, run the command:

    hosticon_admin add "LOGO - Hosticon" /path/to/Hosticon.png

where "LOGO - Hosticon" is the name you wish the icon to have within the drop-down menu, and /path/to/Hosticon.png is the path to the image you wish to convert into an icon. To delete a Host icon, run the following command as the opsview user:

    hosticon_admin remove "LOGO - Hosticon"

To list all of the icons within Opsview Cloud run the command:

    hosticon_admin list

You may need to install the package imagemagick (Debian/Ubuntu) or ImageMagick (RHEL/OL) to use this functionality.

Check Period

The Check Period is chosen from the list of Time Periods available within Opsview. A Time Period is a weekly schedule; for example, a user can create a Time Period called working hours that covers Monday to Friday, 9:00 am to 5:00 pm.

When this time period is applied to a Host, this Host is only monitored during the specified times of the Time Period.

Check Interval

Working in combination with the check period and the Host Check Command, the Check Interval is how regularly the Host is checked using the Host Check Command during the specified time period. If set to 5m (default) and all settings are left to default, the Host will be pinged once every five minutes when the time period is valid (the Host is being monitored).

This field allows for hours (h), minutes (m) or seconds (s), which means 24h refers to once a day, 30s refers to every 30 seconds. The field can also be set to 0 which means the Host is always considered UP unless a check has been manually requested (Recheck is run against the Host via its contextual menu). This field must be greater than or equal to 0.
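As an illustration of the accepted formats, the following Python sketch converts a Check Interval value into seconds. It is a minimal sketch under stated assumptions: the function name `interval_to_seconds` is hypothetical, and it only accepts a single number followed by h, m or s, or a bare 0. Whether the product accepts other forms is not covered here.

```python
import re

# Seconds per supported unit suffix.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def interval_to_seconds(value: str) -> int:
    """Convert values such as '5m', '24h', '30s' or '0' into seconds."""
    text = value.strip().lower()
    if text == "0":
        # 0 means no scheduled checks: the Host is considered UP
        # unless a check is requested manually.
        return 0
    match = re.fullmatch(r"(\d+)([hms])", text)
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


for example in ("5m", "24h", "30s", "0"):
    print(example, "->", interval_to_seconds(example), "seconds")
```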

Max Check Attempts

This field determines the number of times a Host Check Command has to fail for the Host to change into a hard state. In Opsview there is the concept of Soft and Hard states.

When a Host check fails and the Host changes into the DOWN state, it is considered a soft state. Once the Host Check Command has failed the number of times specified in this field, the Host is considered to be in a hard state, meaning the failure is not a temporary blip. You can use hard states so that users are only notified when a Host is truly down. The interval between these attempts is not the Check Interval but the Retry Interval. This field must be greater than or equal to 1.

Retry Interval

Separate from the Check Interval, the Retry Interval is only used when a Host goes into the DOWN or UNREACHABLE state. For a Host to go from a soft state to a hard state, the Host Check Command must fail $X times, where $X is the value of Max Check Attempts; the Retry Interval sets the time between those attempts. For example, if the Retry Interval is 1m and Max Check Attempts is set to 3, the Host Check Command will run once a minute for two further minutes (the first failure is what triggers the retries), after which, if the Host is still DOWN, it will change from a soft DOWN to a hard DOWN. This field must be greater than 0 and less than the Check Interval.
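The interaction between Max Check Attempts and the Retry Interval can be worked through in a few lines of Python. This is a sketch of the arithmetic described above, not product code; the function names are hypothetical.

```python
def state_type(failed_attempts: int, max_check_attempts: int) -> str:
    """Classify a failing Host as SOFT or HARD after a number of failed checks."""
    if max_check_attempts < 1:
        raise ValueError("Max Check Attempts must be at least 1")
    return "HARD" if failed_attempts >= max_check_attempts else "SOFT"


def seconds_to_hard_state(max_check_attempts: int, retry_interval_s: int) -> int:
    """Time from the first failed check until the hard state is reached.

    The first failure counts as attempt 1; the remaining attempts are
    spaced by the Retry Interval.
    """
    if max_check_attempts < 1 or retry_interval_s <= 0:
        raise ValueError("invalid Max Check Attempts or Retry Interval")
    return (max_check_attempts - 1) * retry_interval_s


# Retry Interval 1m, Max Check Attempts 3: two further minutes of retries.
print(seconds_to_hard_state(3, 60))  # prints 120
print(state_type(2, 3), state_type(3, 3))
```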

Hashtags

Covered in greater detail within the Hashtags section, this drop-down is a list of all Hashtags within Opsview Cloud. By selecting one or more Hashtags from this drop-down menu you are ‘tagging’ the Host with the Hashtag. This means that when you tag a Host with ‘linux-systems’, anyone whose role allows them to view Hosts tagged with ‘linux-systems’ will be able to view this Host. Similar logic applies for Notifications.

Globally Applied Hashtags

When a hashtag is applied from the Configuration > Hashtags menu and not via the Host, it will appear in this list. To remove the hashtag from the Host simply edit the relevant Hashtag via Configuration > Hashtags and click on the hashtag in question.

Event Handler

Covered in greater detail in the Event Handler section of the User Guide, Event Handlers are scripts that can be triggered when a Host goes into a DOWN or UNREACHABLE state (soft/hard, depending on the event handler script). The script can do anything you like, but a common usage includes restarting a service or server (virtual machine, for example) via an API.

Always execute

If this is ticked, then every result received for this Host check will cause the event handler to be executed. If this is unticked, the event handler will be executed only when a state change occurs.

Parents

This relationship is used to calculate whether a Host is DOWN or UNREACHABLE, that is, whether the Host is really down or whether a failed device between Opsview and the Host is hiding its true state. Use this relationship to minimize Notifications, as you can disable Notifications for UNREACHABLE Hosts.

For example, if you have a switch as the parent of 10 Hosts and the switch is marked as DOWN, then when the 10 Hosts are checked and considered DOWN, they will be marked as UNREACHABLE instead and you will only get one Notification for the switch instead of 10 Host Notifications. There may be a delay in this eventual condition as results will be coming in at different times. You can select multiple parents, if you have a failover capability.
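The switch example can be modelled with a small Python sketch. It is illustrative only: the function `host_status` is hypothetical, and the rule for multiple parents (a Host is UNREACHABLE only when every parent is not UP) is an assumption based on the failover remark above, not a documented algorithm.

```python
from typing import Sequence


def host_status(check_succeeded: bool, parent_statuses: Sequence[str] = ()) -> str:
    """Return UP, DOWN or UNREACHABLE for a Host given its parents' statuses."""
    if check_succeeded:
        return "UP"
    # ASSUMPTION: with several parents, one healthy parent is enough
    # for the Host itself to be blamed (DOWN rather than UNREACHABLE).
    if parent_statuses and all(status != "UP" for status in parent_statuses):
        return "UNREACHABLE"
    return "DOWN"


# A switch with no parents fails its check, as do the 10 Hosts behind it.
switch = host_status(check_succeeded=False)
children = [host_status(False, [switch]) for _ in range(10)]

# Only DOWN Hosts notify if Notifications for UNREACHABLE are disabled.
notifying = [status for status in [switch] + children if status == "DOWN"]
print(len(notifying))  # prints 1
```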

Note
Despite the UI/API allowing it, you should not set parent or child relationships between the collectors themselves in any monitoring cluster, as collectors do not have a dependency between each other and are considered equals.

Notifications tab

Notifications tab

The Notifications tab contains various settings relating to when and why Notifications are sent for this particular Host.

Notify On

This section determines which states the Host should notify on; for example, only on DOWN or UNREACHABLE. If a Host does not notify on any states, then the services on that Host will also not send any notifications.

Notification Period

This field uses the Time Periods already defined within Opsview, and determines when notifications are allowed to be sent to users.

Flap Detection

This checkbox toggles flap detection on and off. Flap detection is used by notifications and other areas of Opsview to avoid sending alerts while a Host is flapping. A Host is considered to be flapping if it changes state between OK and non-OK more than seven times in the last 20 checks.

Notifications for Flapping starting and stopping will not be sent when notifications are suppressed (such as when acknowledged or in downtime) or when flapping started during downtime but continues after downtime ends.
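The flapping rule stated in this section (more than seven state changes in the last 20 checks) can be sketched in Python as follows. The class name `FlapDetector` is hypothetical and the sketch counts plain OK/non-OK transitions; it does not claim to reproduce how Opsview weights or stores check history.

```python
from collections import deque


class FlapDetector:
    """Track recent check results and report whether the Host is flapping."""

    def __init__(self, window: int = 20, max_changes: int = 7) -> None:
        self.max_changes = max_changes
        # Only the most recent `window` results are kept.
        self.history = deque(maxlen=window)

    def record(self, ok: bool) -> bool:
        """Store one check result and return the current flapping verdict."""
        self.history.append(ok)
        return self.is_flapping()

    def is_flapping(self) -> bool:
        states = list(self.history)
        # A change is any pair of consecutive results that differ.
        changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
        return changes > self.max_changes


detector = FlapDetector()
for i in range(20):
    flapping = detector.record(ok=(i % 2 == 0))  # alternate OK / non-OK
print(flapping)  # prints True
```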

Service Checks tab

Service Checks tab

The Service Checks tab is designed to give you the ability to:

Add Service Checks to a Host.
Modify Service Checks on a per-Host basis, for example, use different arguments just for this Host.
Omit Service Checks that have been inherited via a Host Template; this can mean “we don’t want this service check on this Host but we want the rest from the Host Template”.
Test Service Checks against a Host before submitting the change and performing Apply Changes.

The left-hand section of the Service Checks tab displays the Service Check tree. Service Checks reside within Service Groups; for example, the checks visible above, such as CPU statistics, live within the Service Group OS - Base Unix Agent. The tree is built by using hyphens as the separator, so OS - Base Unix Agent becomes OS at the top level and Base Unix Agent at the second level.
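To make the tree construction concrete, here is a minimal Python sketch that turns Service Group names into nested levels. The function names are hypothetical, and splitting on a hyphen surrounded by spaces (" - ") is an assumption; the exact separator handling in the product may differ.

```python
def group_path(service_group: str) -> list:
    """Split a Service Group name into its tree levels."""
    return [part.strip() for part in service_group.split(" - ")]


def build_tree(checks_by_group: dict) -> dict:
    """Build a nested dict of levels; leaf levels hold a 'checks' list."""
    tree: dict = {}
    for group, checks in checks_by_group.items():
        node = tree
        for level in group_path(group):
            node = node.setdefault(level, {})
        node.setdefault("checks", []).extend(checks)
    return tree


tree = build_tree({"OS - Base Unix Agent": ["CPU statistics"]})
print(tree)
```

Here `OS` is the top-level node and `Base Unix Agent` sits one level below it, holding the check.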

On the far right of each Service Check row in the tree (the items with the check boxes), one of two icons may be displayed. These icons show whether the Service Check is inherited from a Host Template, or whether it was originally inherited from a Host Template and has since been ‘omitted’, that is, ‘don’t apply this Service Check to this Host’. The ‘omit’ option is toggled with the ‘Remove Service Check from Host Template’ option within the Service Check, which only becomes visible when the Service Check is checked in the left-hand section.

Service Checks config

If a Service Check is inherited from a Host Template yet isn’t ‘checked’ in the left-hand section, the ‘Exceptions’ section will not be editable and the ‘Remove Service Check from Host Template’ toggle button will not be visible. To edit these items, check the box next to the Service Check; this tells Opsview Cloud to look at this section for information on this Service Check instead of the Host Template.

The right-hand section of the Service Checks tab is populated with information and options relevant to the selected Service Check and is commonly referred to as the ‘Service Check information panel’.

When no Service Check is selected, this section will contain a message informing you to select a Service Check first.

Service Checks tab

When a Service Check is selected and checked in the left-hand tree panel, the Service Check information panel will show:

Service Check name.
Service Check description.

The ‘Service Check information panel’ also contains:

Plugin and Macro Help buttons.
Test Service Check drawer.
Variables drawer.
Exceptions drawer.
Timed Exceptions drawer.
Event handler drawer.

The Test Service Check drawer is designed to provide the ability to test that a Service Check will perform as expected against the relevant Host. This saves time by reducing the cycle of submitting then applying the changes to Opsview to check the result.

Example ‘Test Service Check’ output

The Plugin Help button will load a new modal window displaying the ‘Help file’ for the plugin:

Plugin Help

The Macro Help button will load a new modal window displaying all of the host specific macros:

Macro Help

The ‘Test Service Check’ accordion allows you to test that the Service Check definitions would run correctly on this host:

‘Test Service Check’ accordion

Note
You cannot change the arguments used, so it will be testing the arguments defined for this active Service Check.

Host variables

The Variables drawer contains all Variables that the Service Check may be using. Variables act like standard programming variables, in that you can configure -p %PORT% instead of -p 9200 in the Service Check’s arguments. The benefit is that by using a Variable instead of hard-coding the port, you can apply the Service Check to hundreds of Hosts and simply add the ‘%PORT%’ Variable to a Host’s variables to change the port for that Host.

If a Service Check requires a Variable in order to successfully work, then the Variable will be listed within the Variables drawer. In the example Service Check below we are applying a Service Check to monitor the number of Bytes received for a MySQL database. The syntax for this Service Check is:

    -H $HOSTADDRESS$ -u %MYSQLCREDENTIALS:1% -p %MYSQLCREDENTIALS:2% --metricname=Bytes_received

This means that the username field (-u) and the password field (-p) are taken from the %MYSQLCREDENTIALS% Variable. By default, the Variables drawer will be empty. It will only be populated with the required Variables once you have pressed the ‘Test’ button:

Variables drawer

Variables used
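The way a Variable reference such as %MYSQLCREDENTIALS:1% is filled in can be sketched in Python. This is an illustration under stated assumptions rather than Opsview's implementation: the function `expand_variables`, the dictionary layout, and the rule that %NAME% yields the Variable's value while %NAME:n% yields its n-th argument are all hypothetical modelling choices, and the credentials are made-up sample values.

```python
import re

# Matches %NAME% and %NAME:n%.
_VARIABLE = re.compile(r"%([A-Z][A-Z0-9_]*)(?::(\d+))?%")


def expand_variables(args: str, variables: dict) -> str:
    """Replace Variable references in a Service Check argument string."""

    def substitute(match: "re.Match") -> str:
        name, index = match.group(1), match.group(2)
        entry = variables[name]  # raises KeyError if the Variable is not set
        if index is None:
            return entry["value"]
        return entry["args"][int(index) - 1]

    return _VARIABLE.sub(substitute, args)


host_variables = {
    "MYSQLCREDENTIALS": {"value": "unused", "args": ["monitor", "s3cret"]},
    "PORT": {"value": "9200", "args": []},
}
command = "-H $HOSTADDRESS$ -u %MYSQLCREDENTIALS:1% -p %MYSQLCREDENTIALS:2% --metricname=Bytes_received"
print(expand_variables(command, host_variables))
print(expand_variables("-p %PORT%", host_variables))
```

The $HOSTADDRESS$ macro is left untouched by this sketch because it is a system macro rather than a Variable.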

If these Variables are not populated with global defaults (Configuration > Variables), then the Service Check will fail as there is no means to log in to the MySQL database in order to monitor it. If that is the case then you will need to click the ‘Add’ button next to the Variable, which will navigate to the ‘Variables’ tab and add a new Variable as below:

Variables tab

Here you will need to enter a value (not relevant to this Service Check so enter anything) and click ‘Save’. Once saved the ‘Host variable details’ panel will populate. Here you can now check both ‘Override username’ and ‘Override password’ and enter the correct login information for this database.

Credentials for variables

Once the correct information is added, navigate back to the Service Checks tab and click ‘Test’ again. If the credentials are correct, the Service Check should now succeed:

Test Service Check with correct credentials

You can now select Submit Changes and then perform Apply Changes, knowing that the Service Check will work when applied to production.

If you wish to change the actual plugin arguments themselves (for example, to add a warning/critical level (-w/-c) to the Service Check), then you can do so via the Exceptions drawer.

As covered earlier in this section, the Exceptions drawer will not be visible until the Service Check is checked in the left-hand tree pane. Once checked and the Exceptions drawer is opened, this will be displayed:

Exceptions drawer

Tick the checkbox to confirm you want to amend the default arguments.

For the MySQL Aborted Connections Service Check you may wish to amend the -c option from 30 to 35; use the Plugin Help modal for direction on how to modify these arguments. Press Test to then run the amended command:

Exceptions amended command output

The Timed Exception option works exactly the same, however, the defined arguments will not be ‘injected’ into the Service Check until the relevant time period begins.

The Event Handler accordion allows you to have a script execute when state changes occur for this Service Check on this Host. See the Event Handler documentation for more details.

Removing Host variables

To remove multiple host variables, check the box next to each variable you want to delete. Once you’ve made your selections, click Remove.

Remove button in Variables tab

Filtering Host variables

The Filter feature allows you to filter the list of variables by name or value.

Filter in Variables tab

SNMP Tab

SNMP tab

The SNMP tab is where you can configure SNMP credentials for a Host. For example, if you wish to use plugins or Service Checks which rely on SNMP then the relevant SNMP credentials will first need to be configured and tested within this section.

The tab is split into two sections:

Enable SNMP
Credentials

The Credentials section is visible only when Enable SNMP is enabled. Otherwise, this section remains hidden. Additionally, enabling SNMP makes the Interfaces tab visible, allowing you to query and configure the host interfaces for monitoring.

Credentials

The Credentials section is where you select the SNMP version used, along with the relevant authentication information. For SNMP v1 or v2c, only the port and community string need to be specified.

When you first open the Credentials section, the SNMP community string field (v2c in this example) displays the message SNMP community encrypted - click to reset. This message means that a secure, encrypted placeholder is in use until a valid community string is set.

SNMP community encrypted - click to reset button

To update the string, click the reset button as directed. Enter the SNMP community string, then select Test SNMP Connection to verify that the credentials have been entered correctly.

SNMP credentials

If the credentials are incorrect, the error message Cannot connect with SNMPv2c will appear.

SNMP credentials error

For SNMP v3, additional fields appear for configuration.

SNMP v3 credentials

You must specify all the fields for port, username, authentication protocol, authentication password, privacy protocol, and privacy password.

After entering the credentials, select Test SNMP Connection to verify that they have been entered correctly. Then select Submit Changes to save your configuration.

Note
To ensure security, once authentication data is entered in the UI, it cannot be retrieved. However, you can reset it at any time.

For a breakdown of the relevant credential information, refer to the following tables:

SNMP v1 and v2c

Field	Details
SNMP Port	This defines the port number to connect to the SNMP device. Default is 161.
SNMP Community	This defines the community string to connect to the SNMP device. This value will be encrypted in the Opsview Cloud database. After this value has been saved, it cannot be retrieved back in the user interface. If you want to change the value, click the Reset button to change it.

SNMP v3

Field	Details
SNMPv3 Username	This defines the SNMPv3 username to connect to the SNMP device.
SNMPv3 Authentication Protocol	This defines the SNMPv3 protocol used to authenticate the user when connecting to the SNMP device. The valid values are: md5, sha1 (SHA-1), sha224 (SHA-224), sha256 (SHA-256), sha384 (SHA-384), sha512 (SHA-512).
SNMPv3 Authentication Password	This defines the SNMPv3 password to connect to the SNMP device to authenticate the user. This value will be encrypted in the Opsview database. After this value has been saved, it cannot be retrieved back in the user interface. If you want to change the value, click the Reset button to change it.
SNMPv3 Privacy Protocol	This defines the SNMPv3 protocol used to encrypt traffic between Opsview and the SNMP device. The valid values are: des, aes, aes128, aes256, aes256c. The des, aes256, and aes256c options are only fully supported on some operating systems.
SNMPv3 Privacy Password	This defines the SNMPv3 password to encrypt traffic between Opsview and the SNMP device. If this is not set, then no attempt to encrypt traffic will take place. For devices using Net-SNMP, an empty privacy password will still allow connection to the device even if a privacy password is defined for a user. This value will be encrypted in the Opsview database. After this value has been saved, it cannot be retrieved back in the user interface. If you want to change the value, click the Reset button to change it.

Note

For backwards compatibility with argument strings, any occurrences of $$ will be replaced with $ when processed by the system to run checks. This includes text within SNMP community strings (v2c or v3).

However, this behavior will be removed in a future version. We recommend using single $, which will be processed as it is. This ensures future compatibility and avoids potential issues.

Interfaces tab

Enabling SNMP in the SNMP tab makes the Interfaces tab visible. This tab lists all available interfaces on the host. This list is gathered via SNMP, so correct credentials and a properly configured SNMP daemon are prerequisites.

Interfaces tab

Note

Please note that in order to monitor the interfaces of a Host, you must apply the ‘SNMP - MIB II’ Host template before performing an Apply Changes. This template comprises the Service Checks ‘Interface Poller’, ‘Interface’, ‘Discards’ and ‘Errors’.

When a host has a large number of interfaces (1000+), it may take a long time to fetch the interface data from the host. By default this service check is given 120 seconds to execute.

This time limit can be modified in the Executor configuration; contact ITRS Support for assistance.

To view the interfaces of a Host click on the ‘Query Host’ button which will populate the table with the available interfaces. There are a few options you may wish to modify before running the query:

Extended Throughput Data

If this option is enabled then the Interface Service Check will also return unicast, multicast and broadcast performance data. This will be in the form of bits per second based on the interface speed.

SNMP Message Size

Some SNMP devices can return a significant amount of data, which fills the standard SNMP buffer size of around 500 octets. Many devices cannot cope with the maximum buffer size being set, so this option allows the size to be tailored to each device. The value is specified in Kio (kibioctets, that is, multiples of 1024 octets).

Use SNMP GetNext

Recent SNMP devices use SNMP GetBulk to obtain information, which older devices do not support. Enabling this option forces the use of the older GetNext operation.

Use SNMP ifName

Older SNMP devices often provide interface IDs only through ifDescr, which can be duplicated. More recent devices typically support ifName, which is selected by default. If you change this setting, the service check names will be different, and history and graphs will be lost. Additionally, if you have assigned any interfaces or service checks to a Hashtag, you should update the Individual Tagging and Interface Tagging fields in the Hashtag configuration.

Modify ifDescr Level

Some SNMP devices can have very long descriptions (ifDescr) for each interface on a device, mostly made up from common words. There is a limit in Opsview Cloud that this description shouldn’t exceed 52 characters otherwise monitoring the interface will not work as expected (a ‘duplicate interface’ error may be shown at the bottom of the screen). Setting this option can remove common words to reduce the length of each interface ifDescr and help to avoid duplicate interfaces.

The settings are as follows:

Setting	Words Removed
Off (default)	None
Level 1	‘Nortel Ethernet’, ‘Nortel’, ‘Routing’, ‘Module’
Level 2	Trailing spaces removed
Level 3	‘PCI Express’, ‘Quad Port’, ‘Gigabit’, ‘Server’
Level 4	‘Corrigent systems’, ‘, , '
Level 5	‘Ethernet’, ‘Frontpanel’, ‘RJ45’, ‘1000BASE-T’, ‘- no sfp inserted’
Level 6	‘Avaya’, ‘Virtual’, ‘Services’, ‘Platform’

Levels are cumulative. Further levels may be added in the future. The level should not be changed once monitoring is working to prevent loss of historical data.
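The cumulative behavior of the levels can be sketched in Python. This is only an illustration: the function `shorten_ifdescr` is hypothetical, plain substring removal is an assumption about how words are stripped, and Level 4 contains only the one entry that is legible in the table above.

```python
# Words removed at each level, taken from the table above.
# Level 2 removes trailing spaces rather than words.
LEVEL_WORDS = {
    1: ["Nortel Ethernet", "Nortel", "Routing", "Module"],
    3: ["PCI Express", "Quad Port", "Gigabit", "Server"],
    4: ["Corrigent systems"],  # remaining Level 4 entries are not legible in the table
    5: ["Ethernet", "Frontpanel", "RJ45", "1000BASE-T", "- no sfp inserted"],
    6: ["Avaya", "Virtual", "Services", "Platform"],
}
MAX_LEVEL = 6


def shorten_ifdescr(descr: str, level: int) -> str:
    """Apply every level up to and including `level` to an ifDescr string."""
    if not 0 <= level <= MAX_LEVEL:
        raise ValueError("level must be between 0 (Off) and 6")
    for current in range(1, level + 1):
        for phrase in LEVEL_WORDS.get(current, []):
            descr = descr.replace(phrase, "")
    if level >= 2:
        descr = descr.rstrip()  # Level 2: trailing spaces removed
    return descr


original = "Nortel Gigabit Ethernet Port 3"
for level in (0, 1, 5):
    print(level, repr(shorten_ifdescr(original, level)))
```

Because levels are cumulative, Level 5 in this sketch also applies the Level 1 and Level 3 removals.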

Interfaces to poll

The table section of the Interfaces tab has the following main columns:

Selection box: Check box. You can check this to monitor the interface. If you select an interface using the check box beside the name, Opsview Cloud will create a service for each interface after the Apply Changes is performed. This monitors throughput, errors, and discards. Use the checkbox in the column header to toggle all interface checkboxes.
Interfaces to poll: The description of the interface.
Alert Type: see Alert Type.
Throughput: see Discards, Errors and Throughput Thresholds.
Errors: see Discards, Errors and Throughput Thresholds.
Discards: see Discards, Errors and Throughput Thresholds.

The Filter feature allows you to filter the list of interfaces by the interface name, alert type, throughput, errors, or discards values. It also filters by the text within the parentheses, so admin:'up' shows all the interfaces whose administrative state is set to up.

Note
The text is not saved to the database, and it is only displayed after Query Host is clicked.

Discards, Errors and Throughput Thresholds

For the discards, errors and throughput fields a threshold can be set. For any selected interface, if the cell is empty, the threshold value will be taken from the default line. If a cell is set to - then no threshold will be set. This is equivalent to saying “I do not want to set a warning threshold”.

Throughput is monitored from the multiple service check called Interface. This calculates the rate of throughput between checks and returns the input and output information. If the rate is above the threshold value, then an alert will be raised at the appropriate level.

Performance data will be returned based on the input and output rate in octets per second. If the threshold is specified as a percentage value, the performance data returned will be a percentage value instead.

If a percentage threshold is specified and it is not possible to work out the interface speed (such as for VLANs), then the plugin will return a WARNING with the message:

INTERFACENAME throughput (in/out) X bps/Y bps but has an interface speed of 0, so cannot check a percentage threshold

You should set the threshold to be based on bits per second for this interface, rather than using a percentage threshold.

It is possible to use advanced syntax for more complicated threshold checking. For example:

IN 10:50% — alert if input throughput is below 10% or above 50%.
OUT 30000:50000 — alert if output throughput is below 30,000 bits/sec or above 50,000 bits/sec.
IN 10:50% and OUT 30:55% — alert if both input throughput is below 10% or above 50% and output throughput is below 30% or above 55%.
IN 10:50% or OUT 30:55% — alert if either input throughput is below 10% or above 50% or output throughput is below 30% or above 55%.
40:60% — this is the same as IN 40:60% or OUT 40:60%.
75% — this is the same as 0:75% which was the old behavior.

Most whitespace is ignored. You cannot mix percentage and bits per second values in the same threshold.
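The threshold syntax above can be illustrated with a small evaluator. This is a minimal sketch written only from the examples in this section, not the plugin's actual parser; the function names (parse_clause, clause_alerts, should_alert) are hypothetical, and the left-to-right handling of and/or is an assumption.

```python
import re

# One clause: optional IN/OUT, optional "low:", a high value, optional "%".
CLAUSE = re.compile(
    r"^(?:(IN|OUT)\s*)?(?:(\d+(?:\.\d+)?)\s*:\s*)?(\d+(?:\.\d+)?)\s*(%?)$",
    re.IGNORECASE,
)


def parse_clause(text):
    """Parse e.g. 'IN 10:50%' into (direction, low, high, is_percent)."""
    match = CLAUSE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised threshold clause: {text!r}")
    direction, low, high, percent = match.groups()
    return (
        direction.upper() if direction else None,
        float(low) if low else 0.0,  # '75%' is read as '0:75%'
        float(high),
        percent == "%",
    )


def clause_alerts(clause, in_value, out_value):
    """True if the clause's direction(s) fall outside the low:high range."""
    direction, low, high, _ = clause
    in_bad = in_value < low or in_value > high
    out_bad = out_value < low or out_value > high
    if direction == "IN":
        return in_bad
    if direction == "OUT":
        return out_bad
    return in_bad or out_bad  # no direction: same as IN ... or OUT ...


def should_alert(threshold, in_value, out_value):
    """Evaluate a threshold string; values must use the threshold's unit."""
    tokens = re.split(r"\s+(and|or)\s+", threshold.strip(), flags=re.IGNORECASE)
    clauses = [parse_clause(token) for token in tokens[0::2]]
    operators = [token.lower() for token in tokens[1::2]]
    if len({clause[3] for clause in clauses}) > 1:
        raise ValueError("cannot mix percentage and bits per second values")
    verdict = clause_alerts(clauses[0], in_value, out_value)
    for operator, clause in zip(operators, clauses[1:]):
        result = clause_alerts(clause, in_value, out_value)
        # Assumption: operators are applied left to right.
        verdict = (verdict and result) if operator == "and" else (verdict or result)
    return verdict
```

For example, should_alert("IN 10:50% and OUT 30:55%", 5, 40) is False because only the input side is out of range, while the same call with "or" in place of "and" is True.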

Errors are monitored from the multiple Service Check called Errors. This calculates the average number of errors per minute between checks, and returns the input and output error per minute information. If the rate is above the threshold, then an alert will be raised at the appropriate level. Performance data will be returned based on the input and output errors per minute.

Discards are monitored from the multiple Service Check called Discards. This calculates the average number of discards per minute between checks, and returns the input and output discards per minute information. If the rate is above the threshold, then an alert will be raised at the appropriate level. Performance data will be returned based on the input and output discards per minute.

Note
If you do not want to monitor throughput, errors, or discards for a particular host, you can remove the service check Interface, Errors or Discards from the host.

Alert Type

The Alert Type drop-down menu offers three options:

Normal
Dormant
Ignore

By default, this field is set to Normal, which indicates that the admin status and link status of the interface are expected to be up. Conversely, if the admin status and link status are expected to be down, the field should be set to Dormant. However, if the status of the interface is not relevant, the field can be set to Ignore.

When the Interface Service Check runs for an interface with the Normal Alert Type set, it will report:

CRITICAL status if admin status is up but link status is down.
WARNING status if admin status is down but link status is up.
OK status if admin status and link status are both down.
A status according to the configured thresholds if admin status and link status are both up.

When the Interface Service Check runs for an interface with the Dormant Alert Type set, it will report:

OK status if admin status is up but link status is down.
OK status if admin status is down but link status is up.
OK status if admin status and link status are both down.
CRITICAL status if admin status and link status are both up.

When the Interface Service Check runs for an interface with the Ignore Alert Type set, it will report:

OK status if admin status is up but link status is down.
OK status if admin status is down but link status is up.
OK status if admin status and link status are both down.
A status according to the configured thresholds if admin status and link status are both up.

For any Alert Type, the Errors and Discards Service Checks will report OK if an interface is down.
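The three status lists above can be condensed into a single lookup. The following is a minimal sketch that simply encodes the rules as documented in this section; the function name interface_status and the threshold_status parameter are hypothetical, and this is not the plugin's own code.

```python
ALERT_TYPES = ("Normal", "Dormant", "Ignore")


def interface_status(alert_type, admin_up, link_up, threshold_status="OK"):
    """Return the status the Interface Service Check is documented to report.

    threshold_status is the result of the configured threshold checks and
    is only used when both admin status and link status are up.
    """
    if alert_type not in ALERT_TYPES:
        raise ValueError(f"unknown Alert Type: {alert_type!r}")
    if admin_up and link_up:
        # Dormant interfaces are expected to be down, so both up is CRITICAL.
        return "CRITICAL" if alert_type == "Dormant" else threshold_status
    if alert_type == "Normal":
        if admin_up and not link_up:
            return "CRITICAL"
        if link_up and not admin_up:
            return "WARNING"
    # Every remaining combination is reported as OK.
    return "OK"
```

For example, interface_status("Normal", admin_up=True, link_up=False) gives CRITICAL, whereas the same states with the Ignore Alert Type give OK.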

SNMP limitations

You need SNMPv2c if you are monitoring an interface of 100 Mbps or over. This is because SNMPv2c supports 64-bit counters, but SNMPv1 doesn’t. If you use SNMPv1, your graphs are likely to have gaps in them.
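To see why 32-bit counters become a problem at these speeds, it helps to work out how quickly an octet counter can wrap. The sketch below is illustrative arithmetic only; the function name seconds_to_wrap is hypothetical, and it assumes the interface runs at full line rate in one direction.

```python
def seconds_to_wrap(link_bits_per_second, counter_bits=32):
    """Time for an octet counter to wrap on a fully loaded link."""
    octets_per_second = link_bits_per_second / 8
    return 2 ** counter_bits / octets_per_second


# A saturated 100 Mbps link wraps a 32-bit octet counter in under six minutes.
print(round(seconds_to_wrap(100_000_000)))    # about 344 seconds
# At 1 Gbps the same counter wraps in roughly half a minute.
print(round(seconds_to_wrap(1_000_000_000)))  # about 34 seconds
```

If the counter can wrap more than once between two checks, the rate between those checks cannot be worked out reliably, which is one way gaps can appear. A 64-bit counter takes far longer to wrap at the same speeds.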

Interfaces are monitored by name, so if the SNMP index position changes (which could happen on a router reboot), a rescan of the device will occur to check. Opsview Cloud treats the SNMP index as an internal number that you do not need to know about. By working with names only, Opsview Cloud can automatically follow any changes to the SNMP index position without human intervention.

If there are multiple interfaces with the same name, the ifIndex will also be passed to the plugin to check. If the ifName does not match the expected interface name for this ifIndex, an alert will be raised which says:

WARNING - Interface name $user_specified_ifname expected at index $user_specified_index, but got $name!

You will need to run Query Host to list the interfaces to check again.

Note
If the index moves to a position with the same interface name, then Opsview Cloud will not see a change and will continue monitoring this interface as usual even though it could be a different interface. If you have a Cisco router, please check this Cisco support article regarding ifIndex persistence.

Getting a ‘Cannot connect with’ error when running ‘Test SNMP Connection’

Aside from invalid credentials, Test SNMP Connection may fail when the host’s IPv6 address takes precedence over its IPv4 address in /etc/hosts. The connection is then attempted over IPv6 but fails, because the SNMP daemon listens only on IPv4 by default.

To fix this, you can configure the SNMP daemon to listen on IPv6. You can do this by specifying the agentAddress directive in /etc/snmp/snmpd.conf as:

agentAddress udp:<host_primary_ipv4_address>:161,udp6:[<host_primary_ipv6_address>]:161

In RPM-based Linux distributions, the SNMP daemon additionally requires an IPv6 mapping of SNMP community strings to security names. In such a system, the com2sec6 directive should be specified in /etc/snmp/snmpd.conf.

com2sec6 <security_name> <source> <community>

Getting a ‘Cannot query host’ error when running ‘Query Host’

The Query Host command should run properly if the appropriate Object Identifiers (OIDs) are indicated in the view directive of /etc/snmp/snmpd.conf. Ensure that you have the correct OIDs corresponding to the MIB subtrees. For more information about MIBs, see MIBs for SNMP Traps and Gets.

view <view_name> <type> <OID>

Note

Restart snmpd.

After making any changes to your SNMP configuration or after adding new MIBs, you need to restart the snmpd service for the changes to take effect. Use the following command: systemctl restart snmpd. For more information about SNMP configuration, see Manpage of SNMPD.conf.

Why aren’t my interfaces being monitored?

The services are only created if the Host has the ‘SNMP - MIB-II’ Host template applied, or has the Interface Poller, Interface, Discards and Errors Service Checks associated with the Host directly via the Service Checks tab.

I’m getting thresholds that are over 100%

For each interface, Opsview Cloud works out the utilization based on the number of bytes transferred as reported by SNMP, divided by the time difference between the two readings, as a percentage of the interface speed as reported by the SNMP ifSpeed object.
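The calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the plugin's actual code: the function name utilization_percent and its parameters are hypothetical, counter wrap is ignored, and ifSpeed is taken to be in bits per second.

```python
def utilization_percent(octets_then, octets_now, seconds_elapsed, if_speed_bps):
    """Utilization between two counter readings, as a percentage of ifSpeed."""
    if seconds_elapsed <= 0:
        raise ValueError("the two readings must be taken at different times")
    if if_speed_bps <= 0:
        raise ValueError("interface speed of 0, cannot work out a percentage")
    # Counters are in bytes (octets); ifSpeed is in bits per second.
    bits_per_second = (octets_now - octets_then) * 8 / seconds_elapsed
    return bits_per_second / if_speed_bps * 100


# 1.5 GB transferred in 300 seconds on a 100 Mbps interface.
print(utilization_percent(0, 1_500_000_000, 300, 100_000_000))
# The same traffic with ifSpeed wrongly reported as 10 Mbps exceeds 100%.
print(utilization_percent(0, 1_500_000_000, 300, 10_000_000))
```

The second call shows how an incorrectly reported ifSpeed alone is enough to push the result well above 100%.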

There are several reasons why you can get over 100% utilization:

The wrong ifSpeed is reported by the device. This can sometimes occur with Net-SNMP, but it is possible to set the speed correctly in the configuration file.
Some speeds are not the maximum possible throughput. ifSpeed is defined as ‘An estimate of the interface’s current bandwidth in bits per second’.
Full duplex may skew the results as you may be able to get more transfer in one direction than in another.
Some devices only update the SNMP counters at certain intervals. This means you could see sudden spikes in utilization if Opsview gathers data at different intervals.

If you have interfaces that are consistently reporting more than 100% utilization, please contact Opsview Cloud Customer Success who can assist.

Plugin raises a warning about an interface with 0 speed

You may see an error like: INTERFACENAME throughput (in/out) 0 bps/0 bps but has an interface speed of 0, so cannot check a percentage threshold

When a threshold is specified as a percentage value, Opsview Cloud works out the percent utilization based on the speed. However, if the speed is zero, this is not possible.

Possible resolutions:

The device is reporting the incorrect speed - contact the device manufacturer. If the device is a Unix server running Net-SNMP, you can force Net-SNMP to set a specific speed per interface.
The interface is not valid for monitoring - uncheck the interface from being monitored.
You still want to monitor the interface status - set the threshold to a dash (which means that no threshold check will be required) or set an absolute threshold rather than a percentage, so the speed check is ignored.

There are duplicate names in the interface SNMP table which has some limitations

Interfaces are tracked by their name rather than their ID as provided by the device being monitored. This is because some devices reallocate IDs on a reboot.

Opsview Cloud tracks these interfaces by fetching each interface’s ifDescr, shortening it to 52 characters, and storing it as the ‘short interface name’. This limit is the standard length of interface description supported by the majority of devices. However, this can appear to cause duplicate interface names if the ifDescr contains unnecessary duplicate text, such as the following:

Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 1
Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 2
Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 3

These would all be shortened to Nortel Ethernet Routing Switch 5510-48T Module Un

You can either reconfigure the interface ifDescr values on the device to contain only short unique names, such as 5510-48T Unit 1 Port 13, and then re-run Query Host on the Host configuration SNMP page, or set the Modify ifDescr Level option, which attempts to remove certain common words. For more information, see Interfaces.
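The effect of the 52-character limit, and of removing common words, can be sketched like this. It is a simplified illustration, not Opsview Cloud's own shortening logic: the shorten function is hypothetical, the whitespace clean-up is an assumption, and the exact shortened string Opsview Cloud stores may differ from what plain truncation produces. The word list is the Level 1 list from the Modify ifDescr Level table.

```python
MAX_NAME_LENGTH = 52
LEVEL_1_WORDS = ["Nortel Ethernet", "Nortel", "Routing", "Module"]


def shorten(if_descr, words_to_remove=()):
    """Remove the given words, tidy whitespace, then cut to 52 characters."""
    for word in words_to_remove:
        if_descr = if_descr.replace(word, "")
    if_descr = " ".join(if_descr.split())  # assumption: collapse extra spaces
    return if_descr[:MAX_NAME_LENGTH]


descriptions = [
    f"Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port {port}"
    for port in (1, 2, 3)
]

plain = [shorten(d) for d in descriptions]
level_1 = [shorten(d, LEVEL_1_WORDS) for d in descriptions]

print(len(set(plain)))    # 1: all three names collide after truncation
print(len(set(level_1)))  # 3: the names stay unique once common words go
print(level_1[0])         # Switch 5510-48T - Unit 1 Port 1
```

Because the distinguishing port number sits at the end of the description, truncation discards it; removing the common words first keeps it within the limit.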

Variables tab

As mentioned in the Service Checks section, Variables are covered in detail within their own User Guide section. In essence, they act like standard programming variables: you can configure -p %PORT% instead of -p 9200 in an Elasticsearch Service Check’s arguments. The benefit is that, by using a Variable instead of hard-coding the port, you can apply the Service Check to hundreds of Hosts and simply add the %PORT% variable to the Hosts that don’t have Elasticsearch on port 9200.

In our example above, we have added the Database MySQL Host template which requires the %MYSQLCREDENTIALS% variable to be populated with relevant username/password data.

This can be configured at a global level via Configuration > Variables > %MYSQLCREDENTIALS%, which means any Host that has Service Checks/Host templates using the %MYSQLCREDENTIALS% variable will use the values set there (the global defaults). However, if a Host has a different set of credentials, you can add %MYSQLCREDENTIALS% locally via the Variables tab. If the Variable is added to the Host locally, the values set on the Host take precedence over the global defaults.

For the Host in the screen above, you can override the username/password with custom, Host-specific ones by checking the Override username and Override password fields, respectively. The Password field has been set to an encrypted one at the Variable level, which means that once the value is overridden and the Submit changes button has been pressed, the value entered cannot be retrieved, only overwritten.
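The precedence between Host-level and global Variables can be pictured with a short sketch. This is a conceptual model only, not how Opsview Cloud stores or expands Variables; the function names and the plain dictionaries used for Host and global values are hypothetical.

```python
import re


def resolve_variable(name, host_variables, global_variables):
    """A value set on the Host wins; otherwise use the global default."""
    if name in host_variables:
        return host_variables[name]
    return global_variables[name]


def expand_arguments(arguments, host_variables, global_variables):
    """Replace each %NAME% placeholder in a Service Check argument string."""
    def substitute(match):
        return str(resolve_variable(match.group(1), host_variables, global_variables))
    return re.sub(r"%([A-Z0-9_]+)%", substitute, arguments)


global_defaults = {"PORT": 9200}

# A Host with no local Variable uses the global default.
print(expand_arguments("-p %PORT%", {}, global_defaults))              # -p 9200
# A Host running Elasticsearch on another port overrides it locally.
print(expand_arguments("-p %PORT%", {"PORT": 9201}, global_defaults))  # -p 9201
```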

NetAudit tab

NetAudit tab

The NetAudit tab is an optional tab present only for Users who have access to the Network Analyzer feature.

In this tab, Users can configure the settings needed in order to allow Opsview Cloud to log in to the Host and back up the network device’s configuration.

For more information, see NetAudit.

Amending Multiple Hosts (Bulk Edit)

You can open the bulk edit window by pressing the Edit button when more than one Host has been selected via the checkboxes:

Amend multiple hosts

A subset of the host fields will be available to be changed. Changes will be applied to all the hosts that were selected in the grid.

Bulk edit host

The fields will be pre-populated with values only if they are exactly the same for all the selected hosts.

When you submit, hosts will be updated in batches of 50 at a time - a progress bar will appear as changes are made.
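As an illustration of what batching means here, the sketch below splits a selection into groups of 50. It is a generic example, not Opsview Cloud's implementation; the batches function and the host names are hypothetical.

```python
def batches(items, size=50):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


selected_hosts = [f"host-{n}" for n in range(120)]
for number, batch in enumerate(batches(selected_hosts), start=1):
    # Each batch would be submitted in turn, advancing the progress bar.
    print(f"batch {number}: {len(batch)} hosts")
```

With 120 Hosts selected, this gives two batches of 50 followed by one batch of 20.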

The following fields can be modified:

Host tab

Host Group

Changes the Host Group of the selected hosts.

Monitored By

Changes the monitoring cluster these hosts will be monitored by.

Note

This field will be hidden if this is a single server instance.

This field will be disabled if any of the selected hosts are of the following types:

The main orchestrator host.

A host used as a flow source.

Host Templates

Changes the Host Templates for the selected hosts. See Multi Options.

Hashtags

Changes the Hashtags for the selected hosts. See Multi Options.

Host Icon

Changes the Host Icon for the selected hosts.

Host Description

Changes the Host Description for the selected hosts.

Parents

Changes the Parents for the selected hosts. See Multi Options.

Note
To avoid circular parent host definitions, only hosts that are not in the currently selected list of hosts can be chosen.

Host Check Command

Changes the Host Check Command for the selected hosts.

Check Period

Changes the Check Period for the selected hosts.

Check Interval

Changes the Check Interval for the selected hosts.

Retry Interval

Changes the Host check Retry Interval for the selected hosts.

Note
This value must be lower than the Check Interval for each host. If you set this too high, you may get a “Rollback: Error trying to synchronise object X” message.

Max Check Attempts

Changes the Host check Max Check Attempts for the selected hosts.

NetAudit tab

NetAudit Password

Changes the NetAudit password for the selected hosts.

Deleting a Host

You can delete a host you no longer wish to monitor by clicking the host contextual menu and selecting the Delete option.

Delete a host

You will then be required to confirm the host deletion.

Deleting Multiple Hosts (Bulk Delete)

The Delete button is enabled when hosts that can be deleted have been selected using the checkboxes:

Delete multiple hosts

Some special hosts, such as the Orchestrator, Collectors or Flow Sources, cannot be deleted. If any of these are selected, they will be ignored.

After confirmation, all selected hosts (minus any special hosts) will be deleted.

Multi Options

All fields, bar Host Templates and Hashtags, are simply replacement actions, meaning you can choose to enter a new Host Check Command for 10 Hosts, and on submit the Host Check Command for each of those Hosts is changed to the one specified.

For Host templates and Hashtags, the actions are a little more powerful as a Host can have no Host Templates/Hashtags, or alternatively multiple Host templates/Hashtags. Therefore, in the Bulk Edit mechanism for these two fields there are the following options:

Add to existing: Choose this option to append a new Hashtag/Host Template to the Hosts.
Clear field: Choose this option to remove all Hashtags/Host Templates from the Hosts.
Replace with: Choose this option to remove all Hashtags/Host Templates from the Hosts and add the selected Hashtag/Host templates instead. Essentially, this action clears the fields and then adds the selected items.
Find and remove: Choose this option to remove the selected Hashtags/Host templates from the selected Hosts. This option is, in essence, a selective delete.
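The four options above behave like simple list operations. The following is a minimal sketch of their documented effect on one Host's Hashtags or Host Templates; the function apply_multi_option and the action names are hypothetical, and skipping duplicates on Add to existing is an assumption.

```python
def apply_multi_option(current, action, selected=()):
    """Return a Host's new Hashtags/Host Templates after a bulk edit action."""
    current = list(current)
    selected = list(selected)
    if action == "add_to_existing":
        # Assumption: items the Host already has are not added twice.
        return current + [item for item in selected if item not in current]
    if action == "clear_field":
        return []
    if action == "replace_with":
        return selected
    if action == "find_and_remove":
        return [item for item in current if item not in selected]
    raise ValueError(f"unknown action: {action!r}")


hashtags = ["linux", "production"]
print(apply_multi_option(hashtags, "add_to_existing", ["database"]))
print(apply_multi_option(hashtags, "replace_with", ["staging"]))
print(apply_multi_option(hashtags, "find_and_remove", ["production"]))
print(apply_multi_option(hashtags, "clear_field"))
```

Replace with gives the same result as Clear field followed by Add to existing, which matches the description of that option.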
