Mon command reference


You use the mon command mainly to stop and start the monitor system processes and to set up distributed or load-balanced environments.

Run the mon tool from the command line over an SSH session.

Caution: This command can be very destructive if not used correctly. Do not use it unless specifically instructed to do so by ITRS or the OP5 Monitor documentation.

Get help

To get a list of available sub-commands, run the command without any arguments, as follows:

# mon

You can also run a sub-command on its own to list its sub-commands, for example:

# mon query

You can get more detailed information about the options of a specific sub-command by using the --help option:

# mon query ls --help
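
The help levels above can be wrapped in a small shell function so that scripts fail clearly when mon is not installed. This is only a minimal sketch: mon_help is a hypothetical helper name, not part of OP5 Monitor, and it relies solely on the --help option described above.

```shell
# mon_help: hypothetical helper (not part of OP5 Monitor).
# With no arguments it runs "mon" alone, which lists the sub-commands.
# With arguments it appends --help, as in "mon query ls --help".
mon_help() {
    if ! command -v mon >/dev/null 2>&1; then
        echo "mon: command not found in PATH" >&2
        return 127
    fi
    if [ "$#" -eq 0 ]; then
        mon
    else
        mon "$@" --help
    fi
}
```

For example, mon_help query ls runs the same command as the last example above.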

Command reference

The entries below list some of the most useful mon commands, giving the syntax, a description, and any additional information for each. Most commands take two arguments (for example, mon node list), with a few exceptions (for example, mon start).

Command: start
Syntax: mon start
Description: Starts the monitor and merlind system processes.

Command: stop
Syntax: mon stop
Description: Stops the monitor and merlind system processes.

Command: restart
Syntax: mon restart
Description: Restarts the monitor and merlind system processes.

Command: ecmd search
Syntax: mon ecmd search <regex>
Description: Prints templates for all available commands matching the regular expression <regex>. The search is case insensitive.

Command: ecmd submit
Syntax: mon ecmd submit [options] <command_parameters>
Description: Submits a command to the monitoring engine using the supplied values.
Additional information: Available [options]:

Command: log show
Syntax: mon log show
Description: Runs the showlog helper program. Arguments passed to this command are sent to the showlog helper.

Command: id generate
Syntax: mon id generate
Description: Prints a UUID4 to be used when Merlin is in UUID identification mode.
Additional information: For more information on using this command, see UUID identification in Scale up your monitoring environment.

Command: node add
Syntax: mon node add <name> --type=[peer|poller|master] [var1=value] [varN=value]
Description: Adds a node with the designated type and variables.

Command: node ctrl
Syntax: mon node ctrl <name1> <name2> [--self] [--all|--type=<peer|poller|master>] -- <command>
Description: Executes <command> on the remote nodes specified.
Additional information:
  --self: run the command on the local system as well.
  --all: run the command on all configured nodes.
  --type: run the command on configured nodes of the given types.
  -- <command>: stop argument scanning. Everything after the double dashes is treated as the command to run. The first unrecognised argument also marks the start of the command to be executed, but using double dashes is recommended.

Command: node list
Syntax: mon node list [--type=poller,peer,master]
Description: Lists all nodes, or only nodes of the given types if --type is specified.

Command: node remove
Syntax: mon node remove <name1> [name2] [nameN]
Description: Removes one or more nodes from the Merlin configuration.

Command: node show
Syntax: mon node show [--type=poller,peer,master]
Description: Displays all variables for all nodes, or for one node in a way suitable for use as eval $(mon node show nodename) from shell scripts and scriptlets.

Command: node status
Syntax: mon node status
Description: Shows the status of all nodes configured in the running Merlin daemon. Red text indicates problem areas, such as high latency, or a node that is inactive, is not handling any checks, or is not sending program_status updates regularly enough.

Command: oconf changed
Syntax: mon oconf changed
Description: Prints the last modification time among all object configuration files.

Command: oconf files
Syntax: mon oconf files
Description: Prints a list of Naemon object configuration files in alphabetical order.

Command: oconf hash
Syntax: mon oconf hash
Description: Prints an SHA-1 hash of the running configuration.

Command: oconf push
Syntax: mon oconf push
Description: Splits the configuration based on Merlin's peer and poller configuration, and sends object configuration to all peers and pollers, restarting those that receive a configuration update. SSH keys need to be set up for this to be usable without administrator supervision.

Command: oconf fetch
Syntax: mon oconf fetch --sync master
Description: Used in poller configuration. Triggers the poller to fetch files from a configured master.
Additional information: For more information on using this command, see mon oconf fetch in Scale up your monitoring environment.

Command: oconf remote-fetch
Syntax: mon oconf remote-fetch [<node>|type=<peer|poller>]
Description: Tells a specific node to fetch split configuration from the current node.
Additional information: For more information on using this command, see mon oconf remote-fetch in Scale up your monitoring environment.

Command: query ls
Syntax: mon query ls
Description: Lists monitored objects. You can specify filters to show only the objects and statuses you are interested in.

Command: sshkey fetch
Syntax: mon sshkey fetch
Description: Fetches all the SSH keys from peers and pollers. This command is not recommended; run the push command instead.

Command: sshkey push
Syntax: mon sshkey push
Description: Pushes the local SSH keys to all peers and pollers.

Command: check spool
Syntax: mon check spool [--maxage=<seconds>] [--warning=X] [--critical=X] <path> [--delete]
Description: Checks a specific spool directory for files that are older than maxage. Note that it only applies to files. It is intended to prevent a build-up of check result files and unprocessed performance data files in the various spool directories used by OP5 Monitor. It can only check one directory at a time.
Additional information: --warning and --critical have no effect if --delete is specified; otherwise, specify threshold values.

Command: check cores
Syntax: mon check cores --warning=X --critical=X [--dir=]
Description: Checks for memory dumps resulting from segmentation violations in core parts of OP5 Monitor. Detected core files are moved to /tmp/mon-cores to keep working directories clean.
Additional information:
  --warning: default 0.
  --critical: default 1, meaning any core file results in a critical alert.
  --dir: specifies another path to search for core files. This option can be used multiple times.
  --delete: deletes core files not created by merlind or monitor.

Command: check distribution
Syntax: mon check distribution [--no-perfdata]
Description: Checks that distribution is working properly in a distributed environment. Note that it takes a few minutes to work properly after a new machine has been brought online or taken offline.

Command: check exectime
Syntax: mon check exectime [host_service] --warning=<min,max,avg> --critical=<min,max,avg>
Description: Checks the execution time of active checks.
Additional information:
  [host_service]: host or service.
  --warning: the warning threshold for min, max and average execution time, in seconds.
  --critical: the critical threshold for min, max and average execution time, in seconds.

Command: check latency
Syntax: mon check latency [host_service] --warning=<min,max,avg> --critical=<min,max,avg>
Description: Checks the latency of active checks.
Additional information:
  [host_service]: host or service.
  --warning: the warning threshold for min, max and average latency, in seconds.
  --critical: the critical threshold for min, max and average latency, in seconds.

Command: check orphans
Syntax: mon check orphans
Description: Looks for checks that have not been run in a long time.
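
As an illustration of how a read-only command from this reference might be used in a script, the following sketch detects a change in the running configuration by comparing successive outputs of mon oconf hash. The function name config_changed and the state file are hypothetical, and the sketch assumes only that mon oconf hash prints the same text whenever the running configuration is unchanged.

```shell
# config_changed: hypothetical helper (not part of OP5 Monitor).
# Compares the current output of "mon oconf hash" with the value stored in
# the file given as $1, then stores the current value for the next run.
# Returns 0 if the output differs from the stored value (or none is stored),
# 1 if it is unchanged, and 2 if "mon oconf hash" itself fails.
config_changed() {
    state_file=$1
    current=$(mon oconf hash) || return 2
    previous=
    if [ -f "$state_file" ]; then
        previous=$(cat "$state_file")
    fi
    printf '%s\n' "$current" > "$state_file"
    [ "$current" != "$previous" ]
}
```

A caller could then write, for example, if config_changed /var/tmp/oconf.hash; then echo "configuration changed"; fi, where the path is an arbitrary choice for this sketch.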


ecmd submit

The following example adds a new comment to the PING service on host foo:

mon ecmd submit add_svc_comment service='foo;PING' persistent=1 \
    author='John Doe' comment='the comment'

You can also use positional arguments, provided you respect the order of command arguments, as in the following example:

mon ecmd submit add_svc_comment 'foo;PING' 1 'John Doe' 'the comment'
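
When the values come from shell variables rather than literals, the quoting matters: the semicolon in host;service and any spaces must reach mon inside a single argument. The following is a minimal sketch of a wrapper around the named-argument form shown above. The function name add_service_comment is hypothetical, and persistent=1 is simply copied from the example.

```shell
# add_service_comment: hypothetical wrapper (not part of OP5 Monitor).
# Arguments: host, service description, author, comment text.
# Each key=value pair is double-quoted so that the semicolon in
# "host;service" and any spaces stay inside a single argument.
add_service_comment() {
    if [ "$#" -ne 4 ]; then
        echo "usage: add_service_comment <host> <service> <author> <comment>" >&2
        return 64
    fi
    mon ecmd submit add_svc_comment \
        service="$1;$2" \
        persistent=1 \
        author="$3" \
        comment="$4"
}
```

Calling add_service_comment foo PING 'John Doe' 'the comment' runs the same command as the first example in this section.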

node ctrl

Use single quotes to execute commands with shell variables, output redirection or scriptlets, as follows:

mon node ctrl -- '(for x in 1 2 3; do echo $x; done) > /tmp/foo'
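
The reason for the single quotes can be seen by comparing the two quoting styles in the local shell. The sketch below is illustrative only: run_on_nodes is a hypothetical wrapper that mirrors the example above, and the variable names are assumptions made for this sketch.

```shell
# run_on_nodes: hypothetical wrapper (not part of OP5 Monitor).
# Passes the whole remote command as one argument after "--", so that
# variables and redirections are left for the remote side to interpret.
run_on_nodes() {
    mon node ctrl -- "$1"
}

# Single quotes keep $x and > literal in the local shell.
remote_cmd='(for x in 1 2 3; do echo $x; done) > /tmp/foo'

# Double quotes let the local shell expand $x before mon sees the command,
# so the remote loop would echo the local value instead of 1, 2 and 3.
x=local
expanded_cmd="(for x in 1 2 3; do echo $x; done) > /tmp/foo"
```

Passing remote_cmd to run_on_nodes is equivalent to the example above, whereas expanded_cmd already contains the text echo local and would not behave as intended.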