Configuration

Required configuration

Set up a new user-provided certificate

When setting up a new user-provided certificate, verify that the infra-agent user has the required permissions to access the certificate file. This ensures the correct operation of the infrastructure-agent service.

Example:

chown infra-agent:infra-agent /path/to/server_key /path/to/ca_cert
chmod 440 /path/to/server_key /path/to/ca_cert

For more information about setting up a certificate, see Transport Layer Security.

Configure custom plugins

Access for the infra-agent user is also required when adding a custom plugin. Ensure that the newly added custom plugin has the correct permissions.

Example:

chown :infra-agent check_something
chmod g+rx check_something

Most configuration is provided by default in the agent.default.yml file; however, the following required configuration options must be set in a custom configuration file prior to running the agent.

Value Type Default Description Example configuration
commands:
  command:
    path:
string Default plugins Location of the plugin or executable to be run when command_name is requested via check_nrpe. If the plugin is to be executed with arguments, then one or more $ARGx$ argument substitution strings can be specified, where $ARG1$ is replaced at execution time with the first argument passed to check_nrpe, $ARG2$ with the second argument, and so on. Note that just $ARG1$ can be used if you wish to pass _all_ arguments via check_nrpe.
commands:
  checkcpu:
    path: /path/to/checkcpu -w $ARG1$ -c $ARG2$
server:
  allowed_hosts:
list of strings N/A List of addresses of the clients that are allowed to use the Agent. An empty list will allow any client to connect. If check_client_cert is enabled, then only hostnames will be validated (not IP addresses).
server:
  allowed_hosts: ['1.2.3.4', '5.6.7.8']
server:
  tls:
    ca_cert:
string N/A Path to the CA certificate.
server:
  tls:
    ca_cert: /path/to/ca_cert
    ca_path: /path/to/ca_path
    cert_file: /path/to/server_cert
    check_client_cert: true
    key_file: /path/to/server_key
  tls_enabled: true
server:
  tls:
    ca_path:
string N/A Path to the CA directory. See above.
server:
  tls:
    cert_file:
string N/A Path to the server certificate. See above.
server:
  tls:
    check_client_cert:
boolean false Whether a client certificate is required. If it is, the client must also be listed in allowed_hosts. See above.
server:
  tls:
    key_file:
string N/A Path to the server key. See above.
server:
  tls_enabled:
boolean true Whether the Agent uses TLS for communications. If tls_enabled is set to false, the above TLS configurations are not required. Setting tls_enabled to false is not recommended as communications will not be secure. See above.
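
As a rough sketch, the required options above could be combined into a single custom configuration file like the one below. The command name checkcpu, the hostname myallowedhost.com, and every path are placeholders borrowed from the examples on this page, not values the agent ships with.

```yaml
# Minimal custom configuration sketch; all names and paths are placeholders.
commands:
  checkcpu:
    # $ARG1$ and $ARG2$ are replaced with the first and second check_nrpe arguments.
    path: /path/to/checkcpu -w $ARG1$ -c $ARG2$
server:
  # A hostname is used here because only hostnames are validated
  # when check_client_cert is enabled. An empty list allows any client.
  allowed_hosts: ['myallowedhost.com']
  tls:
    ca_cert: /path/to/ca_cert
    ca_path: /path/to/ca_path
    cert_file: /path/to/server_cert
    check_client_cert: true
    key_file: /path/to/server_key
  tls_enabled: true
```

Under these assumptions, a request that passes 80 and 90 as the two check_nrpe arguments for checkcpu would be expected to run /path/to/checkcpu -w 80 -c 90 on the agent host.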

Basic configuration

The following basic configuration options can be overridden in a custom config file.

Value Type Default Description Example configuration
commands:
  command:
    cache_manager:
boolean false Whether or not command uses the Cache Manager to save temporary state information.
commands:
  checkcpu:
    cache_manager: true
    long_running_key: $PATH$
    path: /path/to/plugin
commands:
  command:
    long_running_key:
$PATH$, $NAME$, custom-key $PATH$

To reduce resource spikes caused by plugin startup, the Agent can keep the specified plugin running as a long-running process and communicate with it via STDIN and STDOUT. The key identifies an instance of a long-running process.

Values:

$PATH$ is the shortcut using the path as the key (minus any arguments), $NAME$ is the shortcut using the command name as the key, and custom-key is the process key used directly.

commands:
  checkcpu:
    cache_manager: true
    long_running_key: $PATH$
    path: /path/to/plugin
execution:
  execution_timeout:
integer 60 Maximum time, in seconds, that a service check is allowed to run. If this time is reached, then the service check is terminated and an error response is sent to the client.
execution:
  execution_timeout: 60
logging:
  handlers:
    syslog:
      facility:
string local6 Syslog storage location for accepting log messages. This applies to the Linux Agent only.
logging:
  handlers:
    file:
      filename: /path/to/agent.log
  loggers:
    agent:
      level: INFO
    cache:
      level: INFO
    main:
      level: INFO
    nrpe:
      level: INFO
The DEBUG logging level produces verbose logs.
logging:
  handlers:
    file:
      filename:
string N/A Path to logging file. See above.
logging:
  loggers:
    agent:
      level:
ERROR, WARNING, INFO, DEBUG INFO Logging level for the main Agent process. See above.
logging:
  loggers:
    cache:
      level:
ERROR, WARNING, INFO, DEBUG INFO Logging level for cache manager. See above.
logging:
  loggers:
    main:
      level:
ERROR, WARNING, INFO, DEBUG INFO Logging level for the Agent launcher. See above.
logging:
  loggers:
    nrpe:
      level:
ERROR, WARNING, INFO, DEBUG INFO Logging level for the nrpe server. See above.
server:
  bind_address:
string 0.0.0.0 Address on which the NRPE server listens.
server:
  bind_address: 0.0.0.0
  port: 5666
server:
  port:
integer 5666 Number of the port on which the agent listens for NRPE requests. See above.
windows_runtimes:
  runtime: /path/to/runtime
string N/A Path to the plugin runtime.
windows_runtimes:
  python: C:/Path/To/Python.exe
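
The three forms of long_running_key described in the table above might look like this side by side. This is only an illustrative sketch: the command names checkmemory and checkdisk, the key shared-plugin, and the plugin paths and subcommands are hypothetical.

```yaml
commands:
  checkcpu:
    # $PATH$: the process is keyed by the plugin path, minus any arguments.
    long_running_key: $PATH$
    path: /path/to/plugin check_cpu $ARG1$
  checkmemory:
    # $NAME$: the process is keyed by the command name (here, checkmemory).
    long_running_key: $NAME$
    path: /path/to/plugin check_memory $ARG1$
  checkdisk:
    # Any other value is used directly as the process key.
    long_running_key: shared-plugin
    path: /path/to/other_plugin $ARG1$
```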

Windows and Linux plugins

To learn what a plugin does and which parameters it accepts, pass -h or help to the specific check. This prints a description of the check and its parameters.

check_nrpe -H '<host-IP>' -c <checkname> -a 'help'

or:

check_nrpe -H '<host-IP>' -c <checkname> -a '-h'

Advanced configuration

The following configuration options are for advanced users and can be overridden in a custom config file.

Value Type Default Description Example configuration
cachemanager:
  host:
string 127.0.0.1 IP that the Cache Manager is listening on, which defaults to localhost.
cachemanager:
  host: 127.0.0.1
  housekeeping_interval: 60
  max_cache_size: 1GB
  max_item_size: 0
  port: 8184
  timestamp_error_margin: 30
cachemanager:
  housekeeping_interval:
integer 60 How often, in seconds, to purge expired cache manager items. See above.
cachemanager:
  max_cache_size:
string 1GB Total maximum size of the cache manager cache, for example 500KB, 1GB, and 2MB. See above.
cachemanager:
  max_item_size:
integer 0 Largest size for any single item in the cache. 0 means there is no limit (subject to max_cache_size). See above.
cachemanager:
  port:
integer 8184 Port number the cache manager listens on. See above.
cachemanager:
  timestamp_error_margin:
integer N/A Timestamps are required to prevent replay attacks. It is recommended that you allow for some delay in the system; timestamp_error_margin is one way to account for that. N/A
poller_schedule:
  command name:
integer N/A List of commands that use the poller and the associated sampling interval in seconds. Use the plugin name to look up configured plugins to see if there are any default args that need applying.
poller_schedule:
  checkcpu: 10
process_recycle_time:
integer 86400 Time after which the long-running processes are automatically recycled (restarted), so as to avoid potential memory leaks.
process_recycle_time: 86400
server:
  allow_multi_packet_response:
boolean true Whether the Agent should send multiple NRPEv2 packets back as a response when a check’s output exceeds 1023 characters in length. This method of sending responses only works when used with an Opsview modified check_nrpe client. If an unmodified NRPE client is in use, then this flag should be disabled.
server:
  allow_multi_packet_response: true
  housekeeping_interval: 300
  max_active_connections: 15
  max_queued_connections: 30
  max_request_time: 120
  receive_data_timeout: 5
  tls:
    ca_cert: null
    ca_path: null
    cert_file: null
    check_client_cert: true
    cipher_suite: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
    context_options:
    - NO_SSLv3
    - NO_TLSv1
    - NO_TLSv1_1
    key_file: null
  tls_enabled: true
  tls_handshake_timeout: 3
server:
  housekeeping_interval:
integer 300 Determines how often to flush the cache of looked-up addresses (non-TLS). See above.
server:
  max_queued_connections:
integer 30 Maximum number of connections that may be queued waiting to be accepted by the NRPE server.
server:
  max_active_connections:
integer 15 Maximum number of connections that may be handled concurrently by the NRPE server. This is effectively the number of commands that can be run in parallel. See above.
server:
  max_request_time:
integer 120 Maximum time the NRPE server waits to acquire the lock that allows it to process a request. Up to max_active_connections requests may hold the lock at any one time. If the server reaches max_request_time while waiting for the lock, then it should start terminating requests that are in progress, as they have either overrun or become stuck. However, this functionality has yet to be implemented. This should probably be set to something similar to execution.execution_timeout; there is no advantage in setting it much longer than that. See above.
server:
  receive_data_timeout:
integer 5 Maximum time in seconds that the NRPE server will wait for data to arrive after the connection has been established. This is needed to mitigate a class of DoS attacks, where the client establishes a TLS connection and keeps it open but sends no data. See above.
server:
  tls_handshake_timeout:
integer 3 TLS handshake timeout in seconds. See above.
server:
  tls:
    cipher_suite:
string Cipher used to secure a network connection. The default value is ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS. See above.
server:
  tls:
    context_options:
list of strings NO_SSLv3, NO_TLSv1, NO_TLSv1_1 Advanced TLS contextual options. See above.
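
For instance, an agent that is queried by an unmodified NRPE client might override just the relevant advanced options in its custom configuration file. This is a sketch based on the descriptions in the table above; the max_request_time value is an illustrative assumption, not a recommended setting.

```yaml
execution:
  execution_timeout: 60
server:
  # Disabled because the client is assumed not to be the Opsview modified check_nrpe.
  allow_multi_packet_response: false
  # Kept similar to execution_timeout, as the max_request_time description suggests.
  max_request_time: 60
```

Options that are not listed keep their default values, so only the settings that differ need to appear in the custom file.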

Example configurations

Linux

---
# Example Linux configuration file

cachemanager:
  host: 127.0.0.1
  housekeeping_interval: 60
  max_cache_size: 1GB
  max_item_size: 0
  port: 8184
  timestamp_error_margin: 30
commands:
  check_cpu_stats:
    cache_manager: false
    path: /opt/itrs/infrastructure-agent/plugins/check_cpu_stats $ARG1$
  check_memory:
    cache_manager: false
    path: /opt/itrs/infrastructure-agent/plugins/check_memory $ARG1$
execution:
  execution_timeout: 60
logging:
  handlers:
    syslog:
      facility: local6
  loggers:
    agent:
      level: INFO
    cache:
      level: INFO
    main:
      level: INFO
    nrpe:
      level: INFO
poller_schedule: {}
process_recycle_time: 86400
server:
  allowed_hosts: null
  allow_multi_packet_response: true
  bind_address: 0.0.0.0
  housekeeping_interval: 300
  max_active_connections: 15
  max_queued_connections: 30
  max_request_time: 120
  port: 5666
  receive_data_timeout: 5
  tls:
    ca_cert: /path/to/ca_cert
    ca_path: /path/to/ca_directory
    cert_file: /path/to/server_cert
    check_client_cert: true
    cipher_suite: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
    context_options:
    - NO_SSLv3
    - NO_TLSv1
    - NO_TLSv1_1
    key_file: /path/to/server_key
  tls_enabled: true
  tls_handshake_timeout: 3

Windows

---
# Example Windows configuration file

cachemanager:
  host: 127.0.0.1
  housekeeping_interval: 60
  max_cache_size: 1GB
  max_item_size: 0
  port: 8184
  timestamp_error_margin: 30
commands:
  checkcpu:
    long_running_key: $PATH$
    path: C:/Program\ Files/Infrastructure\ Agent/plugins/check_windows.exe check_cpu_load $ARG1$
  checkdrivesize:
    long_running_key: $PATH$
    path: C:/Program\ Files/Infrastructure\ Agent/plugins/check_windows.exe check_drivesize $ARG1$
execution:
  execution_timeout: 60
logging:
  handlers:
    file:
      filename: C:/Program\ Files/Infrastructure\ Agent/logs/agent.log
  loggers:
    agent:
      level: INFO
    cache:
      level: INFO
    main:
      level: INFO
    nrpe:
      level: INFO
poller_schedule:
  checkcpu: 10
process_recycle_time: 86400
server:
  allowed_hosts: 
    - myallowedhost.com
    - 10.1.2.3
  allow_multi_packet_response: true
  bind_address: 0.0.0.0
  housekeeping_interval: 300
  max_active_connections: 15
  max_queued_connections: 30
  max_request_time: 120
  port: 5666
  receive_data_timeout: 5
  tls:
    ca_cert: C:/path/to/ca_cert
    ca_path: C:/path/to/ca_directory
    cert_file: C:/path/to/server_cert
    check_client_cert: true
    cipher_suite: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
    context_options:
    - NO_SSLv3
    - NO_TLSv1
    - NO_TLSv1_1
    key_file: C:/path/to/server_key
  tls_enabled: true
  tls_handshake_timeout: 3
windows_runtimes: {}