Anomaly Detection

Overview

Anomaly Detection rules are calculated using historical data. This data is used to generate a data set against which a rule is run. From the data, you can determine what normal behaviour looks like, which allows you to create rules that raise alerts when an anomaly is detected.

When you create Anomaly Detection rules in Geneos, the rules use Data Sets (Time Series) in their calculations. Time series allow these rules to adapt based on prior knowledge of the daily or weekly variations in the data that the time series is applied to.

A time series models data that the Gateway reads from a user-defined database or from Gateway Hub. These models can then be used as part of the Anomaly Detection functionality.

In the Gateway, Anomaly Detection rules are evaluated against time series. The time series is loaded from either:

  • A user-defined database.
  • Gateway Hub.

For more information on how to create Time Series, see Create a Time Series in Data Sets (Time Series).
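
As a conceptual illustration only, the Python sketch below shows how historical samples could be summarised into one value per hour of the day, which is the kind of daily variation a time series captures. It does not reflect how the Gateway or Gateway Hub builds a time series internally. The function name hourly_profile and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(samples, aggregate=max):
    """Summarise (timestamp, value) samples into one value per hour of day."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp.hour].append(value)
    # One aggregated value for each hour that has at least one sample.
    return {hour: aggregate(values) for hour, values in sorted(buckets.items())}


# Hypothetical CPU readings taken from two days of history.
history = [
    (datetime(2024, 1, 1, 9, 0), 40.0),
    (datetime(2024, 1, 1, 9, 30), 55.0),
    (datetime(2024, 1, 1, 14, 0), 80.0),
    (datetime(2024, 1, 2, 9, 0), 50.0),
    (datetime(2024, 1, 2, 14, 0), 70.0),
]

max_by_hour = hourly_profile(history, aggregate=max)
mean_by_hour = hourly_profile(history, aggregate=mean)
```

Different aggregation functions applied to the same history give different profiles, for example a per-hour maximum and a per-hour mean.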

Set up Anomaly Detection

If you are not using Gateway Hub, you need to set up the time series database. For more information, see Database driven Time Series in Data Sets (Time Series).

If you are using Gateway Hub as a source of your data:

  1. Ensure that Gateway is publishing data to Gateway Hub. For more information, see Publishing Technical Reference.
    You can also configure the Gateway Hub data plug-in to see the status of Gateway publishing to Gateway Hub. For more information, see Gateway Plug-Ins.
  2. Make sure Gateway Hub SSO configuration is enabled. For more information, see SSO OpenAPI.
  3. Start Gateway with the Kerberos principal and keytab command line options. For more information, see Authenticating with Kerberos in Gateway Setup Files and Command line options in the Gateway Installation Guide.
  4. Create data sets and time series in Gateway. For more information, see Create a Time Series in Data Sets (Time Series).
  5. Create rules in Gateway that use your time series data. For more information, see Rules, Actions, and Alerts.

At this point, your Anomaly Detection rules are configured. Once sufficient data has been logged, you will start receiving alerts based on the setup.

Examples

Examples of using Anomaly Detection include:

  • A moving threshold during the day to deal with spikes of traffic at known intervals.
  • Ensuring that a value stays within a safe zone during the day.

Below, you can find examples of how to define rules depending on whether your data sets are database driven or loaded from Gateway Hub.

Rules based on data sets loaded from Gateway Hub

The example below shows a rule referring to a max aggregation of a time series defined as demo5. According to the rule logic, if the value of the rule target goes above the value of the time series at that point in time, then the severity is set to critical.

if value > timeseries "demo5.max" then
 severity critical
else
 severity ok
endif

The example below shows a rule referring to a stdev aggregation of a time series defined as demo5. According to the rule logic, if the value of the rule target goes above the value of the time series at that point in time, then the severity is set to critical.

if value > timeseries "demo5.stdev" then
 severity critical
else
 severity ok
endif

For more information on available statistical time series aggregations used in rules, see Type — Gateway Hub driven in Data Sets (Time Series).

Rules based on data sets loaded from the database

The example below shows a rule referring to a time series defined as maxCpu. According to the rule logic, if the value of the rule target goes above the value of the time series at that point in time, then the severity is set to critical.

if value > timeseries "maxCpu" then
  severity critical
else
  severity ok
endif

A time series is typically created using historical data pertaining to certain managed variables. In the above example, the time series maxCpu might have been created using historical data gathered on the rule target itself. In effect, the rule then compares the current behaviour of the value with its historical behaviour.
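
The comparison of a current value with its historical behaviour at the same point in time can be sketched in Python as follows. This is an illustration of the idea, not Gateway code. The function severity_for, the per-hour dictionary, and the "undefined" result for hours without history are assumptions made for this sketch.

```python
from datetime import datetime


def severity_for(value, when, max_by_hour):
    """Return 'critical' if value exceeds the historical maximum for that hour."""
    limit = max_by_hour.get(when.hour)
    if limit is None:
        # Assumption of this sketch: with no history for the hour,
        # there is nothing to compare against.
        return "undefined"
    return "critical" if value > limit else "ok"


# Hypothetical per-hour maxima derived from historical data.
max_cpu = {9: 55.0, 14: 80.0}

# The same reading is judged differently depending on the time of day.
morning = severity_for(60.0, datetime(2024, 1, 3, 9, 15), max_cpu)
afternoon = severity_for(60.0, datetime(2024, 1, 3, 14, 15), max_cpu)
```

Here the same value of 60.0 is compared against a different threshold in the morning than in the afternoon, which is what makes the threshold adaptive.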

A rule can refer to multiple time series, as in the following example:

if value > timeseries "cpuUpperLimit" or value < timeseries "cpuLowerLimit" then
  severity critical
elseif value > timeseries "cpuUpperWarn" or value < timeseries "cpuLowerWarn" then
  severity warning
else
  severity ok
endif

In the above example, high and low thresholds for both warning and critical severity are defined as time series. Typically, such time series are generated by running different functions on the same historical data.
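
One way to picture several thresholds being derived from the same historical data is the Python sketch below. It is a simplified assumption, not the method Geneos uses: it computes a single set of bands from the mean and standard deviation, whereas a real time series varies over time. The functions build_bands and classify and the multipliers warn_k and crit_k are hypothetical. The dictionary keys reuse the time series names from the rule above.

```python
from statistics import mean, stdev


def build_bands(values, warn_k=2.0, crit_k=3.0):
    """Derive warning and critical bands from one set of historical values."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation, needs >= 2 values
    return {
        "cpuUpperLimit": centre + crit_k * spread,
        "cpuLowerLimit": centre - crit_k * spread,
        "cpuUpperWarn": centre + warn_k * spread,
        "cpuLowerWarn": centre - warn_k * spread,
    }


def classify(value, bands):
    """Apply the same two-tier logic as the rule: critical, warning, or ok."""
    if value > bands["cpuUpperLimit"] or value < bands["cpuLowerLimit"]:
        return "critical"
    if value > bands["cpuUpperWarn"] or value < bands["cpuLowerWarn"]:
        return "warning"
    return "ok"


# Hypothetical historical CPU readings.
bands = build_bands([47.0, 50.0, 53.0])
results = [classify(v, bands) for v in (50.0, 57.0, 40.0)]
```

The critical check comes first so that a value outside the outer band is never reported as only a warning.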

See Create a Time Series in Data Sets (Time Series) for more information.

Troubleshooting

Problem accessing the data set

If your Gateway cannot access the data set generated by Gateway Hub, check the connectivity between Gateway and Gateway Hub and look for any errors. This includes errors where Gateway is unable to get SSO tokens from Gateway Hub. See SSO OpenAPI for more information.

Alerts are not generated

Alerts can only be generated if historical data exists. Make sure that the data you are summarising in the data set has enough history to cover the data period selected in the setup.
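
If you can export the timestamps of your historical samples, a quick check along the following lines can show whether the history spans the selected data period. This is a hypothetical Python helper written for illustration. It is not part of Geneos, and the name covers_period and its parameters are assumptions.

```python
from datetime import datetime, timedelta


def covers_period(timestamps, period, now):
    """Return True if the oldest sample is at least `period` older than `now`."""
    if not timestamps:
        return False
    return now - min(timestamps) >= period


# Hypothetical sample timestamps and a seven-day data period.
now = datetime(2024, 1, 10, 12, 0)
stamps = [datetime(2024, 1, 5, 12, 0), datetime(2024, 1, 9, 12, 0)]

enough_for_week = covers_period(stamps, timedelta(days=7), now)
enough_for_three_days = covers_period(stamps, timedelta(days=3), now)
```

The check only looks at the age of the oldest sample. It does not detect gaps inside the period.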