Data Sets (Time series)

Overview

Data sets are containers for the different kinds of data collected by the Gateway. Currently, time series is the only type available.

You can then use time series data sets with the Anomaly Detection and Breach predictor functionality.

Create a Time Series

To create a Time Series:

  1. Select Data sets from the navigation tree in the GSE.
  2. Click New Time series.
  3. Select the type of the time series. The type specifies whether the time series is database driven or gathers data from Gateway Hub.
  4. Provide all required information. See Configuration reference for more information about the available options.
  5. Validate the setup, then click Save current document to save the changes.

Gateway Hub driven Time series

Geneos can use data sets generated by Gateway Hub. The specification for these data sets is set up using GSE, and the Gateway automatically manages their generation and retrieval.

For a typical configuration the following Gateway command line options should be defined:

To use this method, select Gateway Hub driven as Type when setting up time series. For more information, see Type — Gateway Hub driven.

Database driven Time Series

The time series can be set up by a process external to the Gateway and stored in two tables in the database used for database logging. The tables are:

Table name Description
time_series_user_table

Stores a set of names and unique IDs. The names are used to map the names of the time series defined in the Gateway setup to the IDs used in the time_series_data_user_table.

There are two values per row:

  • name — corresponds to the data set name defined in the Gateway and is unique.
  • time_series_id — a unique ID used in the time_series_data_user_table.
time_series_data_user_table

Stores the time series data.

There are three values per row:

  • time_series_id — ID used to link the data back to the name in the time_series_user_table.
  • start_time — start time of a point in the time series data, in seconds since the start of the day.
  • value — value of the time series at start_time.

The schema for these tables is available in the Gateway resources directory provided as part of the Gateway bundle. The data is read from the database at Gateway start time and at the reload time defined in the setup.

It is up to an external process to maintain and update the time series tables. This can be controlled using the Gateway scheduled command.
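
As an illustration of what such an external process might look like, the following Python sketch replaces the points of one named time series. It is a minimal sketch under several assumptions: an in-memory SQLite database stands in for the database used for database logging, the column types are guesses (the authoritative schema is the one shipped in the Gateway resources directory), and the series name cpuTypicalDay and the helper names are hypothetical.

```python
import sqlite3
from datetime import time


def seconds_since_midnight(t: time) -> int:
    """Convert a time of day to seconds since the start of the day."""
    return t.hour * 3600 + t.minute * 60 + t.second


def load_series(conn, name, points):
    """Replace the stored points of one named time series.

    points is a list of (time of day, value) pairs.
    Returns the time_series_id used for the series.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT time_series_id FROM time_series_user_table WHERE name = ?",
        (name,),
    )
    row = cur.fetchone()
    if row is None:
        # Unknown name: allocate the next free ID and register the name.
        cur.execute(
            "SELECT COALESCE(MAX(time_series_id), 0) + 1 FROM time_series_user_table"
        )
        series_id = cur.fetchone()[0]
        cur.execute(
            "INSERT INTO time_series_user_table (name, time_series_id) VALUES (?, ?)",
            (name, series_id),
        )
    else:
        series_id = row[0]
    # Drop the old points, then insert the new ones.
    cur.execute(
        "DELETE FROM time_series_data_user_table WHERE time_series_id = ?",
        (series_id,),
    )
    cur.executemany(
        "INSERT INTO time_series_data_user_table (time_series_id, start_time, value) "
        "VALUES (?, ?, ?)",
        [(series_id, seconds_since_midnight(t), v) for t, v in points],
    )
    conn.commit()
    return series_id


# Stand-in database with ASSUMED column types; use the shipped schema in practice.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE time_series_user_table (
        name TEXT UNIQUE, time_series_id INTEGER UNIQUE);
    CREATE TABLE time_series_data_user_table (
        time_series_id INTEGER, start_time INTEGER, value REAL);
    """
)

series_id = load_series(
    conn,
    "cpuTypicalDay",  # hypothetical data set name defined in the Gateway setup
    [(time(0, 0), 12.5), (time(9, 30), 48.0), (time(18, 0), 30.25)],
)
rows = conn.execute(
    "SELECT start_time, value FROM time_series_data_user_table "
    "WHERE time_series_id = ? ORDER BY start_time",
    (series_id,),
).fetchall()
print(rows)  # [(0, 12.5), (34200, 48.0), (64800, 30.25)]
```

Because the Gateway reads the tables only at start time and at the configured reload time, a script like this would normally be scheduled to finish before the reload time.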

To use this method, select databaseDriven as Type when setting up the time series. For more information, see Configuration reference.

Prerequisites

Before you configure a database driven time series, set up the database tables:

  1. Configure database logging. For more information, see MySQL configuration in Gateway Database Logging.
  2. Ensure that the tables defined in <gateway directory>/resources/database/<database type>/time-series-schema-1.0.sql exist.
  3. Insert data into those tables.

Configuration reference

Setting Description Mandatory
Time series

Time series model a day's worth of data uploaded from the database.

No

Name

Specifies the name that identifies the time series. If the time series is database driven, the name must match a name stored in the time_series_user_table.

Yes

Description

Specifies additional information about the time series. You can enter multi-line comments in the description field.

No

External

Specifies how the external data access is managed.

 
External > Reload time

Specifies the time of day at which the data is reloaded from the database or Gateway Hub each day.

No (defaults to the time at which the time series was created).

Type

Specifies whether the time series is database driven or gathers data from Gateway Hub. The two options are Gateway Hub driven and database driven.

No


Type — Gateway Hub driven

This section describes the configuration options available when the time series is generated from Gateway Hub.

Algorithm — Seasonal-quick

You can specify the algorithm to use to generate the dataset. There is currently one algorithm available, Seasonal-quick:

Setting Description
Entity query

Specifies the entity query using the entity filter syntax. For example, user.COMPONENT=EMS finds all entities with the attribute COMPONENT=EMS.

For more information on how to use the entity filter syntax, see Entity Filter Syntax.

Note: In GSE you should only use quotes. Do not escape the quotes with backslashes because the JSON formatter in Gateway does that for you.

Metrics

Data you want to query. This is a sequence of raw metric names to be included in the resulting metric time series.

Example: /cpu/cpu/%userTime

For more information about retrieving metric data, see Metric Query Example.

Note: In GSE you should only use quotes. Do not escape the quotes with backslashes because the JSON formatter in Gateway does that for you.
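
The effect described in the notes on quoting can be reproduced with any JSON serializer. The sketch below uses Python's json module purely as a stand-in for the Gateway's JSON formatter; the query text containing quotes and the "query" key are hypothetical and are not taken from the entity filter syntax documentation.

```python
import json

# Hypothetical query text containing quotes, typed into GSE as-is.
typed_in_gse = 'user.COMPONENT="EMS"'
# The same text with the quotes escaped by hand, which the note advises against.
hand_escaped = 'user.COMPONENT=\\"EMS\\"'

once = json.dumps({"query": typed_in_gse})
twice = json.dumps({"query": hand_escaped})

print(once)   # {"query": "user.COMPONENT=\"EMS\""}
print(twice)  # {"query": "user.COMPONENT=\\\"EMS\\\""}

# Decoding the hand-escaped version leaves stray backslashes in the query.
print(json.loads(twice)["query"])  # user.COMPONENT=\"EMS\"
```

The serializer escapes each quote exactly once, so text that is already escaped ends up carrying literal backslashes after decoding.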

Aggregations

Aggregations are calculations you wish to perform on the data. Select from the following aggregations:

  • count
  • sum
  • min
  • max
  • stddev
  • avg
  • var
Granularity

The granularity setting determines the size of the time series buckets, that is, the length of the interval that each value in the time series represents.

You can choose from the drop-down menu, or enter a number followed by a length of time. The options in the drop-down menu are:

  • 1 minute
  • 5 minutes
  • 15 minutes
  • 1 hour
  • 3 hours
  • 12 hours
  • 1 day
Period

The length of the cycle that defines seasonality, that is, the period over which you expect the values to repeat.

The options are:

  • Day
  • Week
Periods

Number of periods. It allows you to say how many recent periods (days or weeks) you want the data to be based on.

Period settings example

Here's an example of how to use period settings (granularity, period, and periods):

Suppose you want to write a rule that compares the current value of an item with a typical value for this time of day, based on data from the last 60 days.

  • Granularity allows you to choose how long an interval each value in the time series represents. In this case, you should choose 5 minutes or 15 minutes.
  • Period allows you to choose the period over which you expect the values to repeat: does your time series represent a typical day or a typical week? In this use case, you want to choose a period of day. If the value you are monitoring varies a lot between working and non-working days, and you have enough historical data available, you should choose week.
  • Periods allows you to specify how many recent days or weeks the data is based on. In this use case, you want 60 days, so set this parameter to 60.
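
To make the interaction of the three settings concrete, the following Python sketch builds a "typical day" profile from raw samples. It is an illustration only and does not claim to reproduce the Seasonal-quick algorithm used by Gateway Hub; the function names, the anchor date, and the sample data are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Reference instant used to align buckets: a Monday at midnight, so that
# both day and week periods start on a natural boundary.
ANCHOR = datetime(1970, 1, 5)


def buckets_per_period(granularity, period):
    """Number of values in one period at the given granularity."""
    return period // granularity


def seasonal_profile(samples, granularity, period, periods, now, aggregate=mean):
    """Aggregate samples from the last `periods` periods into buckets.

    samples is an iterable of (datetime, value) pairs. The result maps a
    bucket index within the period to the aggregated value for that bucket.
    """
    window_start = now - periods * period
    grouped = defaultdict(list)
    for timestamp, value in samples:
        if window_start <= timestamp < now:
            # Position of the sample within its period, then its bucket.
            position = (timestamp - ANCHOR) % period
            grouped[position // granularity].append(value)
    return {bucket: aggregate(values) for bucket, values in sorted(grouped.items())}


granularity = timedelta(minutes=5)
period = timedelta(days=1)
now = datetime(2024, 3, 4)
samples = [
    (datetime(2024, 3, 1, 9, 2), 10.0),
    (datetime(2024, 3, 2, 9, 3), 20.0),
    (datetime(2024, 3, 3, 9, 4), 30.0),
    (datetime(2023, 1, 1, 9, 0), 1000.0),  # older than 60 days, ignored
]

profile = seasonal_profile(samples, granularity, period, 60, now)
print(buckets_per_period(granularity, period))  # 288
print(profile)  # {108: 20.0}
```

With a granularity of 5 minutes and a period of a day, each period holds 288 values, and the three samples taken shortly after 09:00 fall into bucket 108. Passing aggregate=max or aggregate=min instead of the default mean mirrors choosing a different aggregation.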

Time to live

This specifies the length of time the time series is valid for. The options are:

  • 1 hour
  • 3 hours
  • 1 day
  • 1 week

When the data is passed from Gateway to Gateway Hub, the default value is 1 day.

The data set is automatically deleted after the time to live value expires.