Elasticsearch

Overview

Elasticsearch monitoring is a Gateway configuration file that enables monitoring of an Elasticsearch cluster through the Toolkit plug-in.

Elasticsearch is a distributed search and analytics engine that scales horizontally: you can add more nodes to the cluster to increase capacity. This allows it to search and analyze large volumes of data.

The elements that make Elasticsearch work are defined as follows:

A node is a running instance of Elasticsearch that knows the location of documents in the cluster.
A cluster consists of one or more nodes with the same cluster name that share their data and load.

Track the following key areas when using Elasticsearch monitoring:

Key Area	Description
Search performance	Determine how the search function performs over time by monitoring query operations, load or latency, the field data cache, and evictions.
Indexing performance	Each shard in the index can be updated through the flush and refresh processes. A shard is a container for data and can be either a primary or a replica shard. It is how Elasticsearch distributes data across the cluster. Index refresh - creates a new in-memory segment, making newly indexed documents searchable. Index flush - new documents are added to the in-memory buffer, the segments are committed, and the transaction log is cleared.
Cluster health and node availability	Monitors the current state of all clusters and nodes.
Resource utilisation	Provides information on thread pool queues and rejections for operations such as bulk, index, and merge.
System and network metrics	Shows information about every node in the cluster, resource and memory usage, and active connections opened over time.

In this Elasticsearch monitoring template, you will see these metrics in your dataview:

Cluster health
Indexing performance
Search performance
Node and resource information
Thread pool

Intended audience

This guide is intended for users who are setting up, configuring, troubleshooting, and maintaining this integration. It is also intended for users who will use Active Console to monitor data from Elasticsearch. Once the integration is set up, the samplers providing the dataviews become available to that Gateway.

As a user, you should be familiar with Linux or any other operating system, and with the administration of the Elasticsearch services.

Prerequisites

The following requirements must be met before the installation and setup of the template:

The machine running the Netprobe must have access to the host where the Elasticsearch instance is installed and to the port that Elasticsearch is listening on.
Netprobe 4.6 or higher.
Gateway 4.8 or higher.
Python 2.7 or 3.6 installed on the machine where the Netprobe resides.
Elasticsearch 6.1.2.
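
The connectivity requirement listed in the prerequisites can be verified from the Netprobe machine before installing the template. The following is a minimal sketch, assuming that a plain TCP connection test is sufficient for your environment. The function name can_connect is hypothetical and is not part of the integration package.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
    sock.close()
    return True
```

For example, can_connect("localhost", 9200) checks the default host and port used by the template. A True result only shows that the port is reachable; it does not confirm that the service behind it is Elasticsearch.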

Installation procedure

Ensure that you have read and met the prerequisites before installing and setting up this integration template.

Download the integration package geneos-integration-elasticsearch-<version>.zip from the ITRS Downloads site.
Open Gateway Setup Editor.
In the Navigation panel, click Includes to create a new file.
Enter the location of the file to include in the Location field. In this example, it is include/ElasticsearchMonitoring.xml.
Update the Priority field. This can be any value except 1. If you input a priority of 1, the Gateway Setup Editor returns an error.
Expand the file location in the Include section.
Select Click to load.
Click Yes to load the new Elasticsearch include file.
Click Managed entities in the Navigation panel.
Add the Elasticsearch type to the Managed Entity section that you will use to monitor Elasticsearch.
Click Validate current document to check your configuration.
Click Save current document to apply the changes.

Set up the samplers

These are the pre-configured samplers available to use in include/ElasticsearchMonitoring.xml.

Configure the required fields by referring to the table below:

Samplers
Elasticsearch-ClusterHealth
Elasticsearch-ThreadPool
Elasticsearch-Resource
Elasticsearch-NodeInfo
Elasticsearch-SearchPerf-ByIndex
Elasticsearch-SearchPerf-ByNode
Elasticsearch-IndexingPerf-ByIndex
Elasticsearch-IndexingPerf-ByNode

Set up the variables

The include/ElasticsearchMonitoring.xml template provides the following variables that are set in the Environments section:

Variable	Description
ELASTICSEARCHMON_GROUP	Sampler group name. Default: Elasticsearch-Monitoring
ELASTICSEARCHMON_HOST	IP/Hostname of the Elasticsearch Node. Default: localhost
ELASTICSEARCHMON_PORT	Port assigned to the Elasticsearch HTTP service. Default: 9200
ELASTICSEARCHMON_PYTHON_EXE	Name of the executable script that calls the Python code.
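
To illustrate how the host and port values combine into a request, the sketch below builds an Elasticsearch REST URL and fetches its JSON response. It assumes plain HTTP without authentication, which may not match your deployment. The function names endpoint and fetch_json are hypothetical and do not describe how the packaged scripts are written.

```python
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7


def endpoint(host="localhost", port=9200, path=""):
    """Build a URL from the ELASTICSEARCHMON_HOST and ELASTICSEARCHMON_PORT values."""
    # Assumption: the node is served over plain HTTP.
    return "http://{0}:{1}/{2}".format(host, port, path.lstrip("/"))


def fetch_json(url, timeout=5):
    """Fetch a URL and decode the response body as JSON."""
    response = urlopen(url, timeout=timeout)
    try:
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()
```

With the default values, endpoint(path="_cluster/health") produces a URL for the cluster health API on the local node, and passing that URL to fetch_json returns the response as a dictionary.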

Set up the rules

The ElasticsearchMonitoring-SampleRules.xml template also provides separate sample rules that you can load in the Gateway Setup Editor.

Your configuration rules must be set in the Includes section. In the Navigation panel, click Rules.

The table below shows the included rule setup in the configuration file:

Rules	Sample Rules
Resource	Elasticsearch-Diskspace
	Elasticsearch-FileDesc
	Elasticsearch-Cpu
ClusterHealth	Elasticsearch-ClusterStatus
Indexing	Elasticsearch-IndexingLatency
	Elasticsearch-RefreshLatency
	Elasticsearch-FlushLatency
Search	Elasticsearch-QueryLatency
	Elasticsearch-FetchLatency

Metrics and dataviews

Elasticsearch cluster health

This monitors the overall health of the cluster by indicating how it is functioning:

Column Name	Description
cluster	Name of the cluster.
status	Health status of the cluster: Green - all primary and replica shards are active. Yellow - at least one replica shard is missing or not properly allocated. Red - at least one primary shard is missing, which can cause data loss.
nodeTotal	Total number of nodes in the cluster.
nodeData	Total number of nodes in the cluster that can store data.
shardsTotal	Total number of shards.
shardsInitializing	Number of initializing shards.
shardsUnassigned	Number of unassigned shards.
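
As an illustration of where these columns could come from, the sketch below maps a response from the Elasticsearch cluster health API (GET _cluster/health) onto the column names above. The mapping is an assumption, in particular the use of active_shards for shardsTotal, and the packaged sampler may derive its values differently. The severity names are hypothetical and are not taken from the sample rules.

```python
# Hypothetical severity labels for each status; not taken from the sample rules.
STATUS_SEVERITY = {"green": "ok", "yellow": "warning", "red": "critical"}


def cluster_health_row(health):
    """Map a _cluster/health response (dict) to the dataview column names."""
    return {
        "cluster": health["cluster_name"],
        "status": health["status"],
        "nodeTotal": health["number_of_nodes"],
        "nodeData": health["number_of_data_nodes"],
        # Assumption: shardsTotal corresponds to active_shards.
        "shardsTotal": health["active_shards"],
        "shardsInitializing": health["initializing_shards"],
        "shardsUnassigned": health["unassigned_shards"],
    }


def status_severity(status):
    """Translate a cluster status into a hypothetical severity label."""
    return STATUS_SEVERITY.get(status.lower(), "undefined")


sample = {
    "cluster_name": "testcluster",
    "status": "yellow",
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    "active_shards": 1,
    "initializing_shards": 0,
    "unassigned_shards": 1,
}
row = cluster_health_row(sample)
print(row["status"], status_severity(row["status"]))
```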

Elasticsearch indexingPerf-ByIndex

This dataview monitors indexing performance by index. Data is grouped per index:

Column Name	Description
index	Name of the index.
indexingIndexTotal	Total number of indexing operations.
indexingIndexTime	Time spent in indexing. Unit: millisecond (ms)
indexingIndexCurrent	Number of current indexing operations.
refreshTotal	Total number of refreshes.
refreshTime	Time spent in refresh operations. Unit: millisecond (ms)
flushTotal	Total number of flushes.
flushTotalTime	Time spent in flushes. Unit: millisecond (ms)
averageIndexingLatency	Average time spent in indexing. This is computed from indexingIndexTime / indexingIndexTotal. Unit: millisecond (ms) per indexing operation
averageRefreshLatency	Average time spent in refresh operations. This is computed from refreshTime / refreshTotal. Unit: millisecond (ms) per refresh
averageFlushLatency	Average time spent in flush operations. This is computed from flushTotalTime / flushTotal. Unit: millisecond (ms) per flush
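
The three average latency columns follow the same pattern: a time total divided by an operation count. The sketch below reproduces that arithmetic for a single row. Returning None when no operations have been recorded is an assumption made here to avoid division by zero; the value the dataview shows in that case is not documented in this guide. The function names are hypothetical.

```python
def average_latency(total_time_ms, total_ops):
    """Return the average time per operation in milliseconds, or None if there were no operations."""
    if total_ops == 0:
        return None
    return float(total_time_ms) / total_ops


def indexing_latencies(row):
    """Compute the three average latency columns from the raw columns of one row."""
    return {
        "averageIndexingLatency": average_latency(row["indexingIndexTime"], row["indexingIndexTotal"]),
        "averageRefreshLatency": average_latency(row["refreshTime"], row["refreshTotal"]),
        "averageFlushLatency": average_latency(row["flushTotalTime"], row["flushTotal"]),
    }


example_row = {
    "indexingIndexTime": 500, "indexingIndexTotal": 100,
    "refreshTime": 90, "refreshTotal": 30,
    "flushTotalTime": 0, "flushTotal": 0,
}
print(indexing_latencies(example_row))
```

In this example, indexing averages 5.0 ms per operation, refresh averages 3.0 ms per refresh, and the flush latency is undefined because no flush has occurred.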

Elasticsearch indexingPerf-ByNode

This monitors indexing performance by node. Data is grouped per node:

Column Name	Description
nodeID	Unique node ID.
name	Name of the node.
indexingIndexTotal	Total number of indexing operations.
indexingIndexTime	Time spent in indexing. Unit: millisecond (ms)
indexingIndexCurrent	Number of current indexing operations.
refreshTotal	Total number of refreshes.
refreshTime	Time spent in refresh operations. Unit: millisecond (ms)
flushTotal	Total number of flushes.
flushTotalTime	Time spent in flushes. Unit: millisecond (ms)
averageIndexingLatency	Average time spent in indexing. This is computed from indexingIndexTime / indexingIndexTotal. Unit: millisecond (ms) per indexing operation
averageRefreshLatency	Average time spent in refresh operations. This is computed from refreshTime / refreshTotal. Unit: millisecond (ms) per refresh
averageFlushLatency	Average time spent in flush operations. This is computed from flushTotalTime / flushTotal. Unit: millisecond (ms) per flush

Elasticsearch nodeInfo

This displays information about the nodes in the cluster:

Column Name	Description
nodeID	Unique node ID.
name	Name of the node.
IP	IP address.
port	Bound transport port.
http	Bound HTTP address and port.
version	Elasticsearch version.
build	Elasticsearch build hash.
jdk	JDK version.
nodeRole	Role of the node. This can have more than one value: m - master eligible node. d - data node. i - ingest node.
master	Current master node in the cluster: * (asterisk) - current master. - (hyphen) - non-master.
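
Because nodeRole can combine several single-letter flags, it can help to expand the value when reading the dataview from a script. The sketch below decodes the flags listed above and interprets the master column. It assumes the flags appear as one string such as mdi; the function names are hypothetical.

```python
# Role flags as described in the nodeRole column.
ROLE_NAMES = {
    "m": "master eligible node",
    "d": "data node",
    "i": "ingest node",
}


def describe_roles(node_role):
    """Expand a nodeRole value such as 'mdi' into a list of role descriptions."""
    return [
        ROLE_NAMES.get(flag, "unknown role '{0}'".format(flag))
        for flag in node_role
    ]


def is_current_master(master):
    """Return True if the master column marks the node as the current master."""
    return master == "*"


print(describe_roles("mdi"))
print(is_current_master("-"))
```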

Elasticsearch resource

This monitors the resources of each node in the cluster:

Column Name	Description
nodeID	Unique node ID.
name	Name of the node.
cpu	CPU usage in percentage (%).
heapCurrent	Current heap usage. Unit: bytes
heapPercent	Percent used heap.
ramCurrent	Current RAM usage. Unit: bytes
ramPercent	Percent RAM used.
diskUsed	Used disk space. Unit: bytes
diskAvail	Available disk space.
diskUsedPercent	Percent disk used.
fileDescriptorCurrent	Number of used file descriptors.
fileDescriptorPercent	Percent file descriptors used.

Elasticsearch SearchPerf-ByIndex

This monitors search performance by index. Data is grouped per index:

Column Name	Description
index	Name of the index.
searchQueryTotal	Number of query phase operations.
searchQueryTime	Time spent in query phase. Unit: millisecond (ms)
searchQueryCurrent	Number of current query phase operations.
searchFetchTotal	Number of fetch phase operations.
searchFetchTime	Time spent in fetch phase. Unit: millisecond (ms)
searchFetchCurrent	Number of current fetch phase operations.
fielddataMemory	Used fielddata cache.
fielddataEvictions	Number of fielddata cache evictions.
averageQueryLatency	Average time spent in query phase. This is computed from searchQueryTime / searchQueryTotal. Unit: millisecond (ms) per query
averageFetchLatency	Average time spent in fetch phase. This is computed from searchFetchTime / searchFetchTotal. Unit: millisecond (ms) per fetch
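
The averages in this dataview divide a time total by an operation total, so they describe the whole period covered by those totals. If the totals are cumulative counters, which is assumed here, a latency figure for a recent interval can be derived from two successive samples of the same row. The following sketch is illustrative only; the function name interval_latency is hypothetical and this calculation is not part of the template.

```python
def interval_latency(previous, current, time_key, total_key):
    """Average time per operation between two samples, or None if no new operations occurred."""
    new_ops = current[total_key] - previous[total_key]
    if new_ops <= 0:
        # No new operations, or the counters were reset between samples.
        return None
    return float(current[time_key] - previous[time_key]) / new_ops


earlier = {"searchQueryTime": 1000, "searchQueryTotal": 200}
later = {"searchQueryTime": 1600, "searchQueryTotal": 300}
print(interval_latency(earlier, later, "searchQueryTime", "searchQueryTotal"))
```

In this example, 100 new queries took 600 ms between the two samples, giving 6.0 ms per query for the interval.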

Elasticsearch searchPerf-ByNode

This monitors search performance by node. Data is grouped per node:

Column Name	Description
nodeID	Unique node ID.
name	Name assigned to the node.
searchQueryTotal	Number of query phase operations.
searchQueryTime	Time spent in query phase. Unit: millisecond (ms)
searchQueryCurrent	Number of current query phase operations.
searchFetchTotal	Number of fetch phase operations.
searchFetchTime	Time spent in fetch phase. Unit: millisecond (ms)
searchFetchCurrent	Number of current fetch phase operations.
fielddataMemory	Used fielddata cache.
fielddataEvictions	Number of fielddata cache evictions.
averageQueryLatency	Average time spent in query phase. This is computed from searchQueryTime / searchQueryTotal. Unit: millisecond (ms) per query
averageFetchLatency	Average time spent in fetch phase. This is computed from searchFetchTime / searchFetchTotal. Unit: millisecond (ms) per fetch

Elasticsearch ThreadPool

This monitors the bulk, index, and search thread pools of each node in the cluster:

Column Name	Description
node_id/name	Node ID/Thread Pool Name.
node_name	Name of the node.
name	Thread Pool name.
type	Thread Pool Type.
active	Number of active threads.
queue	Number of tasks currently in queue.
rejected	Number of rejected tasks.
size	Number of threads.
queue_size	Size of the queue that holds pending requests when no threads are available to execute them.
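
The queue, queue_size, and rejected columns are typically read together: a queue that is close to its size limit, or a non-zero rejected count, suggests that a thread pool is under pressure. The sketch below derives both indicators from one row. It assumes the values are numeric (or numeric strings) and treats a queue_size of zero or less as an unbounded queue. The function name pool_pressure is hypothetical.

```python
def pool_pressure(row):
    """Summarize queue usage and rejections for one thread pool row."""
    queue = int(row["queue"])
    queue_size = int(row["queue_size"])
    rejected = int(row["rejected"])
    if queue_size > 0:
        queue_used_percent = 100.0 * queue / queue_size
    else:
        # Assumption: a non-positive queue_size means the queue has no fixed limit.
        queue_used_percent = None
    return {
        "has_rejections": rejected > 0,
        "queue_used_percent": queue_used_percent,
    }


bulk_pool = {"name": "bulk", "queue": "50", "queue_size": "200", "rejected": "3"}
print(pool_pressure(bulk_pool))
```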
