AWS
Overview
The AWS plugin is a Collection Agent plugin that gathers metrics through AWS CloudWatch. It also provides an API Destination that can interact with AWS services such as EventBridge and SNS. The plugin improves Geneos cloud monitoring by providing an easier-to-use, more scalable interface to AWS CloudWatch for monitoring the services deployed in AWS.
In addition, the AWS plugin allows you to:
- Combine metrics from AWS with on-premise, multi-cloud, and hybrid environments for end-to-end visibility.
- Monitor real-time alerts using rule and alert capabilities.
- Minimise cost by using a single enterprise tool for monitoring.
Monitored services, logs, and events
You can use the AWS plugin to monitor different services by using the following collectors in the Collection Agent YAML file:
- AwsCollector: for the list of collected services, see AWS Plugin services.
- AwsBillingCollector: for billing services, including a breakdown of estimated charges by service. See AWS/Billing in Plugin services.
- ApiDestinationCollector: for receiving real-time data from the Amazon Events, Logs, and Alarms services. Logs and events that are reported to Netprobe result in stream messages that can be monitored by the FKM plugin of Netprobe. See AWS API destination collector.
Deployment recommendations
Launch ITRS Geneos EC2 instance in AWS Marketplace
Use the Plugin in AWS Marketplace deployment option for the following reasons:
- To deploy the AWS plugin through the AWS Marketplace with built-in dynamic entity mapping to minimise Gateway configuration.
- To deploy the AWS plugin through the marketplace with minimal configuration.
Configure Geneos to deploy the AWS plugin
Use the Configure Geneos to deploy AWS CloudWatch deployment option in the following cases:
- When you require a native deployment option where you configure the Gateway and Netprobe in Geneos on your local machine.
Prerequisites
Geneos environment
The AWS Collection Agent plugin requires the following versions of Geneos components:
- Gateway and Netprobe 5.11.x or higher. The same version must be used for the GSE schema.
- Collection Agent 2.x or higher. To run a Collection Agent, see Collection Agent setup.
The AWS binaries are packaged with Netprobe and are stored in the collection_agent folder. Alternatively, you can download separate binaries for the AWS plugin from ITRS Downloads.
AWS environment
The AWS plugin requires valid AWS credentials, such as an Access Key ID and a Secret Access Key. To specify your AWS credentials on your machine, see the Setting the default credentials page.
To see the required permissions for some of the monitored services, see Required AWS plugin permissions in Plugin services.
CloudWatch API usage
Because the AWS plugin interacts with AWS CloudWatch through Amazon-provided APIs, you should be aware of the CloudWatch service quotas. If the plugin exceeds them, you might encounter an error similar to this:
2021-12-02 14:05:11.833 [EC2Service-Processor-0] ERROR com.itrsgroup.collection.plugins.aws.AwsCollector(awsSG) - CloudwatchMetricDataSource Get Metrics Error: Rate exceeded (Service: CloudWatch, Status Code: 400, Request ID: 31d9a57e-c6c5-46fd-94da-7fe62e74010c, Extended Request ID: null)
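To notice throttling early, the Collection Agent log can be scanned for this error. The following is a minimal sketch that assumes the log line layout shown in the sample above; the function name `find_throttling_errors` is illustrative and is not part of the plugin.

```python
import re

# Pattern derived from the single sample log line above.
# Other log layouts are not covered by this sketch.
THROTTLE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>[^\]]+)\] ERROR .*Rate exceeded "
    r"\(Service: (?P<service>\w+), Status Code: (?P<status>\d+)"
)


def find_throttling_errors(log_lines):
    """Return one dict per log line that reports a 'Rate exceeded' error."""
    hits = []
    for line in log_lines:
        match = THROTTLE_PATTERN.search(line)
        if match:
            hits.append(match.groupdict())
    return hits
```

A rising count of such lines may indicate that the collection interval or the number of monitored services should be reviewed against your CloudWatch quotas.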
CloudWatch query time windows
The values obtained from AWS are averaged over the last complete 5-minute window. This is because CloudWatch makes the data available with a 5-minute latency.
For example, if the collection time is 2021-07-04T03:31:12.34Z, then the time window used to query the data in CloudWatch is:
- StartTime: 2021-07-04T03:25:00.00Z
- EndTime: 2021-07-04T03:30:00.00Z
When the collection interval is less than 5 minutes, the AWS plugin queries CloudWatch using the adjusted time window at first. For succeeding samples where the adjusted time window is the same as the previous window, no query is made, so the plugin generates no data.
For example, when collection-interval is set to 1 minute, the AWS plugin queries CloudWatch at first, and then returns no data until it reaches the next complete 5-minute window (for example, after 5 minutes).
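The window arithmetic described above can be sketched as follows. This is an illustration of the documented behaviour under the assumption that windows are aligned to 5-minute boundaries in UTC; it is not the plugin's actual implementation, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_complete_window(collection_time, window=WINDOW):
    """Return (start, end) of the last complete window at collection_time."""
    # Round the collection time down to the nearest window boundary.
    end = EPOCH + ((collection_time - EPOCH) // window) * window
    return end - window, end


def windows_to_query(collection_times):
    """Yield the window for each sample, or None when it repeats the previous one."""
    previous = None
    for collection_time in collection_times:
        current = last_complete_window(collection_time)
        yield current if current != previous else None
        previous = current
```

With a 1-minute collection interval starting at 03:31:12, `windows_to_query` yields a window for the first sample, `None` for the samples at 03:32, 03:33, and 03:34, and a new window once the sample at 03:35:12 crosses the next boundary.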
Configure Geneos to deploy the AWS plugin
The AWS plugin supports Collection Agent publication into Geneos using dynamic managed entities. Setting up this plugin in Geneos involves these primary steps:
- Configure the Collection Agent YAML file.
- Configure dynamic mappings on the Netprobe that receives the data from the Collection Agent to show the metrics and events in Geneos. Use the mappings included in the Gateway package, templates/aws_mapping.xml, which can be added as an include file in your Gateway. This mapping template also includes sample FKM streams that you can customise to handle your intended Events, Logs, and Alarms.
To set up the AWS plugin in Geneos:
- Edit the collection-agent.yml file on your local machine where the binaries are stored.
Collectors | Description |
---|---|
AwsCollector | Enables AWS collector configuration. To add more AWS services to monitor, you can add them in |
AwsBillingCollector | Enables AWS Billing collector configuration. See AWS/Billing. |
ApiDestinationCollector | Enables AWS API destination configuration to monitor the following AWS logs and events: |
- Add aws_mapping.xml as an include file in your Gateway.
- Click Save current document.
    # Publish SDK usage metrics
    sdk-metrics: true
    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    aws-url: "http://localhost:4566"
  # AWS custom metric collector configuration
  - name: awscustom
    type: plugin
    class-name: AwsCustomNamespaceCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collection-interval: 300000
    # AWS regions from which metrics will be collected
    regions:
      - ap-southeast-1
      - ap-southeast-2
    # Publish SDK usage metrics
    sdk-metrics: true
    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    aws-url: "http://localhost:4566"
    # List of custom namespaces to collect metrics from
    custom-monitored-namespaces:
      - namespace: ECS/ContainerInsights
        # List of metrics to collect from the custom namespace (optional, defaults to collect all available metrics if not specified)
        custom-monitored-metrics:
          # Regex pattern to match the metrics to collect from the namespace
          - name-includes: Network*
            # Length of time used to aggregate the metric (optional, defaults to 300 seconds)
            period: 300
            # Aggregation to be used (optional, defaults to "average")
            statistic: average
  # AWS Billing collector configuration
  - name: aws-billing
    type: plugin
    class-name: AwsBillingCollector
    # Interval (in millis) between collections (optional, defaults to 6 hours).
    collection-interval: 21600000
    # Publish SDK usage metrics
    sdk-metrics: true
  # AWS SDK Metrics collector configuration
  # Publishes the aggregated SDK Usage Metrics from the last 5-minute window
  - name: aws-sdk-metrics
    type: plugin
    class-name: AwsSdkUsageMetricsCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collection-interval: 300000
  # AWS API destination configuration
  - name: apidestination
    type: plugin
    class-name: ApiDestinationCollector
    # AWS region from which alarm notifications are expected.
    sns-region: ap-southeast-1
    # Port on which to receive API destination events
    port: ${env:API_DESTINATION_PORT}
    # Acceptor thread pool size (optional, defaults to 2)
    acceptor-thread-pool-size: 2
    # Worker thread pool size (optional, defaults to 4)
    worker-thread-pool-size: 4
    # TLS configuration (TLS is required by AWS API Destination to be a valid endpoint)
    tls-config:
      cert-file: ${env:CERT_FILE}
      key-file: ${env:KEY_FILE}
      trust-chain-file: ${env:TRUST_CHAIN_FILE}
    # Authentication type (at the moment, only basic authentication for EventBridge is supported)
    authentication:
      # The basic authentication credentials here should match the ones set in EventBridge
      basic-authentication:
        username: ${env:BASIC_AUTH_USERNAME}
        password: ${env:BASIC_AUTH_PASSWORD}
    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    aws-url: "http://localhost:4566"
    # FOR ENGINEERING USE ONLY
    # This option specifies whether the signatures of SNS messages will be verified first.
    # Verifying SNS message signatures ensures that the message was sent from Amazon SNS.
    # Set this option to false in order to handle SNS messages that are not from Amazon SNS.
    # (optional, defaults to true)
    verify-sns-signature: true
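The sample configuration uses `${env:NAME}` placeholders for the port, TLS files, and basic authentication credentials. Before starting the Collection Agent, it can help to check that each referenced variable is set. The sketch below assumes that these placeholders are resolved from the environment of the Collection Agent process, as the syntax suggests; `missing_env_vars` is a hypothetical helper, not a Geneos tool.

```python
import os
import re

# Matches placeholders such as ${env:CERT_FILE} in the YAML text.
ENV_PLACEHOLDER = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, environ=None):
    """Return the sorted names of referenced variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    names = sorted(set(ENV_PLACEHOLDER.findall(config_text)))
    return [name for name in names if not environ.get(name)]
```

For example, calling `missing_env_vars(open("collection-agent.yml").read())` returns an empty list when every placeholder in the file has a value.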
AWS API destination collector
Aside from the AWS Plugin services, you can also use this collector to monitor the following AWS services:
You can use the sample mapping included in the Gateway package, templates/aws_mapping.xml, which contains sample FKM streams to handle the AWS Events, Logs, and Alarms services.
Note
Ensure that you enable the AWS API destination configuration in the Collection Agent YAML file before you set up these services in the AWS Management Console site.
EventBridge event and CloudWatch log streaming
Amazon EventBridge, a serverless event bus service, streams real-time events and logs from various AWS resources and applies its rules to route events to its targets, one of which is the ApiDestinationCollector collector. The ApiDestinationCollector collector exposes an HTTPS endpoint to connect to EventBridge, and the received data can then be processed as FKM streams.
Logs and events are not formatted and are displayed as is. A source dimension is available with the following behaviour:
- For logs, the value is always lambda, since logs are passed through a lambda function.
- For events, the value is the source of the event. For example, for EC2 Instance State-change events, the source is aws.ec2.
HTTPS and authentication
For the ApiDestinationCollector to be considered a valid endpoint by AWS, it must use the HTTPS protocol. This can be achieved through the TLS configuration of the ApiDestinationCollector. Self-signed TLS certificates will not work: events sent by AWS can only be read by the collector if certificates from a trusted CA (certificate authority) are used.
For EventBridge events, the ApiDestinationCollector endpoint requires basic authentication. An event can only reach the API destination endpoint if it has the proper basic authentication credentials.
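HTTP basic authentication sends the credentials as a Base64-encoded `username:password` pair in the `Authorization` header. The sketch below shows how such a header is built and compared. It illustrates the standard scheme only, not the collector's internal code, and the function names are hypothetical.

```python
import base64
import hmac


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def is_authorized(header_value, username, password):
    """Compare a received header with the expected one in constant time."""
    expected = basic_auth_header(username, password)
    received = header_value or ""
    return hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8"))
```

The username and password configured in the EventBridge connection must produce the same header value as the ones configured in the collector; otherwise the event is rejected.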
Set up EventBridge event bus events
To receive events from EventBridge, create an FKM stream in your Gateway, then set up the following in your AWS Management Console. For more information about EventBridge, see Amazon EventBridge.
- Navigate to Amazon EventBridge > Event buses > Create an event bus where you will stream the logs or events.
- In the Integration > API destinations > Connections, click Create connection.
- In the Authorization type, select Basic (Username/Password) and enter your desired username and password.
- Authorization for EventBridge is required, but at the moment only basic authentication is supported by the plugin.
- In the Integration > API destinations > API destinations, click Create API destination.
  - In the API destination endpoint, enter https://<url-where-aws-plugin-is-hosted>:<port-defined-in-api-destination-collector-config>.
  - In the HTTP method, choose POST.
- Navigate to Amazon EventBridge > Rules to create a rule.
- In the Define pattern, select the type of event that the rule will apply to.
  - For lambda functions, the source is lambda and detail-type is either Lambda Function Invocation Result - Success or Lambda Function Invocation Result - Failure.
- In the Target, select API destination. Select the API destination that you created.
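For the lambda case described in the steps above, the rule's event pattern can be written out as follows. The pattern uses the standard EventBridge event pattern layout (each field lists its accepted values). The `matches` helper is a deliberately simplified, hypothetical matcher for exact values on top-level fields and does not reproduce the full EventBridge matching rules.

```python
import json

# Event pattern for results forwarded by a Lambda function destination.
LAMBDA_RESULT_PATTERN = {
    "source": ["lambda"],
    "detail-type": [
        "Lambda Function Invocation Result - Success",
        "Lambda Function Invocation Result - Failure",
    ],
}


def pattern_json(pattern=LAMBDA_RESULT_PATTERN):
    """Serialise the pattern so it can be pasted into the rule's custom pattern editor."""
    return json.dumps(pattern, indent=2)


def matches(pattern, event):
    """Simplified check: every pattern field must list the event's exact value."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())
```

An EC2 state-change event, whose source is `aws.ec2`, does not match this pattern and needs its own rule.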
API destination Events FKM dataview
Set up CloudWatch Logs
CloudWatch logs can be sent to the ApiDestinationCollector collector through a subscription filter to a Lambda function, which then passes the logs to EventBridge. Similar to events, data is then sent from the EventBridge event bus to the collector.
Prerequisites
- An EventBridge event bus and rule to the API destination. To create an event bus, follow the steps in Set up EventBridge event bus events.
- A lambda function.
  - When creating a lambda function, make sure that the execution role used has the EventBridge PutEvents permission.
  - Sample code from AWS is available here. Ensure that the lambda function's destination points to the event bus created in Set up EventBridge event bus events.
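As an illustration of what such a lambda function does, the sketch below decodes the payload that a CloudWatch Logs subscription filter delivers (Base64-encoded, gzip-compressed JSON under `awslogs.data`) and returns the log messages. It assumes that the function's destination then forwards the invocation result to the event bus. This is not the AWS sample code referred to above, and the shape of the returned dictionary is an assumption made for this example.

```python
import base64
import gzip
import json


def decode_subscription_payload(event):
    """Decode the Base64 + gzip payload sent by a CloudWatch Logs subscription filter."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Lambda entry point: return the log messages so the destination can forward them."""
    payload = decode_subscription_payload(event)
    return {
        "logGroup": payload["logGroup"],
        "logStream": payload["logStream"],
        "messages": [entry["message"] for entry in payload["logEvents"]],
    }
```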
To receive CloudWatch logs, create an FKM stream in your Gateway then set up the following in AWS Management Console. If you need more information about AWS CloudWatch, see CloudWatch.
- Navigate to CloudWatch > Log groups to select the AWS resource log group that you want to stream.
- In the Subscription filters, click Create > Create Lambda subscription filter.
- In the Choose destination, select the lambda function you have set up as a prerequisite.
- In the Configure log format and filters, choose a filter pattern to match the logs that you want to stream.
- Click Start streaming.
- Navigate to Lambda > Functions.
- Select the lambda function you have set up as a prerequisite, then verify that the subscription filter is listed under Configuration > Triggers and that the correct EventBridge event bus is set as the destination under Configuration > Destinations.
API destination Logs FKM dataview
Alarm Notifications
The ApiDestinationCollector collector can handle alarm notifications from SNS messages that are sent from standard topics. These alarm notifications are then published to the Collection Agent as log events.
Note
Only alarm notifications from the sns-region specified in the ApiDestinationCollector collector configuration are handled.
The following describes some of the parameters of the log events related to Alarm notifications:
Set up for Alarm Notifications
To receive SNS messages related to alarm notifications, create an FKM stream in your Gateway then set up the following in AWS Management Console. If you need more information about Alarm notifications, see Amazon SNS and Amazon CloudWatch alarms.
- Navigate to Amazon SNS > Topic to create an HTTPS subscription under the SNS topic that receives the alarm notifications that will be collected.
- In the Create subscription > Details, set the Endpoint of this subscription to the URL corresponding to the server started by the collector.
- Keep Enable raw message delivery unchecked. The ApiDestinationCollector will not handle SNS messages where raw message delivery is enabled. For more information, see Subscribing to an Amazon SNS topic.
Once an HTTPS subscription is confirmed, the collector will be able to receive alarm notifications from this subscription.
- The collector should log Successfully confirmed subscription to the topic TOPIC_NAME, and in the AWS SNS Console, the status of the subscription should be Confirmed.
- If the collector was not running when the HTTPS subscription was created, the status of the subscription will remain as Pending confirmation. To confirm this subscription, ensure that the collector is running and then request confirmation for this subscription. To request this confirmation:
  - Go to either the topic page or the AWS SNS Console > Subscription page.
  - Select the subscription with pending confirmation, and then click the Request confirmation button.
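The confirmation flow above relies on the JSON envelope that SNS posts to HTTPS endpoints when raw message delivery is disabled. Its `Type` field distinguishes a `SubscriptionConfirmation` from a `Notification`, and its `TopicArn` carries the region. The sketch below classifies such a message. It only illustrates the envelope and is not the collector's implementation; in particular, applying the region check before the type check is an assumption made for this example.

```python
import json


def topic_region(topic_arn):
    """Extract the region from an ARN (arn:partition:service:region:account:resource)."""
    return topic_arn.split(":")[3]


def classify_sns_message(body, sns_region):
    """Return 'confirm', 'alarm', or 'ignored' for a JSON-wrapped SNS message."""
    message = json.loads(body)
    if topic_region(message["TopicArn"]) != sns_region:
        return "ignored"  # topic is outside the configured sns-region
    if message["Type"] == "SubscriptionConfirmation":
        return "confirm"  # a receiver would visit message["SubscribeURL"]
    if message["Type"] == "Notification":
        return "alarm"    # message["Message"] holds the alarm payload
    return "ignored"
```

With raw message delivery enabled, the request body is the bare message rather than this envelope, which is consistent with the requirement to keep that option unchecked.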
API destination Alarms FKM dataview
Parameter | Description |
---|---|
name | Name of the alarm that triggered the notification. This corresponds to the stream name of the FKM plugin in the Netprobe. |
message | Shows the details regarding the alarm that triggered the notification. This corresponds to the triggerDetails column of the FKM plugin in the Netprobe. Format: For composite alarms, there are no Sample message for Metric alarm: 2021-12-03T06:38:23.047Z Namespace=AWS/EC2 Metric=CPUUtilization State=ALARM Reason="Threshold Crossed: 6 out of the last 30 datapoints were less than or equal to the threshold (0.7). The most recent datapoints which crossed the threshold: [0.0650449497620362 (03/12/21 06:31:00), 0.0327868852458998 (03/12/21 06:26:00), 0.0666666666666628 (03/12/21 06:21:00), 0.09947207557655299 (03/12/21 06:16:00), 0.0333333333333314 (03/12/21 06:11:00)] (minimum 30 datapoints for OK -> ALARM transition)." Dimensions=[{"value":"i-07d7e8d7dd0ee675f", "name":"InstanceId"}] Sample message for Composite alarm: 2021-12-03T08:18:20.602Z State=OK Reason="arn:aws:cloudwatch:ap-southeast-2:164181677543:alarm:alarm_when_greaterthan_0.07 transitioned to INSUFFICIENT_DATA at Friday 03 December, 2021 08:18:20 UTC" |
entity dimensions | Entity dimensions parameters: |
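If you post-process these messages outside Geneos, the two sample messages in the table suggest a simple layout: a timestamp, optional Namespace and Metric fields, a State, a quoted Reason, and an optional Dimensions list. The sketch below parses that layout. It is inferred from the samples only, so treat the pattern as an assumption rather than a format specification.

```python
import json
import re

# Layout inferred from the sample metric and composite alarm messages.
ALARM_MESSAGE = re.compile(
    r"^(?P<timestamp>\S+)"
    r"(?: Namespace=(?P<namespace>\S+))?"
    r"(?: Metric=(?P<metric>\S+))?"
    r" State=(?P<state>\S+)"
    r' Reason="(?P<reason>.*?)"'
    r"(?: Dimensions=(?P<dimensions>\[.*\]))?$",
    re.DOTALL,
)


def parse_alarm_message(message):
    """Split an alarm message into its fields, or return None if it does not match."""
    match = ALARM_MESSAGE.match(message.strip())
    if match is None:
        return None
    fields = match.groupdict()
    if fields["dimensions"] is not None:
        # The sample Dimensions value is a JSON list of name/value objects.
        fields["dimensions"] = json.loads(fields["dimensions"])
    return fields
```

For a composite alarm message, the `namespace`, `metric`, and `dimensions` fields come back as `None`, because the sample composite message does not contain them.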