AWS

Overview

The AWS plugin is a Collection Agent plugin that gathers metrics through AWS CloudWatch. It also provides an API destination that can interact with AWS services such as EventBridge and SNS. The plugin improves Geneos cloud monitoring by providing an easier-to-use, more scalable interface to AWS CloudWatch for monitoring the services deployed in AWS.

In addition, the AWS plugin allows you to:

Monitored services, logs, and events

You can use the AWS plugin to monitor different services by using the following collectors in the Collection Agent YAML file:

Deployment recommendations

Launch ITRS Geneos EC2 instance in AWS Marketplace

Use the Plugin in AWS Marketplace deployment option for the following reasons:

Configure Geneos to deploy the AWS plugin

Use the Configure Geneos to deploy AWS CloudWatch deployment option in the following cases:

Prerequisites

Geneos environment

The AWS Collection Agent plugin requires the following versions of Geneos components:

The AWS binaries are packaged with Netprobe and are stored in the collection_agent folder. Alternatively, you can download separate binaries for the AWS plugin from ITRS Downloads.

Caution

Collection Agent and its plugins are no longer packaged with the Netprobe in Geneos 5.14.7 and subsequent 5.x versions. To run Collection Agent via Netprobe, upgrade to the current 6.x version of Geneos.

AWS environment

The AWS plugin requires valid AWS credentials, such as an Access Key ID and a Secret Access Key. For how to specify your AWS credentials on your machine, see the Setting the default credentials page.

For the permissions required by some of the monitored services, see Required AWS plugin permissions in Plugin services.

CloudWatch API usage

Because the AWS plugin interacts with AWS CloudWatch using Amazon-provided APIs, you should be aware of the CloudWatch service quotas.

If a quota is exceeded, you might encounter an error similar to this:

2021-12-02 14:05:11.833 [EC2Service-Processor-0] ERROR com.itrsgroup.collection.plugins.aws.AwsCollector(awsSG) - CloudwatchMetricDataSource Get Metrics Error: Rate exceeded (Service: CloudWatch, Status Code: 400, Request ID: 31d9a57e-c6c5-46fd-94da-7fe62e74010c, Extended Request ID: null)

CloudWatch query time windows

The values obtained from AWS are averaged over the last complete 5-minute window. This is because CloudWatch makes the data available with a 5-minute latency.

For example, if the collection time is 2021-07-04T03:31:12.34Z, then the time window used to query the data in CloudWatch is:

When the collection interval is less than 5 minutes, the AWS plugin queries CloudWatch using the adjusted time window on the first collection. For subsequent collections where the adjusted time window is the same as the previous one, no query is made, so the plugin generates no data.

For example, when collectionInterval is set to 1 minute, the AWS plugin queries CloudWatch on the first collection, and then returns no data until it reaches the next complete 5-minute window (for example, after 5 minutes).
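
As an illustration of this behaviour, the fragment below is a minimal sketch of an AwsCollector entry with a 1-minute interval. It uses only options that appear in the sample collection-agent.yml in this guide; the collector name and the region are placeholders, and the comment about how many collections return data is an estimate derived from the description above, not a documented figure.

```yaml
collectors:
  - name: aws                # placeholder name
    type: plugin
    className: AwsCollector
    # 1 minute, in milliseconds. CloudWatch is queried only when the
    # adjusted 5-minute window differs from the previous one, so roughly
    # four out of five collections are expected to return no data.
    collectionInterval: 60000
    regions:
      - ap-southeast-1       # placeholder region
```

Leaving collectionInterval at its five-minute default (300000) should mean that each collection falls into a new window, which avoids the empty samples.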

Configure Geneos to deploy the AWS plugin

The AWS plugin supports Collection Agent publication into Geneos using dynamic Managed Entities. Setting up this plugin in Geneos involves these primary steps:

  1. Set up your Collection Agent plugin.
  2. Configure your mappings.
  3. Configure your other Dynamic Entities in the Gateway. For a more detailed procedure, see Create Dynamic Entities in Collection Agent setup.

Set up your Collection Agent plugin

Use one of the following options to configure the plugin.

Below are the available collectors for the AWS plugin:

| Collector | Description |
| --- | --- |
| AwsCollector | Enables AWS collector configuration. To add more AWS services to monitor, add them in enabledServices using this format: <service.provider>/<service-type>. Without this setting, the AWS plugin picks up metrics for all available AWS services. See Monitored AWS Services. |
| AwsBillingCollector | Enables AWS Billing collector configuration. See AWS/Billing. |
| ApiDestinationCollector | Enables AWS API destination configuration to monitor AWS logs and events. See AWS API destination collector. |
| AwsSdkUsageMetricsCollector | Publishes the aggregated SDK Usage metrics from the last 5-minute window. See AWS SDK Usage metrics collector. |
| AwsCustomNamespaceCollector | Collects and publishes custom metrics published into custom namespaces created in AWS CloudWatch. See AWS Custom namespace collector. |
| AwsMetricStreamCollector | Exposes an HTTPS endpoint to connect to Data Firehose. The received data is processed to provide datapoints similar to those of AwsCollector. See CloudWatch Metric Streaming. |

Sample collection-agent.yml configuration:

collectors:
  # AWS collector configuration
  - name: aws
    type: plugin
    className: AwsCollector

    # Interval (in millis) between collections (optional, defaults to five minutes).
    collectionInterval: 300000
    
    # AWS regions from which metrics will be collected
    regions:
      - ap-southeast-1
      - ap-southeast-2
    
    # List of services to collect metrics from (optional, case-sensitive, and defaults to all metrics)
    enabledServices:
      - AWS/EC2
      - AWS/EBS
      - AWS/EKS
      - AWS/RDS
      - AWS/ECS

    # Publish SDK usage metrics
    sdkMetrics: true

    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    awsUrl: "http://localhost:4566"

  # AWS custom metric collector configuration
  - name: awscustom
    type: plugin
    className: AwsCustomNamespaceCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collectionInterval: 300000
    
    # AWS regions from which metrics will be collected
    regions:
      - ap-southeast-1
      - ap-southeast-2

    # Publish SDK usage metrics
    sdkMetrics: true

    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    awsUrl: "http://localhost:4566"

    # List of custom namespaces to collect metrics from
    customMonitoredNamespaces:
      - namespace: ECS/ContainerInsights
        # List of metrics to collect from the custom namespace (optional, defaults to collect all available metrics if not specified)
        customMonitoredMetrics:
          # Regex pattern to match the metrics to collect from the namespace
          - nameIncludes: Network*
            # Length of time used to aggregate the metric (optional, defaults to 300 seconds)
            period: 300
            # Aggregation to be used (optional, defaults to "average")
            statistic: average
            
  # AWS Billing collector configuration
  - name: aws-billing
    type: plugin
    className: AwsBillingCollector
    # Interval (in millis) between collections (optional, defaults to 6 hours).
    collectionInterval: 21600000
    
    # Publish SDK usage metrics
    sdkMetrics: true
    
  # AWS SDK Metrics collector configuration
  # Publishes the aggregated SDK Usage Metrics from the last 5-minute window
  - name: aws-sdk-metrics
    type: plugin
    className: AwsSdkUsageMetricsCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collectionInterval: 300000

  # AWS API destination configuration
  - name: apidestination
    type: plugin
    className: ApiDestinationCollector

    # AWS region from which alarm notifications are expected.
    snsRegion: ap-southeast-1

    # Port on which to receive API destination events
    port: ${env:API_DESTINATION_PORT}

    # Acceptor thread pool size (optional, defaults to 2)
    acceptorThreadPoolSize: 2

    # Worker thread pool size (optional, defaults to 4)
    workerThreadPoolSize: 4

    # TLS configuration (TLS is required by AWS API Destination to be a valid endpoint)
    tlsConfig:
      certFile: ${env:CERT_FILE}
      keyFile: ${env:KEY_FILE}
      trustChainFile: ${env:TRUST_CHAIN_FILE}

    # Authentication type (at the moment, only basic authentication for EventBridge is supported)
    authentication:
      # The basic authentication credentials here should match the ones set in EventBridge
      basicAuthentication:
        username: ${env:BASIC_AUTH_USERNAME}
        password: ${env:BASIC_AUTH_PASSWORD}

    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    awsUrl: "http://localhost:4566"

    # FOR ENGINEERING USE ONLY
    # This option specifies whether the signatures of SNS messages will be verified first.
    # Verifying SNS message signatures ensures that the message was sent from Amazon SNS.
    # Set this option to false in order to handle SNS messages that are not from Amazon SNS.
    # (optional, defaults to true)
    verifySnsSignature: true

  # AWS Metric Stream configuration
  - name: metricstream
    type: plugin
    className: AwsMetricStreamCollector

    # TLS configuration (TLS is required by AWS Metric Stream to be a valid endpoint)
    tlsConfig:
      certFile: ${env:CERT_FILE}
      keyFile: ${env:KEY_FILE}

    # Metric stream format (optional, defaults to otel-10)
    # Options are case-insensitive: json, otel-07, and otel-10
    metricFormat: json

    # Statistics configuration for each metric (optional, default uses internal table statistics from aws-collector)
    # This configuration can overwrite or add to the default internal table of statistics
    # Options are case-insensitive: average, sum, minimum, and maximum
    # statistics:
    #   AWS/EC2:
    #     CPUUtilization: Sum

Configure your mappings

To show metrics and events in Geneos, dynamic mappings must be configured and attached to the Netprobe that receives the data from the Collection Agent. Use one of the following options to configure your dynamic mappings.

Access AWS cloud through a proxy host

To access the AWS cloud through a proxy, add the http.* properties to the JVM arguments. For example, you can access the cloud via a proxy host and port by adding the following properties:

-Dhttp.proxyHost=webcache.example.com -Dhttp.proxyPort=8080

For more information on adding JVM arguments, see Managed Collection Agent.

To learn more about the available properties to enable proxy access, see Java Networking and Proxies.

Dealing with large volumes of data

When the AWS plugin collects and publishes a large volume of data, the Collection Agent may perform garbage collection more frequently. This can lead to performance issues.

To address this, you can increase the heap size of the Collection Agent by adding the following JVM arguments, for example:

-Xms1024m -Xmx1024m

This sets both the initial and the maximum heap size to 1024 MB. The default value for both is 512 MB.

For more information on adding JVM arguments, see Managed Collection Agent.

AWS API destination collector

Aside from the AWS Plugin services, you can also use this collector to monitor the following AWS services:

You can use the sample mapping included in the Gateway package, templates/aws_mapping.xml, which contains sample FKM streams to handle AWS Events, Logs, and Alarms.

Note

Ensure that you enable the AWS API destination configuration in the Collection Agent YAML file before you set up these services in the AWS Management Console site.

EventBridge event and CloudWatch log streaming

Amazon EventBridge, a serverless event bus service, streams real-time events and logs from various AWS resources and applies its rules to route events to its targets, one of which is the ApiDestinationCollector. The ApiDestinationCollector exposes an HTTPS endpoint to connect to EventBridge, and the received data can then be processed as FKM streams.

Logs and events are not formatted and are displayed as is. A source dimension is available with the following behaviour:

HTTPS and authentication

For the ApiDestinationCollector to be considered a valid endpoint by AWS, it needs to use the HTTPS protocol. This can be achieved by using the TLS configuration of the ApiDestinationCollector. Using self-signed TLS certificates will not work. Events sent by AWS can only be read by the collector if certificates from a trusted CA (certificate authority) are used.

For EventBridge events, the ApiDestinationCollector endpoint requires basic authentication. An event can only reach the API destination endpoint if it has the proper basic authentication credentials.

Set up EventBridge event bus events

To receive events from EventBridge, create an FKM stream in your Gateway, then set up the following in your AWS Management Console. If you need more information about EventBridge, see Amazon EventBridge.

  1. Navigate to Amazon EventBridge > Event buses > Create an event bus where you will stream the logs or events.
  2. Under Integration > API destinations > Connections, click Create connection.
    • In Authorization type, select Basic (Username/Password) and enter your desired username and password.
    • Authorization for EventBridge is required, but at the moment only basic authentication is supported by the plugin.
  3. Under Integration > API destinations > API destinations, click Create API destination.
    • In API destination endpoint, enter https://<url-where-aws-plugin-is-hosted>:<port-defined-in-api-destination-collector-config>.
    • In HTTP method, choose POST.
  4. Navigate to Amazon EventBridge > Rules to create a rule.
    • In Define pattern, select the type of event that the rule will apply to.
    • For Lambda functions, the source is lambda and detail-type is either Lambda Function Invocation Result - Success or Lambda Function Invocation Result - Failure.
    • In Target, select API destination, then select the API destination that you created.
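
The values entered in the console steps above have to line up with the ApiDestinationCollector configuration. The fragment below is a sketch of that correspondence. It uses only options and environment variable names from the sample collection-agent.yml in this guide; the comments that pair each option with a console step are an interpretation of the steps above, not additional documented behaviour.

```yaml
collectors:
  - name: apidestination
    type: plugin
    className: ApiDestinationCollector
    # AWS region from which alarm notifications are expected (placeholder).
    snsRegion: ap-southeast-1
    # Same port as the one used in the API destination endpoint URL (step 3).
    port: ${env:API_DESTINATION_PORT}
    # HTTPS is required by AWS; use certificates from a trusted CA.
    tlsConfig:
      certFile: ${env:CERT_FILE}
      keyFile: ${env:KEY_FILE}
      trustChainFile: ${env:TRUST_CHAIN_FILE}
    authentication:
      basicAuthentication:
        # Same username and password as the EventBridge connection (step 2).
        username: ${env:BASIC_AUTH_USERNAME}
        password: ${env:BASIC_AUTH_PASSWORD}
```

Reading the credentials from environment variables, as the sample does, keeps them out of the YAML file itself.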

API destination Events FKM dataview

Set up CloudWatch Logs

CloudWatch logs can be sent to the ApiDestinationCollector collector through a subscription filter to a Lambda function, which then passes the logs to EventBridge. Similar to events, data is then sent from the EventBridge event bus to the collector.

Prerequisites

To receive CloudWatch logs, create an FKM stream in your Gateway, then set up the following in the AWS Management Console. If you need more information about AWS CloudWatch, see CloudWatch.

  1. Navigate to CloudWatch > Log groups and select the AWS resource log group that you want to stream.
  2. Under Subscription filters, click Create > Create Lambda subscription filter.
    • In Choose destination, select the Lambda function that you set up as a prerequisite.
    • In Configure log format and filters, choose a filter pattern to match the logs that you want to stream.
    • Click Start streaming.
  3. Navigate to Lambda > Functions.
  4. Select the Lambda function that you set up as a prerequisite, then verify that the subscription filter is listed under Configuration > Triggers and that the correct EventBridge event bus is set as the destination under Configuration > Destinations.

API destination Logs FKM dataview

Alarm Notifications

The ApiDestinationCollector collector can handle alarm notifications from SNS messages that are sent from standard topics. These alarm notifications are then published to the Collection Agent as log events.

Note

Only alarm notifications from the snsRegion specified in the ApiDestinationCollector configuration are handled.

The following describes some of the parameters of the log events related to Alarm notifications:

Parameter: name

Name of the alarm that triggered the notification. This corresponds to the stream name of the FKM plugin in the Netprobe.

Parameter: message

Shows the details of the alarm that triggered the notification. This corresponds to the triggerDetails column of the FKM plugin in the Netprobe.

Format: TIMESTAMP [Namespace=NAMESPACE] [Metric=METRIC_NAME] State=STATE Reason=REASON [Dimensions=DIMENSIONS]

  • TIMESTAMP - timestamp of the state update. This timestamp is in UTC.
  • NAMESPACE - namespace of the metric associated with the alarm.
  • METRIC_NAME - name of the metric associated with the alarm.
  • STATE - state value for the alarm. Possible values: ALARM, OK, or INSUFFICIENT_DATA.
  • REASON - explanation for the alarm state. For metric alarms, the reason contains the value and timestamp of the datapoints that caused the alarm state update. For composite alarms, the reason may contain the metric alarm that caused the state update of the composite alarm.
  • DIMENSIONS - dimensions for the metric associated with the alarm.

For composite alarms, there are no Namespace, Metric, and Dimensions fields.

Sample message for a metric alarm:

2021-12-03T06:38:23.047Z Namespace=AWS/EC2 Metric=CPUUtilization State=ALARM Reason="Threshold Crossed: 6 out of the last 30 datapoints were less than or equal to the threshold (0.7). The most recent datapoints which crossed the threshold: [0.0650449497620362 (03/12/21 06:31:00), 0.0327868852458998 (03/12/21 06:26:00), 0.0666666666666628 (03/12/21 06:21:00), 0.09947207557655299 (03/12/21 06:16:00), 0.0333333333333314 (03/12/21 06:11:00)] (minimum 30 datapoints for OK -> ALARM transition)." Dimensions=[{"value":"i-07d7e8d7dd0ee675f", "name":"InstanceId"}]

Sample message for a composite alarm:

2021-12-03T08:18:20.602Z State=OK Reason="arn:aws:cloudwatch:ap-southeast-2:164181677543:alarm:alarm_when_greaterthan_0.07 transitioned to INSUFFICIENT_DATA at Friday 03 December, 2021 08:18:20 UTC"

Parameter: entity dimensions

Entity dimensions parameters:

  • namespace - set to AWS/Alarms.
  • region - set to the value of the snsRegion option.
  • source - name of the SNS topic where the alarm notification came from.
  • alarm_name - name of the alarm that triggered the notification.

Set up for Alarm Notifications

To receive SNS messages related to alarm notifications, create an FKM stream in your Gateway, then set up the following in the AWS Management Console. If you need more information about alarm notifications, see Amazon SNS and Amazon CloudWatch alarms.

  1. Navigate to Amazon SNS > Topic and create an HTTPS subscription under the SNS topic that receives the alarm notifications to be collected.
  2. Under Create subscription > Details, set the Endpoint of this subscription to the URL of the server started by the collector.
  3. Keep Enable raw message delivery unchecked. The ApiDestinationCollector does not handle SNS messages where raw message delivery is enabled. For more information, see Subscribing to an Amazon SNS topic.

Once an HTTPS subscription is confirmed, the collector will be able to receive alarm notifications from this subscription.

API destination Alarms FKM dataview

AWS SDK Usage metrics collector

Because the AWS plugin interacts with AWS CloudWatch using Amazon-provided APIs, take note of the CloudWatch service quotas.

To monitor the overall SDK Usage of the AWS plugin, follow these steps:

  1. Enable the sdkMetrics configuration under the AwsCollector and the AwsBillingCollector. See the collection-agent.yml in Configure Geneos to deploy the AWS plugin.
  2. Configure the AwsSdkUsageMetricsCollector to publish the SDK Usage metrics under the namespace AWS/SdkUsage.
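
The two steps above can be sketched as a single, pared-down collection-agent.yml fragment. It reuses only options from the sample configuration in this guide; the region is a placeholder, and options not related to SDK usage are left at their defaults.

```yaml
collectors:
  - name: aws
    type: plugin
    className: AwsCollector
    regions:
      - ap-southeast-1       # placeholder region
    # Step 1: publish SDK usage metrics from this collector
    sdkMetrics: true

  - name: aws-billing
    type: plugin
    className: AwsBillingCollector
    # Step 1: publish SDK usage metrics from this collector
    sdkMetrics: true

  # Step 2: publishes the aggregated SDK Usage metrics
  # under the AWS/SdkUsage namespace
  - name: aws-sdk-metrics
    type: plugin
    className: AwsSdkUsageMetricsCollector
    # Five minutes, in milliseconds (the default)
    collectionInterval: 300000
```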

The AwsSdkUsageMetricsCollector reports the SDK Usage from the last 5-minute window. Below is an example dataview for AWS SDK Usage metrics:

For more information, see AWS/SDK Usage in Plugin services.

AWS Custom namespace collector

The AWS plugin can collect custom metrics for custom namespaces created in AWS CloudWatch.

To monitor a custom namespace with the AWS plugin, follow these steps:

  1. Add an AwsCustomNamespaceCollector class name in your collection-agent.yml file. For more information, see Configure Geneos to deploy the AWS plugin.
  2. Enable the sdkMetrics configuration under the AwsCustomNamespaceCollector.
  3. Add and configure your custom namespace and the metrics to be collected under the AwsCustomNamespaceCollector class.
  4. Set up and configure your custom mappings and other Dynamic Entities. Follow the steps in Create Dynamic Entities in Collection Agent setup.

Below is an example dataview for the following custom namespace configuration.

Sample configuration:


# AWS custom metric collector configuration
- name: awscustom
  type: plugin
  className: AwsCustomNamespaceCollector
  # Interval (in millis) between collections (optional, defaults to five minutes).
  collectionInterval: 300000
  # AWS regions from which metrics will be collected.
  regions:
    - eu-west-1

  # Publish SDK usage metrics
  sdkMetrics: true

  # List of custom namespaces to collect metrics from
  customMonitoredNamespaces:
    - namespace: ECS/ContainerInsights
      # List of metrics to collect from the custom namespace (optional, defaults to collect all available metrics if not specified)
      customMonitoredMetrics:
        # Regex pattern to match the metrics to collect from the namespace
        - nameIncludes: Network*
        - nameIncludes: Storage*
          # Length of time used to aggregate the metric (optional, defaults to 300 seconds)
          period: 300
          # Aggregation to be used (optional, defaults to "average")
          statistic: average

Sample dataview:

CloudWatch Metric Streaming

Amazon CloudWatch Metric Streams, together with Amazon Data Firehose, can send real-time metrics from various AWS resources to a configured HTTPS destination endpoint, in this case the AwsMetricStreamCollector. The collector exposes an HTTPS endpoint to connect to Data Firehose, and the received data is processed to provide datapoints similar to those of AwsCollector.

Note

Failed communication between AWS Data Firehose and AwsMetricStreamCollector results in logs being stored in S3 buckets, leading to increased storage usage.

HTTPS and Authentication

For the AwsMetricStreamCollector to be considered a valid endpoint by AWS, it needs to use the HTTPS protocol. This can be achieved by using the TLS configuration of the AwsMetricStreamCollector.

Note

Using self-signed or free TLS certificates, like Let’s Encrypt, will not work. Metrics sent by AWS can only be read by the collector if certificates from a trusted CA (certificate authority) are used.

Setup for CloudWatch Data Firehose

  1. Select the source and destination:
    • Source - Direct PUT
    • Destination - HTTP Endpoint
  2. Transform records - Not supported under the current revision
  3. Destination settings:
    • HTTP endpoint URL - Endpoint of the collector
    • Authentication/Access key - Not supported under the current revision
    • Content encoding - GZIP is not supported under the current revision
  4. Buffer hints - Lower buffer size and interval values cause Firehose to send metrics more frequently.
  5. Advanced settings:
    • Server-side encryption - Not supported under the current revision

Setup for CloudWatch Metric Streams

CloudWatch metrics can be sent to the AwsMetricStreamCollector collector through an AWS Data Firehose stream.

  1. Destination:
    • Custom setup with Firehose
    • Select your Amazon Data Firehose stream - Select the Data Firehose stream that you created
  2. Change output format - Ensure that this setting matches the metricFormat option in the collector configuration.
  3. Metrics to be streamed:
    • Select either All metrics or Select metrics. For Select metrics, select the individual metrics to be included or excluded.
    • Add additional statistics - Only default statistics are currently supported (Minimum, Maximum, Sample Count, and Sum).
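
To show how the output format chosen for the metric stream relates to the collector, here is a minimal sketch of an AwsMetricStreamCollector entry. It uses only options from the sample collection-agent.yml in this guide. The pairing between the console's output format names and the metricFormat values is an assumption based on the option names, so confirm it against your own stream before relying on it.

```yaml
collectors:
  - name: metricstream
    type: plugin
    className: AwsMetricStreamCollector
    # HTTPS is required by AWS; use certificates from a trusted CA.
    tlsConfig:
      certFile: ${env:CERT_FILE}
      keyFile: ${env:KEY_FILE}
    # Must agree with the output format selected for the metric stream.
    # Assumed pairing: JSON -> json, OpenTelemetry 0.7 -> otel-07,
    # OpenTelemetry 1.0 -> otel-10 (otel-10 is the default when omitted).
    metricFormat: json
```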