AWS Plugin services

Overview

The AWS plugin is a Collection Agent plugin that gathers metrics through AWS CloudWatch. This plugin also provides an API Destination that can interact with AWS services, such as EventBridge and SNS.

Monitored AWS services

The AWS plugin gets CloudWatch metrics from the following services. For the permissions that each service requires, see Required AWS plugin permissions.

AWS/ApplicationELB

The AWS/ApplicationELB service collects metrics from Application Load Balancers.

Metric name Metric type Unit name Dimension Statistic Description
active_connection_count gauge   load_balancer, namespace, region sum Total number of concurrent TCP connections active from clients to the load balancer and from the load balancer to targets.
client_tls_negotiation_error_count gauge   load_balancer, namespace, region sum Number of TLS connections initiated by the client that did not establish a session with the load balancer due to a TLS error. Possible causes include a mismatch of ciphers or protocols or the client failing to verify the server certificate and closing the connection.
consumed_lcus gauge   load_balancer, namespace, region average Number of load balancer capacity units (LCU) used by your load balancer. You pay for the number of LCUs that you use per hour.
desyncmitigationmode_noncompliant_request_count gauge   load_balancer, namespace, region sum Number of requests that do not comply with RFC 7230.
dropped_invalid_header_request_count gauge   load_balancer, namespace, region average Number of requests where the load balancer removed HTTP headers with header fields that are not valid before routing the request. The load balancer removes these headers only if the routing.http.drop_invalid_header_fields.enabled attribute is set to true.
forwarded_invalid_header_request_count gauge   load_balancer, namespace, region average Number of requests routed by the load balancer that had HTTP headers with header fields that are not valid. The load balancer forwards requests with these headers only if the routing.http.drop_invalid_header_fields.enabled attribute is set to false.
grpc_request_count gauge   load_balancer, namespace, region average Number of gRPC requests processed over IPv4 and IPv6.
http_fixed_response_count gauge   load_balancer, namespace, region sum Number of fixed-response actions that were successful.
http_redirect_count gauge   load_balancer, namespace, region sum Number of redirect actions that were successful.
http_redirect_url_limit_exceeded_count gauge   load_balancer, namespace, region sum Number of redirect actions that cannot be completed.
http_code_elb_3xx_count gauge   load_balancer, namespace, region sum Number of HTTP 3XX redirection codes that originate from the load balancer. This count does not include response codes generated by targets.
http_code_elb_4xx_count gauge   load_balancer, namespace, region sum Number of HTTP 4XX client error codes that originate from the load balancer. This count does not include response codes generated by targets.
http_code_elb_5xx_count gauge   load_balancer, namespace, region sum Number of HTTP 5XX server error codes that originate from the load balancer. This count does not include response codes generated by targets.
http_code_elb_500_count gauge   load_balancer, namespace, region sum Number of HTTP 500 error codes that originate from the load balancer.
http_code_elb_502_count gauge   load_balancer, namespace, region sum Number of HTTP 502 error codes that originate from the load balancer.
http_code_elb_503_count gauge   load_balancer, namespace, region sum Number of HTTP 503 error codes that originate from the load balancer.
http_code_elb_504_count gauge   load_balancer, namespace, region sum Number of HTTP 504 error codes that originate from the load balancer.
ipv6_processed_bytes gauge bytes load_balancer, namespace, region sum Total number of bytes processed by the load balancer over IPv6. This count is included in ProcessedBytes.
ipv6_request_count gauge   load_balancer, namespace, region sum Number of IPv6 requests received by the load balancer.
new_connection_count gauge   load_balancer, namespace, region sum Total number of new TCP connections established from clients to the load balancer and from the load balancer to targets.
non_sticky_request_count gauge   load_balancer, namespace, region sum Number of requests where the load balancer chose a new target because it couldn't use an existing sticky session. For example, the request was the first request from a new client and no stickiness cookie was presented, a stickiness cookie was presented but it did not specify a target that was registered with this target group, the stickiness cookie was malformed or expired, or an internal error prevented the load balancer from reading the stickiness cookie.
processed_bytes gauge bytes load_balancer, namespace, region sum Total number of bytes processed by the load balancer over IPv4 and IPv6. This count includes traffic to and from clients and Lambda functions, and traffic from an Identity Provider (IdP) if user authentication is enabled.
rejected_connection_count gauge   load_balancer, namespace, region sum Number of connections that were rejected because the load balancer had reached its maximum number of connections.
request_count gauge   load_balancer, namespace, region sum Number of requests processed over IPv4 and IPv6. This metric is only incremented for requests where the load balancer node was able to choose a target. Requests rejected before a target is chosen (for example, HTTP 460, HTTP 400, some kinds of HTTP 503 and 500) are not reflected in this metric.
rule_evaluations gauge   load_balancer, namespace, region sum Number of rules processed by the load balancer given a request rate averaged over an hour.
healthy_host_count gauge   load_balancer, namespace, region sum Number of targets that are considered healthy.
http_code_target_2xx_count gauge   load_balancer, namespace, region sum Number of HTTP 2XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
http_code_target_3xx_count gauge   load_balancer, namespace, region sum Number of HTTP 3XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
http_code_target_4xx_count gauge   load_balancer, namespace, region sum Number of HTTP 4XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
http_code_target_5xx_count gauge   load_balancer, namespace, region sum Number of HTTP 5XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
request_count_per_target gauge   load_balancer, namespace, region sum Average number of requests received by each target in a target group. You must specify the target group using the TargetGroup dimension. This metric does not apply if the target is a Lambda function.
target_connection_error_count gauge   load_balancer, namespace, region sum Number of connections that were not successfully established between the load balancer and target. This metric does not apply if the target is a Lambda function.
target_response_time gauge   load_balancer, namespace, region average Time elapsed, in seconds, after the request leaves the load balancer until a response from the target is received. This is equivalent to the target_processing_time field in the access logs.
target_tls_negotiation_error_count gauge   load_balancer, namespace, region sum Number of TLS connections initiated by the load balancer that did not establish a session with the target. Possible causes include a mismatch of ciphers or protocols. This metric does not apply if the target is a Lambda function.
un_healthy_host_count gauge   load_balancer, namespace, region max Number of targets that are considered unhealthy.
lambda_internal_error gauge   load_balancer, namespace, region sum Number of requests to a Lambda function that failed because of an issue internal to the load balancer or AWS Lambda. To get the error reason codes, check the error_reason field of the access log.
lambda_target_processed_bytes gauge   load_balancer, namespace, region sum Total number of bytes processed by the load balancer for requests to and responses from a Lambda function.
lambda_user_error gauge   load_balancer, namespace, region sum Number of requests to a Lambda function that failed because of an issue with the Lambda function.
elb_auth_error gauge   load_balancer, namespace, region sum Number of user authentications that could not be completed because an authenticate action was misconfigured, the load balancer cannot establish a connection with the IdP, or the load balancer cannot complete the authentication flow due to an internal error.
elb_auth_failure gauge   load_balancer, namespace, region sum Number of user authentications that could not be completed because the IdP denied access to the user or an authorisation code was used more than once.
elb_auth_latency gauge   load_balancer, namespace, region average Time elapsed, in milliseconds, to query the IdP for the ID token and user info. If one or more of these operations fail, this is the time to failure.
elb_auth_refresh_token_success gauge   load_balancer, namespace, region sum Number of times the load balancer successfully refreshed user claims using a refresh token provided by the IdP.
elb_auth_success gauge   load_balancer, namespace, region sum Number of authenticate actions that were successful.
elb_auth_user_claims_size_exceeded gauge   load_balancer, namespace, region sum Number of times that a configured IdP returned user claims that exceeded 11K bytes in size.
state attribute   load_balancer, namespace, region   State of the load balancer.
vpc_id attribute   load_balancer, namespace, region   ID of the VPC for the load balancer.
availability_zones attribute   load_balancer, namespace, region   Subnets for the load balancer.
created_time attribute   load_balancer, namespace, region   Date and time the load balancer was created.
scheme attribute   load_balancer, namespace, region   Nodes of an Internet-facing load balancer that have public IP addresses.
ip_address_type attribute   load_balancer, namespace, region   Type of IP addresses used by the subnets for your load balancer.
dns_name attribute   load_balancer, namespace, region   Public DNS name of the load balancer.
           

AWS/AutoScaling

The AWS/AutoScaling service collects metrics from Auto Scaling groups.

Metric name Metric type Dimension Statistic Description
group_min_size gauge auto_scaling_group_name, namespace, region average Minimum size of the Auto Scaling group.
group_max_size gauge auto_scaling_group_name, namespace, region average Maximum size of the Auto Scaling group.
group_desired_capacity gauge auto_scaling_group_name, namespace, region average Number of instances that the Auto Scaling group attempts to maintain.
group_in_service_instances gauge auto_scaling_group_name, namespace, region average Number of instances that are running as part of the Auto Scaling group.
group_pending_instances gauge auto_scaling_group_name, namespace, region average Number of instances that are pending.
group_standby_instances gauge auto_scaling_group_name, namespace, region average Number of instances that are in a Standby state.
group_terminating_instances gauge auto_scaling_group_name, namespace, region average Number of instances that are in the process of terminating.
group_total_instances gauge auto_scaling_group_name, namespace, region average Total number of instances in the Auto Scaling group.
group_in_service_capacity gauge auto_scaling_group_name, namespace, region average Number of capacity units that are running as part of the Auto Scaling group.
group_pending_capacity gauge auto_scaling_group_name, namespace, region average Number of capacity units that are pending.
group_standby_capacity gauge auto_scaling_group_name, namespace, region average Number of capacity units that are in a Standby state.
group_terminating_capacity gauge auto_scaling_group_name, namespace, region average Number of capacity units that are in the process of terminating.
group_total_capacity gauge auto_scaling_group_name, namespace, region average Total number of capacity units in the Auto Scaling group.
warm_pool_warmed_capacity gauge auto_scaling_group_name, namespace, region average Amount of capacity available to enter the Auto Scaling group during scale out.
group_and_warm_pool_total_capacity gauge auto_scaling_group_name, namespace, region average Total capacity of the Auto Scaling group and the warm pool combined.
warm_pool_min_size gauge auto_scaling_group_name, namespace, region average Minimum size of the warm pool.
warm_pool_total_capacity gauge auto_scaling_group_name, namespace, region average Total capacity of the warm pool, including instances that are running, stopped, pending, or terminating.
warm_pool_desired_capacity gauge auto_scaling_group_name, namespace, region average Amount of capacity that Amazon EC2 Auto Scaling attempts to maintain in the warm pool.
warm_pool_terminating_capacity gauge auto_scaling_group_name, namespace, region average Amount of capacity in the warm pool that is in the process of terminating.
warm_pool_pending_capacity gauge auto_scaling_group_name, namespace, region average Amount of capacity in the warm pool that is pending.
group_and_warm_pool_desired_capacity gauge auto_scaling_group_name, namespace, region average Desired capacity of the Auto Scaling group and the warm pool combined.
         

AWS/Billing

The AWS/Billing service collects billing metrics, including a breakdown of estimated charges by service.

To get these metrics, you need to enable the AWS Billing collector configuration, AwsBillingCollector, in the Collection Agent YAML file. See Configure Geneos to deploy the AWS plugin in AWS.

Note: The AWS/Billing service is only available through the AwsBillingCollector and not as a service under AwsCollector.

Metric name Metric type Unit name Dimension Statistic Description
estimated_charges gauge USD currency, namespace, region, service_name average Estimated charges for your AWS usage.
budget_limit gauge USD budget_name, namespace, region average Spending limit for your budget period.
actual_spend gauge USD budget_name, namespace, region average Actual spending costs for your budget period.
forecasted_spend gauge USD budget_name, namespace, region average Forecasted spending costs for your budget period.
budget_type attribute   budget_name, namespace, region   Specifies if the budget tracks costs, usage, RI utilization, RI coverage, Savings Plans utilization, or Savings Plans coverage.
           

AWS/CertificateManager

The AWS/CertificateManager service collects metrics from available certificates in ACM.

Metric name Metric type Unit name Dimension Statistic Description
days_to_expiry gauge days certificate_arn, namespace, region minimum Number of remaining days until the certificate expires.
domain_name attribute   certificate_arn, namespace, region   Domain name defined in the certificate.
type attribute   certificate_arn, namespace, region   Certificate type (AMAZON_ISSUED, IMPORTED, PRIVATE, or UNKNOWN_TO_SDK_VERSION).
status attribute   certificate_arn, namespace, region   Certificate status (EXPIRED, FAILED, INACTIVE, ISSUED, PENDING_VALIDATION, REVOKED, VALIDATION_TIMED_OUT, or UNKNOWN_TO_SDK_VERSION).
in_use attribute   certificate_arn, namespace, region   Indicates whether the certificate is in use by another AWS service. Possible values are Yes or No.
renewal_eligibility attribute   certificate_arn, namespace, region   Indicates whether the certificate is eligible for renewal.
           

AWS/DynamoDB

The AWS/DynamoDB service collects metrics from DynamoDB tables.

Metric name Metric type Unit name Dimension Statistic Description
age_of_oldest_unreplicated_record gauge milliseconds table_name, namespace, region max Elapsed time since a record yet to be replicated to the Kinesis data stream first appeared in the DynamoDB table.
conditional_check_failed_requests gauge   table_name, namespace, region average Number of failed attempts to perform conditional writes. The PutItem, UpdateItem, and DeleteItem operations let you provide a logical condition that must evaluate to true before the operation can proceed. If this condition evaluates to false, ConditionalCheckFailedRequests is incremented by one. ConditionalCheckFailedRequests is also incremented by one for PartiQL Update and Delete statements where a logical condition is provided and that condition evaluates to false.
consumed_change_data_capture_units gauge   table_name, namespace, region average Number of consumed change data capture units.
consumed_read_capacity_units gauge   table_name, namespace, region sum Number of read capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed read capacity for a table and all of its global secondary indexes, or for a particular global secondary index.
consumed_write_capacity_units gauge   table_name, namespace, region sum Number of write capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed write capacity for a table and all of its global secondary indexes, or for a particular global secondary index.
failed_to_replicate_record_count gauge   table_name, namespace, region average Number of records that DynamoDB failed to replicate to your Kinesis data stream.
online_index_consumed_write_capacity gauge   table_name, namespace, region average Number of write capacity units consumed when adding a new global secondary index to a table. If the write capacity of the index is too low, the incoming write activity during the backfill phase might be throttled. This can increase the time it takes to create the index. You should monitor this statistic while the index is being built to determine whether the write capacity of the index is underprovisioned.
online_index_percentage_progress gauge   table_name, namespace, region average Percentage of completion when a new global secondary index is being added to a table. DynamoDB must first allocate resources for the new index, and then backfill attributes from the table into the index. For large tables, this process might take a long time. You should monitor this statistic to view the relative progress as DynamoDB builds the index.
online_index_throttle_events gauge   table_name, namespace, region average Number of write throttle events that occur when adding a new global secondary index to a table. These events indicate that the index creation will take longer to complete, because incoming write activity is exceeding the provisioned write throughput of the index.
pending_replication_count gauge   table_name, namespace, region average Number of item updates that are written to one replica table, but that have not yet been written to another replica in the global table. This metric applies to DynamoDB global tables only.
provisioned_read_capacity_units gauge   table_name, namespace, region average Number of provisioned read capacity units for a table or a global secondary index.
provisioned_write_capacity_units gauge   table_name, namespace, region average Number of provisioned write capacity units for a table or a global secondary index.
read_throttle_events gauge   table_name, namespace, region sum Requests to DynamoDB that exceed the provisioned read capacity units for a table or a global secondary index.
replication_latency gauge milliseconds table_name, namespace, region average Elapsed time between an updated item appearing in the DynamoDB stream for one replica table, and that item appearing in another replica in the global table. This metric applies to DynamoDB global tables only.
returned_bytes gauge bytes table_name, namespace, region average Number of bytes returned by GetRecords operations (Amazon DynamoDB Streams) during the specified time period.
returned_item_count gauge   table_name, namespace, region average Number of items returned by Query, Scan or ExecuteStatement (select) operations during the specified time period.
returned_records_count gauge   table_name, namespace, region average Number of stream records returned by GetRecords operations (Amazon DynamoDB Streams) during the specified time period.
successful_request_latency gauge milliseconds table_name, namespace, region average Latency of successful requests to DynamoDB or Amazon DynamoDB Streams during the specified time period. SuccessfulRequestLatency can provide two different kinds of information: the elapsed time for successful requests (Minimum, Maximum, Sum, or Average) or the number of successful requests (SampleCount). SuccessfulRequestLatency reflects activity only within DynamoDB or Amazon DynamoDB Streams, and does not take into account network latency or client-side activity.
system_errors gauge   table_name, namespace, region sum Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 500 status code during the specified time period. An HTTP 500 usually indicates an internal service error.
time_to_live_deleted_item_count gauge   table_name, namespace, region average Number of items deleted by Time to Live (TTL) during the specified time period. This metric helps you monitor the rate of TTL deletions on the table.
throttled_put_record_count gauge   table_name, namespace, region average Number of records that were throttled by the Kinesis data stream due to insufficient Kinesis Data Streams capacity.
throttled_requests gauge   table_name, namespace, region sum Requests to DynamoDB that exceed the provisioned throughput limits on a resource (such as a table or an index). ThrottledRequests is incremented by one if any event within a request exceeds a provisioned throughput limit. For example, if you update an item in a table with global secondary indexes, there are multiple events: a write to the table, and a write to each index. If one or more of these events are throttled, then ThrottledRequests is incremented by one.
transaction_conflict gauge   table_name, namespace, region sum Rejected item-level requests due to transactional conflicts between concurrent requests on the same items.
user_errors gauge   table_name, namespace, region sum Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 400 status code during the specified time period. An HTTP 400 usually indicates a client-side error, such as an invalid combination of parameters, an attempt to update a non-existent table, or an incorrect request signature.
write_throttle_events gauge   table_name, namespace, region sum Requests to DynamoDB that exceed the provisioned write capacity units for a table or a global secondary index.
status attribute   table_name, namespace, region   Current state of the table.
partition_key attribute   table_name, namespace, region   Value of the defined partition key.
sort_key attribute   table_name, namespace, region   Value of the sort key if sort key has been defined.
read_capacity_mode attribute   table_name, namespace, region   Read capacity can either be On-Demand or Partitioned depending on the read capacity settings.
write_capacity_mode attribute   table_name, namespace, region   Write capacity can either be On-Demand or Partitioned depending on the write capacity settings.
table_class attribute   table_name, namespace, region   The table class of the specified table. Valid values are STANDARD and STANDARD_INFREQUENT_ACCESS.
encryption attribute   table_name, namespace, region   Server-side encryption type. The only supported value is KMS if encryption is defined.
size attribute   table_name, namespace, region   Total size of the specified table, in bytes. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value.
item_count attribute   table_name, namespace, region   Number of items in a table.
replicas attribute   table_name, namespace, region   Number of times the given table has been replicated in other regions.
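
The consumed and provisioned capacity metrics above can be combined to estimate how much of the provisioned throughput is in use. The following is a minimal sketch only, not part of the AWS plugin. It assumes that the consumed value is a sum over a known collection period and that provisioned capacity is expressed in units per second. The function name, parameter names, and the 60-second period are hypothetical.

```python
def capacity_utilisation_percent(consumed_sum: float,
                                 period_seconds: int,
                                 provisioned_units: float) -> float:
    """Estimate the percentage of provisioned capacity in use.

    consumed_sum      -- e.g. consumed_read_capacity_units (sum statistic)
    period_seconds    -- length of the period the sum covers (assumed known)
    provisioned_units -- e.g. provisioned_read_capacity_units (per second)
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if provisioned_units <= 0:
        raise ValueError("provisioned_units must be positive")
    # Convert the sum over the period into an average per-second rate,
    # which is directly comparable to the provisioned value.
    consumed_per_second = consumed_sum / period_seconds
    return 100.0 * consumed_per_second / provisioned_units


# Hypothetical sample: 1800 units consumed in 60 seconds, 50 units provisioned.
print(capacity_utilisation_percent(1800, 60, 50))  # 60.0
```

The same calculation applies to the write capacity metrics. It does not apply to tables that have no provisioned capacity value to compare against.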
           

AWS/EBS

The AWS/EBS service collects metrics from non-deleted and non-error EBS volumes.

Metric name Metric type Unit name Dimension Statistic Description
burst_balance gauge percent volume_id, namespace, region average Provides information about the percentage of I/O credits (for gp2) or throughput credits (for st1 and sc1) remaining in the burst bucket.
size attribute gibibytes volume_id, namespace, region average Size of the volume in GiBs.
volume_idle_time gauge seconds volume_id, namespace, region average Total number of seconds in a specified period of time when no read or write operations were submitted.
volume_queue_length gauge   volume_id, namespace, region average Number of read and write operation requests waiting to be completed in a specified period of time.
volume_read_bytes gauge bytes volume_id, namespace, region average Provides information on the read operations in a specified period of time.
volume_read_ops gauge   volume_id, namespace, region average Total number of read operations in a specified period of time. Note that read operations are counted on completion.
volume_total_read_time gauge seconds volume_id, namespace, region average Total number of seconds spent by all read operations that completed in a specified period of time.
volume_total_write_time gauge seconds volume_id, namespace, region average Total number of seconds spent by all write operations that completed in a specified period of time.
volume_write_bytes gauge bytes volume_id, namespace, region average Provides information on the write operations in a specified period of time.
volume_write_ops gauge   volume_id, namespace, region average Total number of write operations in a specified period of time. Note that write operations are counted on completion.
           

AWS/EC2

The AWS/EC2 service collects metrics from non-stopped and non-terminated EC2 instances.

Metric name Metric type Unit name Dimension Statistic Description
cpu_credit_balance gauge minutes instance_id, namespace, region average Number of earned CPU credits that an instance has accrued since it was launched or started. For T2 Standard, the CPUCreditBalance also includes the number of launch credits that have been accrued.
cpu_credit_usage gauge minutes instance_id, namespace, region average Number of CPU credits spent by the instance for CPU utilisation. One CPU credit equals one vCPU running at 100% utilisation for one minute, or an equivalent combination of vCPUs, utilisation, and time (for example, one vCPU running at 50% utilisation for two minutes or two vCPUs running at 25% utilisation for two minutes).
cpu_surplus_credit_balance gauge minutes instance_id, namespace, region average Number of surplus credits that have been spent by an unlimited instance when its CPUCreditBalance value is zero.
cpu_surplus_credits_charged gauge minutes instance_id, namespace, region average Number of spent surplus credits that are not paid down by earned CPU credits, and which thus incur an additional charge.
ebs_byte_balance_percent gauge percent instance_id, namespace, region average Provides information about the percentage of throughput credits remaining in the burst bucket. This metric is only available for basic monitoring.
ebs_io_balance_percent gauge percent instance_id, namespace, region average Provides information about the percentage of I/O credits remaining in the burst bucket. This metric is available for basic monitoring only.
ebs_read_ops gauge   instance_id, namespace, region average Completed read operations from all Amazon EBS volumes attached to the instance in a specified period of time.
ebs_write_ops gauge   instance_id, namespace, region average Completed write operations to all EBS volumes attached to the instance in a specified period of time.
ebs_read_bytes gauge bytes instance_id, namespace, region average Bytes read from all EBS volumes attached to the instance in a specified period of time.
ebs_write_bytes gauge bytes instance_id, namespace, region average Bytes written to all EBS volumes attached to the instance in a specified period of time.
metadata_no_token gauge   instance_id, namespace, region average Number of times the instance metadata service was successfully accessed using a method that does not use a token.
state attribute   instance_id, namespace, region average Current state of the instance.
instance_type attribute   instance_id, namespace, region average Instance type.
architecture attribute   instance_id, namespace, region average Architecture of the image.
private_ip attribute   instance_id, namespace, region average Private IPv4 address assigned to the instance.
status_check_failed attribute   instance_id, namespace, region average Status checks for instances and systems.
status_check_failed_instance attribute   instance_id, namespace, region average Instance status checks monitor the software and network configuration of your individual instance.
status_check_failed_system attribute   instance_id, namespace, region average System status checks monitor the AWS systems on which your instance runs.
cpu_utilization gauge percent instance_id, namespace, region average Percentage of the allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application on a selected instance.
disk_read_bytes gauge bytes instance_id, namespace, region average Bytes read from all instance store volumes available to the instance.
disk_read_ops gauge   instance_id, namespace, region average Completed read operations from all instance store volumes available to the instance in a specified period of time.
disk_write_bytes gauge bytes instance_id, namespace, region average Bytes written to all instance store volumes available to the instance.
disk_write_ops gauge   instance_id, namespace, region average Completed write operations to all instance store volumes available to the instance in a specified period of time.
network_in gauge bytes instance_id, namespace, region average Number of bytes received by the instance on all network interfaces. This metric identifies the volume of incoming network traffic to a single instance.
network_out gauge bytes instance_id, namespace, region average Number of bytes sent out by the instance on all network interfaces. This metric identifies the volume of outgoing network traffic from a single instance.
network_packets_in gauge   instance_id, namespace, region average Number of packets received by the instance on all network interfaces. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance.
network_packets_out gauge   instance_id, namespace, region average Number of packets sent out by the instance on all network interfaces. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance.
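
The CPU credit arithmetic described for cpu_credit_usage can be checked with a short calculation. This is an illustrative sketch only. The function name and parameters are hypothetical and are not part of the AWS plugin or the AWS API.

```python
def cpu_credits_spent(vcpus: int, utilisation_percent: float, minutes: float) -> float:
    """Return the CPU credits spent for a given load.

    One CPU credit equals one vCPU running at 100% utilisation for one minute,
    so credits = vCPUs x utilisation fraction x minutes.
    """
    return vcpus * (utilisation_percent / 100.0) * minutes


# The three equivalent combinations from the cpu_credit_usage description.
examples = [
    (1, 100.0, 1.0),  # one vCPU at 100% for one minute
    (1, 50.0, 2.0),   # one vCPU at 50% for two minutes
    (2, 25.0, 2.0),   # two vCPUs at 25% for two minutes
]
for vcpus, utilisation, minutes in examples:
    print(vcpus, utilisation, minutes, cpu_credits_spent(vcpus, utilisation, minutes))
```

Each combination evaluates to one CPU credit.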

           

AWS/ECS

The AWS/ECS service collects metrics from non-failed and non-inactive ECS clusters.

Metric name Metric type Unit name Dimension Statistic Description
cpu_reservation gauge percent cluster_name, namespace, region average Percentage of CPU units that are reserved by running tasks in the cluster.
cpu_utilization gauge percent cluster_name, namespace, region average Percentage of CPU units that are used in the cluster.
cpu_utilization gauge percent cluster_name, service_name, namespace, region average Percentage of CPU units that are used in the service.
memory_reservation gauge percent cluster_name, namespace, region average Percentage of memory that is reserved by running tasks in the cluster.
memory_utilization gauge percent cluster_name, namespace, region average Percentage of memory that is used in the cluster.
memory_utilization gauge percent cluster_name, service_name, namespace, region average Percentage of memory that is used in the service.
gpu_reservation gauge percent cluster_name, namespace, region average Percentage of total available GPUs that are reserved by running tasks in the cluster.
status attribute   cluster_name, namespace, region average Status of the cluster.
           

AWS/EFS

The AWS/EFS service collects metrics from non-deleted and non-error elastic file systems.

Metric name Metric type Unit name Dimension Statistic Description
percent_io_limit gauge percent file_system_id, namespace, region average Shows how close a file system is to reaching the I/O limit of the General Purpose performance mode.
burst_credit_balance gauge bytes file_system_id, namespace, region average Number of burst credits that a file system has.
permitted_throughput gauge bytes_per_second file_system_id, namespace, region average Maximum amount of throughput that a file system can drive.
metered_io_bytes gauge bytes file_system_id, namespace, region average Number of metered bytes for each file system operation, including data read, data write, and metadata operations, with read operations metered at one-third the rate of other operations.
total_io_bytes gauge bytes file_system_id, namespace, region average Number of bytes for each file system operation, including data read, data write, and metadata operations.
data_read_io_bytes gauge bytes file_system_id, namespace, region average Number of bytes for each file system read operation.
data_write_io_bytes gauge bytes file_system_id, namespace, region average Number of bytes for each file system write operation.
metadata_io_bytes gauge bytes file_system_id, namespace, region average Number of bytes for each metadata operation.
client_connections gauge   file_system_id, namespace, region average Number of client connections to a file system.
storage_bytes gauge bytes file_system_id, storage_class, namespace, region sum Size of the file system in bytes, including the amount of data stored in the EFS Standard and EFS Standard–Infrequent Access (EFS Standard-IA) storage classes.
life_cycle_state attribute   file_system_id, namespace, region average Lifecycle phase of the file system.
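
The relationship between the byte metrics in the table above can be illustrated with a short calculation. This is a minimal sketch, assuming only what the metered_io_bytes description states: read operations are metered at one-third the rate of write and metadata operations. The function names and sample values are hypothetical and are not part of the plugin.

```python
def total_io_bytes(data_read: float, data_write: float, metadata: float) -> float:
    """Plain sum of read, write, and metadata bytes."""
    return data_read + data_write + metadata


def metered_io_bytes(data_read: float, data_write: float, metadata: float) -> float:
    """Metered bytes, assuming reads count at one-third of their size.

    Assumption: this mirrors the wording of the metered_io_bytes
    description; it is an illustration, not the plugin's own code.
    """
    return data_read / 3 + data_write + metadata


# Hypothetical sample values for one collection period.
print(total_io_bytes(3000, 500, 100))    # 3600
print(metered_io_bytes(3000, 500, 100))  # 1600.0
```

Comparing the two values shows how a read-heavy workload can report a much lower metered figure than its total I/O.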
           

AWS/EKS

The AWS/EKS service collects metrics from non-failed EKS clusters.

Metric name Metric type Dimension Description
status attribute cluster_name, namespace, region Current status of the cluster.
status attribute cluster_name, node_group_name, namespace, region Current status of the managed node group.
       

AWS/ElastiCache

The AWS/ElastiCache service collects metrics from non-deleted Amazon ElastiCache clusters.

Metric name Metric type Unit name Dimension Statistic Description
cpu_utilization gauge percent cache_cluster_id, namespace, region average Percentage of CPU utilisation for the entire host.
cpu_credit_balance gauge minutes cache_cluster_id, namespace, region average Number of earned CPU credits that an instance has accrued since it was launched or started.
cpu_credit_usage gauge minutes cache_cluster_id, namespace, region average Number of CPU credits spent by the instance for CPU utilisation.
freeable_memory gauge bytes cache_cluster_id, namespace, region average Amount of free memory available on the host.
network_bytes_in gauge bytes cache_cluster_id, namespace, region average Number of bytes the host has read from the network.
network_bytes_out gauge bytes cache_cluster_id, namespace, region average Number of bytes sent out on all network interfaces by the instance.
network_packets_in gauge   cache_cluster_id, namespace, region average Number of packets received on all network interfaces by the instance.
network_packets_out gauge   cache_cluster_id, namespace, region average Number of packets sent out on all network interfaces by the instance.
network_bandwidth_in_allowance_exceeded gauge   cache_cluster_id, namespace, region average Number of packets shaped because the inbound aggregate bandwidth exceeded the maximum for the instance.
network_conntrack_allowance_exceeded gauge   cache_cluster_id, namespace, region average Number of packets shaped because connection tracking exceeded the maximum for the instance and new connections could not be established.
network_link_local_allowance_exceeded gauge   cache_cluster_id, namespace, region average Number of packets shaped because the PPS of the traffic to local proxy services exceeded the maximum for the network interface.
network_bandwidth_out_allowance_exceeded gauge   cache_cluster_id, namespace, region average Number of packets shaped because the outbound aggregate bandwidth exceeded the maximum for the instance.
network_packets_per_second_allowance_exceeded gauge   cache_cluster_id, namespace, region average Number of packets shaped because the bidirectional packets per second exceeded the maximum for the instance.
swap_usage gauge bytes cache_cluster_id, namespace, region average Amount of swap used on the host.
active_defrag_hits gauge   cache_cluster_id, namespace, region average Number of value reallocations per minute performed by the active defragmentation process.
authentication_failures gauge   cache_cluster_id, namespace, region average Total number of failed attempts to authenticate to Redis using the AUTH command.
bytes_used_for_cache gauge bytes cache_cluster_id, namespace, region average Total number of bytes allocated by Redis for all purposes, including the dataset, buffers, and so on.
bytes_read_from_disk gauge bytes cache_cluster_id, namespace, region average Total number of bytes read from disk per minute.
bytes_written_to_disk gauge bytes cache_cluster_id, namespace, region average Total number of bytes written to disk per minute.
cache_hits gauge   cache_cluster_id, namespace, region average Number of successful read-only key lookups in the main dictionary.
cache_misses gauge   cache_cluster_id, namespace, region average Number of unsuccessful read-only key lookups in the main dictionary.
command_authorization_failures gauge   cache_cluster_id, namespace, region average Total number of failed attempts by users to run commands they do not have permission to call.
cache_hit_rate gauge percent cache_cluster_id, namespace, region average Indicates the usage efficiency of the Redis instance.
curr_connections gauge   cache_cluster_id, namespace, region average

For Redis, this is the number of client connections, excluding connections from read replicas.

For Memcached, this is the number of connections to the cache at a given instant.

curr_items gauge   cache_cluster_id, namespace, region average For Redis and Memcached, this is the number of items in the cache.
curr_volatile_items gauge   cache_cluster_id, namespace, region average Total number of keys in all databases that have a TTL set.
database_memory_usage_percentage gauge percent cache_cluster_id, namespace, region average Percentage of the memory available for the cluster that is in use.
db0_average_ttl gauge milliseconds cache_cluster_id, namespace, region average Exposes avg_ttl of DB0 from the keyspace statistic of the Redis INFO command.
engine_cpu_utilization gauge percent cache_cluster_id, namespace, region average Provides CPU utilisation of the Redis engine thread.
evictions gauge   cache_cluster_id, namespace, region average

For Redis, this is the number of keys that have been evicted due to the maxmemory limit.

For Memcached, this is the number of non-expired items the cache evicted to allow space for new writes.

global_datastore_replication_lag gauge seconds cache_cluster_id, namespace, region average Lag between the secondary region's primary node and the primary region's primary node.
is_primary attribute   cache_cluster_id, namespace, region   Indicates if the node is the primary node of the current shard or cluster. The metric can be either 0 (not primary) or 1 (primary).
key_authorization_failures gauge   cache_cluster_id, namespace, region average Total number of failed attempts by users to access keys they do not have permission to access.
keys_tracked gauge   cache_cluster_id, namespace, region average Number of keys being tracked by Redis key tracking as a percentage of tracking-table-max-keys.
memory_fragmentation_ratio gauge   cache_cluster_id, namespace, region average Indicates the efficiency in the allocation of memory of the Redis engine.
new_connections gauge   cache_cluster_id, namespace, region average

For Redis, this is the total number of connections that have been accepted by the server during this period.

For Memcached, this is the number of new connections the cache has received.

num_items_read_from_disk gauge   cache_cluster_id, namespace, region average Total number of items retrieved from disk per minute.
num_items_written_to_disk gauge   cache_cluster_id, namespace, region average Total number of items written to disk per minute.
primary_link_health_status gauge   cache_cluster_id, namespace, region average A value of 0 indicates that data in the ElastiCache primary node is not in sync with Redis on EC2. A value of 1 indicates that the data is in sync.
reclaimed gauge   cache_cluster_id, namespace, region average

For Redis, this is the total number of key expiration events.

For Memcached, this is the number of expired items the cache evicted to allow space for new writes.

replication_bytes gauge bytes cache_cluster_id, namespace, region average For nodes in a replicated configuration, ReplicationBytes reports the number of bytes that the primary is sending to all of its replicas.
replication_lag gauge seconds cache_cluster_id, namespace, region average

This metric is only applicable for a node running as a read replica.

It represents how far behind, in seconds, the replica is in applying changes from the primary node.

save_in_progress gauge   cache_cluster_id, namespace, region max This binary metric returns 1 whenever a background save (forked or forkless) is in progress, and 0 otherwise.
cluster_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are cluster-based.
cluster_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of cluster-based commands.
eval_based_cmds gauge   cache_cluster_id, namespace, region average Total number of eval-based commands.
eval_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of eval-based commands.
geo_spatial_based_cmds gauge   cache_cluster_id, namespace, region average Total number of geospatial-based commands.
geo_spatial_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of geospatial-based commands.
get_type_cmds gauge   cache_cluster_id, namespace, region average Total number of read-only type commands.
get_type_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of read commands.
hash_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are hash-based.
hash_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of hash-based commands.
hyper_log_log_based_cmds gauge   cache_cluster_id, namespace, region average Total number of HyperLogLog-based commands.
hyper_log_log_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of HyperLogLog-based commands.
key_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are key-based.
key_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of key-based commands.
list_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are list-based.
list_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of list-based commands.
pub_sub_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands for pub/sub functionality.
pub_sub_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of pub/sub-based commands.
set_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are set-based.
set_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of set-based commands.
set_type_cmds gauge   cache_cluster_id, namespace, region average Total number of write-type commands.
set_type_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of write commands.
sorted_set_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are sorted set-based.
sorted_set_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of sorted set-based commands.
string_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are string-based.
string_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of string-based commands.
stream_based_cmds gauge   cache_cluster_id, namespace, region average Total number of commands that are stream-based.
stream_based_cmds_latency gauge microseconds cache_cluster_id, namespace, region average Latency of stream-based commands.
bytes_read_into_memcached gauge bytes cache_cluster_id, namespace, region average Number of bytes that have been read from the network by the cache node.
bytes_used_for_cache_items gauge bytes cache_cluster_id, namespace, region average Number of bytes used to store cache items.
bytes_written_out_from_memcached gauge bytes cache_cluster_id, namespace, region average Number of bytes that have been written to the network by the cache node.
cas_badval gauge   cache_cluster_id, namespace, region average Number of CAS (check and set) requests the cache has received where the CAS value did not match the CAS value stored.
cas_hits gauge   cache_cluster_id, namespace, region average Number of CAS requests the cache has received where the requested key was found and the CAS value matched.
cas_misses gauge   cache_cluster_id, namespace, region average Number of CAS requests the cache has received where the key requested was not found.
cmd_flush gauge   cache_cluster_id, namespace, region average Number of flush commands the cache has received.
cmd_gets gauge   cache_cluster_id, namespace, region average Number of get commands the cache has received.
cmd_set gauge   cache_cluster_id, namespace, region average Number of set commands the cache has received.
decr_hits gauge   cache_cluster_id, namespace, region average Number of decrement requests the cache has received where the requested key was found.
decr_misses gauge   cache_cluster_id, namespace, region average Number of decrement requests the cache has received where the requested key was not found.
delete_hits gauge   cache_cluster_id, namespace, region average Number of delete requests the cache has received where the requested key was found.
delete_misses gauge   cache_cluster_id, namespace, region average Number of delete requests the cache has received where the requested key was not found.
get_hits gauge   cache_cluster_id, namespace, region average Number of get requests the cache has received where the key requested was found.
get_misses gauge   cache_cluster_id, namespace, region average Number of get requests the cache has received where the key requested was not found.
incr_hits gauge   cache_cluster_id, namespace, region average Number of increment requests the cache has received where the key requested was found.
incr_misses gauge   cache_cluster_id, namespace, region average Number of increment requests the cache has received where the key requested was not found.
bytes_used_for_hash gauge   cache_cluster_id, namespace, region average Number of bytes currently used by hash tables.
cmd_config_get gauge   cache_cluster_id, namespace, region average Cumulative number of config get requests.
cmd_config_set gauge   cache_cluster_id, namespace, region average Cumulative number of config set requests.
cmd_touch gauge   cache_cluster_id, namespace, region average Cumulative number of touch requests.
curr_config gauge   cache_cluster_id, namespace, region average Current number of configurations stored.
evicted_unfetched gauge   cache_cluster_id, namespace, region average Number of valid items evicted from the least recently used cache (LRU) which were never touched after being set.
expired_unfetched gauge   cache_cluster_id, namespace, region average Number of expired items reclaimed from the LRU which were never touched after being set.
slabs_moved gauge   cache_cluster_id, namespace, region average Total number of slab pages that have been moved.
touch_hits gauge   cache_cluster_id, namespace, region average Number of keys that have been touched and were given a new expiration time.
touch_misses gauge   cache_cluster_id, namespace, region average Number of items that have been touched, but were not found.
new_items gauge   cache_cluster_id, namespace, region average Number of new items the cache has stored.
unused_memory gauge bytes cache_cluster_id, namespace, region average Amount of memory not used by data.
state attribute   cache_cluster_id, namespace, region  

Current state of this cluster.

Possible values: available, creating, deleted, deleting, incompatible-network, modifying, rebooting cluster nodes, restore-failed, or snapshotting.

cache_node_type attribute   cache_cluster_id, namespace, region   Name of the compute and memory capacity node type for the cluster.
cache_cluster_create_time attribute   cache_cluster_id, namespace, region   Date and time when the cluster was created.
cache_subnet_group_name attribute   cache_cluster_id, namespace, region   Name of the cache subnet group associated with the cluster.
endpoint attribute   cache_cluster_id, namespace, region   Represents a Memcached cluster endpoint which can be used by an application to connect to any node in the cluster.
engine attribute   cache_cluster_id, namespace, region   Name of the cache engine (Memcached or Redis) to be used for this cluster.
engine_version attribute   cache_cluster_id, namespace, region   Version of the cache engine that is used in this cluster.
num_cache_nodes attribute   cache_cluster_id, namespace, region   Number of cache nodes in the cluster.
preferred_availability_zone attribute   cache_cluster_id, namespace, region   Name of the Availability Zone in which the cluster is located or "Multiple" if the cache nodes are located in different Availability Zones.
preferred_maintenance_window attribute   cache_cluster_id, namespace, region   Specifies the weekly time range during which maintenance on the cluster is performed.
cache_parameter_group_name attribute   cache_cluster_id, namespace, region   Name of the cache parameter group.
replication_group_id attribute   cache_cluster_id, namespace, region   Replication group to which this cluster belongs.
snapshot_retention_limit attribute   cache_cluster_id, namespace, region   Number of days for which ElastiCache retains automatic cluster snapshots before deleting them.
snapshot_window attribute   cache_cluster_id, namespace, region   Daily time range (in UTC) during which ElastiCache begins taking a daily snapshot of your cluster.
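
The cache_hit_rate metric in the table above relates to cache_hits and cache_misses. The sketch below assumes the usual definition, hits divided by total lookups, expressed as a percentage. The function name and sample values are hypothetical, and the plugin reads the rate from CloudWatch rather than computing it this way.

```python
from typing import Optional


def cache_hit_rate(cache_hits: float, cache_misses: float) -> Optional[float]:
    """Return the hit rate as a percentage, or None when there were no lookups.

    Assumption: hit rate = cache_hits / (cache_hits + cache_misses).
    """
    lookups = cache_hits + cache_misses
    if lookups == 0:
        # No read-only key lookups in the period, so the rate is undefined.
        return None
    return 100.0 * cache_hits / lookups


# Hypothetical sample: 900 hits and 100 misses in one period.
print(cache_hit_rate(900, 100))  # 90.0
```

Returning None for an idle period avoids reporting a misleading 0 percent when the cache received no lookups.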
           

AWS/ELB

The AWS/ELB service collects metrics from classic elastic load balancers.

Metric name Metric type Unit name Dimension Statistic Description
backend_connection_errors gauge   load_balancer_name, namespace, region sum

Number of connections that were not successfully established between the load balancer and the registered instances.

Since the load balancer retries the connection when there are errors, this count can exceed the request rate. Note that this count also includes any connection errors related to health checks.

desyncmitigationmode_noncompliant_request_count gauge   load_balancer_name, namespace, region sum HTTP listener: Number of requests that do not comply with RFC 7230.
healthy_host_count gauge   load_balancer_name, namespace, region max

Number of healthy instances registered with the load balancer.

A newly registered instance is considered healthy after it passes the first health check. If cross-zone load balancing is enabled, the number of healthy instances for the LoadBalancerName dimension is calculated across all Availability Zones. Otherwise, it is calculated per Availability Zone.

httpcode_backend_2xx gauge   load_balancer_name, namespace, region sum

HTTP listener: Number of HTTP 2XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

httpcode_backend_3xx gauge   load_balancer_name, namespace, region sum

HTTP listener: Number of HTTP 3XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

httpcode_backend_4xx gauge   load_balancer_name, namespace, region sum

HTTP listener: Number of HTTP 4XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

httpcode_backend_5xx gauge   load_balancer_name, namespace, region sum

HTTP listener: Number of HTTP 5XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

httpcode_elb_4xx gauge   load_balancer_name, namespace, region sum

HTTP listener: Number of HTTP 4XX client error codes generated by the load balancer.

Client errors are generated when a request is malformed or incomplete.

httpcode_elb_5xx gauge   load_balancer_name, namespace, region sum

HTTP listener: Number of HTTP 5XX server error codes generated by the load balancer.

This count does not include any response codes generated by the registered instances. The metric is reported if there are no healthy instances registered to the load balancer, or if the request rate exceeds the capacity of the instances (spillover) or the load balancer.

latency gauge seconds load_balancer_name, namespace, region average

HTTP listener: Total time elapsed, in seconds, from the time the load balancer sent the request to a registered instance until the instance started to send the response headers.

TCP listener: Total time elapsed, in seconds, for the load balancer to successfully establish a connection to a registered instance.

request_count gauge   load_balancer_name, namespace, region sum

Number of requests completed or connections made during the specified interval (1 or 5 minutes).

HTTP listener: Number of requests received and routed, including HTTP error responses from the registered instances.

TCP listener: Number of connections made to the registered instances.

spillover_count gauge   load_balancer_name, namespace, region sum

Total number of requests that were rejected because the surge queue is full.

HTTP listener: Load balancer returns an HTTP 503 error code.

TCP listener: Load balancer closes the connection.

surge_queue_length gauge   load_balancer_name, namespace, region max

Total number of requests (HTTP listener) or connections (TCP listener) that are pending routing to a healthy instance.

The maximum size of the queue is 1024. Additional requests or connections are rejected when the queue is full. For more information, see spillover_count.

unhealthy_host_count gauge   load_balancer_name, namespace, region max

Number of unhealthy instances registered with your load balancer.

An instance is considered unhealthy after it exceeds the unhealthy threshold configured for health checks. An unhealthy instance is considered healthy again after it meets the healthy threshold configured for health checks.

estimated_alb_active_connection_count gauge   load_balancer_name, namespace, region average Estimated number of concurrent TCP connections active from clients to the load balancer and from the load balancer to targets.
estimated_alb_consumed_lcus gauge   load_balancer_name, namespace, region average Estimated number of load balancer capacity units (LCU) used by an Application Load Balancer. You pay for the number of LCUs that you use per hour.
estimated_alb_new_connection_count gauge   load_balancer_name, namespace, region average Estimated number of new TCP connections established from clients to the load balancer and from the load balancer to targets.
estimated_processed_bytes gauge bytes load_balancer_name, namespace, region average Estimated number of bytes processed by an Application Load Balancer.
dns_name attribute   load_balancer_name, namespace, region   DNS name of the load balancer.
scheme attribute   load_balancer_name, namespace, region   Type of load balancer. Valid only for load balancers in a VPC.
vpc_id attribute   load_balancer_name, namespace, region   ID of the VPC for the load balancer.
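
The surge_queue_length and spillover_count metrics in the table above can be read together to judge queue pressure. The sketch below assumes the 1024 queue limit stated in the surge_queue_length description. The 80 percent warning threshold, the function name, and the status labels are hypothetical choices for illustration.

```python
SURGE_QUEUE_MAX = 1024  # maximum queue size given in the surge_queue_length description
WARN_FRACTION = 0.8     # hypothetical warning threshold, not a plugin setting


def surge_queue_status(surge_queue_length: int, spillover_count: int) -> str:
    """Classify queue pressure from two Classic Load Balancer metrics."""
    if spillover_count > 0:
        # Requests or connections were rejected because the queue was full.
        return "rejecting"
    if surge_queue_length >= SURGE_QUEUE_MAX * WARN_FRACTION:
        return "near-capacity"
    return "ok"


print(surge_queue_status(100, 0))   # ok
print(surge_queue_status(900, 0))   # near-capacity
print(surge_queue_status(1024, 5))  # rejecting
```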
           

AWS/Events

The AWS/Events service collects EventBridge metrics.

Metric name Metric type Dimension Statistic Description
dead_letter_invocations gauge rule_name, event_bus_name, namespace, region sum

Number of times a rule’s target is not invoked in response to an event.

This includes invocations that would result in running the same rule again, causing an infinite loop.

failed_invocations gauge rule_name, event_bus_name, namespace, region sum

Number of invocations that failed permanently.

This does not include invocations that are retried or invocations that succeeded after a retry attempt.

It also does not count failed invocations that are counted in DeadLetterInvocations.

invocations gauge rule_name, event_bus_name, namespace, region sum

Number of times a target is invoked by a rule in response to an event.

This includes successful and failed invocations, but does not include throttled or retried attempts until they fail permanently. It does not include DeadLetterInvocations.

EventBridge only sends this metric to CloudWatch if it is not zero.

invocations_failed_to_be_sent_to_dlq gauge rule_name, event_bus_name, namespace, region sum

Number of invocations that cannot be moved to a dead-letter queue.

Dead-letter queue errors occur due to permissions errors, unavailable resources, or size limits. EventBridge only sends this metric to CloudWatch if it is not zero.

invocations_sent_to_dlq gauge rule_name, event_bus_name, namespace, region sum

Number of invocations that are moved to a dead-letter queue.

EventBridge only sends this metric to CloudWatch if it is not zero.

throttled_rules gauge rule_name, event_bus_name, namespace, region sum Number of rules that have tried to run but are being throttled.
triggered_rules gauge rule_name, event_bus_name, namespace, region  

Number of rules that have run and matched with any event.

You cannot see this metric in CloudWatch until a rule is triggered.

state attribute rule_name, event_bus_name, namespace, region   Indicates if a rule is enabled or disabled.
event_pattern attribute rule_name, event_bus_name, namespace, region   Event pattern that triggers this rule.
schedule_expression attribute rule_name, event_bus_name, namespace, region   Rule is triggered based on the specified schedule expression.
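
Several metrics in the table above are only sent to CloudWatch when they are not zero, so a series for these metrics can have gaps. The sketch below shows one way to treat a missing period as zero when post-processing such a series. It assumes the rule existed and collection was working for the whole window. The function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict


def fill_missing_with_zero(
    datapoints: Dict[datetime, float],
    start: datetime,
    end: datetime,
    period: timedelta,
) -> Dict[datetime, float]:
    """Return one value per period in [start, end), using 0.0 where no datapoint exists."""
    filled = {}
    t = start
    while t < end:
        filled[t] = datapoints.get(t, 0.0)
        t += period
    return filled


# Hypothetical sample: only one of three 5-minute periods reported a value.
start = datetime(2024, 1, 1, 0, 0)
sparse = {start + timedelta(minutes=5): 3.0}
series = fill_missing_with_zero(
    sparse, start, start + timedelta(minutes=15), timedelta(minutes=5)
)
print(list(series.values()))  # [0.0, 3.0, 0.0]
```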
         

AWS/GatewayELB

The AWS/GatewayELB service collects metrics from Gateway Load Balancers.

Metric name Metric type Unit name Dimension Statistic Description
active_flow_count gauge   load_balancer, namespace, region average Total number of concurrent flows (or connections) from clients to targets.
consumed_lcus gauge   load_balancer, namespace, region average Number of load balancer capacity units (LCU) used by the load balancer.
healthy_host_count gauge   load_balancer, namespace, region max Number of targets that are considered healthy.
new_flow_count gauge   load_balancer, namespace, region sum Total number of new flows (or connections) established from clients to targets in the time period.
processed_bytes gauge bytes load_balancer, namespace, region sum Total number of bytes processed by the load balancer.
unhealthy_host_count gauge   load_balancer, namespace, region max Number of targets that are considered unhealthy.
state attribute   load_balancer, namespace, region   State of the load balancer.
vpc_id attribute   load_balancer, namespace, region   ID of the VPC for the load balancer.
availability_zones attribute   load_balancer, namespace, region   Subnets for the load balancer.
created_time attribute   load_balancer, namespace, region   Date and time the load balancer was created.
scheme attribute   load_balancer, namespace, region   Nodes of an internet-facing load balancer that have public IP addresses.
ip_address_type attribute   load_balancer, namespace, region   Type of IP addresses used by the subnets for the load balancer.
dns_name attribute   load_balancer, namespace, region   Public DNS name of the load balancer.
           

AWS/Kinesis

The AWS/Kinesis service collects metrics from Kinesis streams.

Metric name Metric type Unit name Dimension Statistic Description
get_records_bytes gauge bytes stream_name, namespace, region average Number of bytes retrieved from the Kinesis stream, measured over the specified time period.
get_records_iterator_age_milliseconds gauge milliseconds stream_name, namespace, region average

Age of the last record in all GetRecords calls made against a Kinesis stream, measured over the specified time period. Age is the difference between the current time and when the last record of the GetRecords call was written to the stream.

A value of zero indicates that the records being read are completely caught up with the stream.

get_records_latency gauge milliseconds stream_name, namespace, region average Time taken per GetRecords operation, measured over the specified time period.
get_records_records gauge   stream_name, namespace, region average Number of records retrieved from the shard, measured over the specified time period.
get_records_success gauge   stream_name, namespace, region average Number of successful GetRecords operations per stream, measured over the specified time period.
incoming_bytes gauge   stream_name, namespace, region average Number of bytes successfully put to the Kinesis stream over the specified time period. This metric includes bytes from PutRecord and PutRecords operations.
incoming_records gauge   stream_name, namespace, region average Number of records successfully put to the Kinesis stream over the specified time period. This metric includes record counts from PutRecord and PutRecords operations.
put_record_bytes gauge bytes stream_name, namespace, region average Number of bytes put to the Kinesis stream using the PutRecord operation over the specified time period.
put_record_latency gauge milliseconds stream_name, namespace, region average Time taken per PutRecord operation, measured over the specified time period.
put_record_success gauge   stream_name, namespace, region average Number of successful PutRecord operations per Kinesis stream, measured over the specified time period. Average reflects the percentage of successful writes to a stream.
put_records_total_records gauge   stream_name, namespace, region average Total number of records sent in a PutRecords operation per Kinesis data stream, measured over the specified time period.
put_records_successful_records gauge   stream_name, namespace, region average Number of successful records in a PutRecords operation per Kinesis data stream, measured over the specified time period.
put_records_failed_records gauge   stream_name, namespace, region average Number of records rejected due to internal failures in a PutRecords operation per Kinesis data stream, measured over the specified time period. Occasional internal failures are to be expected and should be retried.
put_records_throttled_records gauge   stream_name, namespace, region average Number of records rejected due to throttling in a PutRecords operation per Kinesis data stream, measured over the specified time period.
read_provisioned_throughput_exceeded gauge   stream_name, namespace, region average Number of GetRecords calls throttled for the stream over the specified time period.
subscribe_to_shard_rate_exceeded gauge   stream_name, consumer_name, namespace, region average This metric is emitted when a new subscription attempt fails because the same consumer already has an active subscription, or because you exceed the number of calls per second allowed for this operation.
subscribe_to_shard_success gauge   stream_name, consumer_name, namespace, region average This metric records whether the SubscribeToShard subscription was successfully established. A subscription lasts at most 5 minutes, so this metric is emitted at least once every 5 minutes.
subscribe_to_shard_event_bytes gauge bytes stream_name, consumer_name, namespace, region average

Number of bytes received from the shard, measured over the specified time period.

Minimum, Maximum, and Average statistics represent the bytes published in a single event for the specified time period.

subscribe_to_shard_event_millis_behind_latest gauge milliseconds stream_name, consumer_name, namespace, region average Difference between the current time and when the last record of the SubscribeToShard event was written to the stream.
subscribe_to_shard_event_records gauge   stream_name, consumer_name, namespace, region average

Number of records received from the shard, measured over the specified time period.

Minimum, Maximum, and Average statistics represent the records in a single event for the specified time period.

subscribe_to_shard_event_success gauge   stream_name, consumer_name, namespace, region average This metric is emitted every time an event is published successfully. It is only emitted when there is an active subscription.
write_provisioned_throughput_exceeded gauge   stream_name, namespace, region average

Number of records rejected due to throttling for the stream over the specified time period.

This metric includes throttling from PutRecord and PutRecords operations. The most commonly used statistic for this metric is Average.

incoming_bytes gauge bytes stream_name, shard_id, namespace, region average Number of bytes successfully put to the shard over the specified time period. This metric includes bytes from PutRecord and PutRecords operations.
incoming_records gauge   stream_name, shard_id, namespace, region average Number of records successfully put to the shard over the specified time period. This metric includes record counts from PutRecord and PutRecords operations.
iterator_age_milliseconds gauge milliseconds stream_name, shard_id, namespace, region average

Age of the last record in all GetRecords calls made against a shard, measured over the specified time period.

Age is the difference between the current time and when the last record of the GetRecords call was written to the stream.

outgoing_bytes gauge bytes stream_name, shard_id, namespace, region average Number of bytes retrieved from the shard, measured over the specified time period.
outgoing_records gauge   stream_name, shard_id, namespace, region average Number of records retrieved from the shard, measured over the specified time period.
read_provisioned_throughput_exceeded gauge   stream_name, shard_id, namespace, region average Number of GetRecords calls throttled for the shard over the specified time period. This exception count covers all dimensions of the following limits: 5 reads per shard per second or 2 MB per second per shard.
write_provisioned_throughput_exceeded gauge   stream_name, shard_id, namespace, region average Number of records rejected due to throttling for the shard over the specified time period. This metric includes throttling from PutRecord and PutRecords operations and covers all dimensions of the following limits: 1,000 records per second per shard or 1 MB per second per shard.
state attribute   stream_name, namespace, region   Indicates whether the stream is being created, active, updating or being deleted.
encryption_type attribute   stream_name, namespace, region   Server-side encryption type used on the stream. Possible values: NONE or KMS.
retention_period_hours attribute   stream_name, namespace, region   Current retention period, in hours. The minimum value is 24, while its maximum value is 168.
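
The per-shard write limits quoted for write_provisioned_throughput_exceeded (1,000 records per second or 1 MB per second per shard) can be turned into a rough utilisation check. The following is a minimal sketch, not part of the plugin: the function name and its inputs are hypothetical, and it assumes that 1 MB means 1,000,000 bytes and that you have already derived per-second rates for a shard.

```python
# Per-shard write limits quoted in the table above.
WRITE_RECORDS_PER_SEC = 1_000
WRITE_BYTES_PER_SEC = 1_000_000  # assumption: 1 MB treated as 10**6 bytes


def shard_write_utilization(records_per_sec: float, bytes_per_sec: float) -> float:
    """Return the fraction of the per-shard write limit in use.

    Throttling applies as soon as either limit is reached, so the
    larger of the two ratios is the one that matters.
    """
    return max(
        records_per_sec / WRITE_RECORDS_PER_SEC,
        bytes_per_sec / WRITE_BYTES_PER_SEC,
    )


# Example: 500 records/s and 750,000 bytes/s on one shard.
utilization = shard_write_utilization(500, 750_000)
print(f"{utilization:.0%} of the shard write limit in use")
```

A value approaching 1.0 suggests that write_provisioned_throughput_exceeded is likely to start reporting throttled records for that shard.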
           

AWS/KMS

The AWS/KMS service collects metrics from non-deleted KMS keys.

Metric name Metric type Unit name Dimension Statistic Description
seconds_until_key_material_expiration gauge seconds key_id,namespace,region minimum Number of seconds remaining until the imported key material expires.
aliases attribute   key_id,namespace,region   Alternative names (aliases) for the key, as a comma-separated list.
status attribute   key_id,namespace,region   Indicates whether the key is enabled, disabled or pending deletion.
key_spec attribute   key_id,namespace,region   Represents the cryptographic configuration of the KMS key.
key_usage attribute   key_id,namespace,region   Indicates the purpose of the key. The value can be either Encrypt and decrypt or Sign and verify.
           

AWS/Lambda

The AWS/Lambda service collects metrics from Lambda functions.

Metric name Metric type Unit name Dimension Statistic Description
invocations gauge   function_name, namespace, region sum Number of times that your function code is invoked, including successful invocations and invocations that result in a function error. Invocations aren't recorded if the invocation request is throttled or otherwise results in an invocation error. This equals the number of requests billed.
errors gauge   function_name, namespace, region sum Number of invocations that result in a function error. Function errors include exceptions that your code throws and exceptions that the Lambda runtime throws. The runtime returns errors for issues such as timeouts and configuration errors.
dead_letter_errors gauge   function_name, namespace, region sum For asynchronous invocation, the number of times that Lambda attempts to send an event to a dead-letter queue but fails. Dead-letter errors can occur due to permissions errors, misconfigured resources, or size limits.
destination_delivery_failures gauge   function_name, namespace, region sum For asynchronous invocation, the number of times that Lambda attempts to send an event to a destination but fails. Delivery errors can occur due to permissions errors, misconfigured resources, or size limits.
throttles gauge   function_name, namespace, region sum Number of invocation requests that are throttled. When all function instances are processing requests and no concurrency is available to scale up, Lambda rejects additional requests with a TooManyRequestsException error. Throttled requests and other invocation errors do not count as invocations or errors.
provisioned_concurrency_invocations gauge   function_name, namespace, region sum Number of times that the function code is invoked on provisioned concurrency.
provisioned_concurrency_spillover_invocations gauge   function_name, namespace, region sum Number of times that the function code is invoked on standard concurrency when all provisioned concurrency is in use.
duration gauge milliseconds function_name, namespace, region average Amount of time that the function code spends processing an event. The billed duration for an invocation is the value of duration rounded up to the nearest millisecond.
post_runtime_extensions_duration gauge milliseconds function_name, namespace, region average Cumulative amount of time that the runtime spends running code for extensions after the function code has completed.
iterator_age gauge milliseconds function_name, namespace, region average For event source mappings that read from streams, the age of the last record in the event. The age is the amount of time between when a stream receives the record and when the event source mapping sends the event to the function.
offset_lag gauge   function_name, namespace, region average For self-managed Apache Kafka and Amazon Managed Streaming for Apache Kafka (Amazon MSK) event sources, the difference in offset between the last record written to a topic and the last record that your Lambda function processed. Though a Kafka topic can have multiple partitions, this metric measures the offset lag at the topic level.
concurrent_executions gauge   function_name, namespace, region max Number of function instances that are processing events. If this number reaches the concurrent executions quota for the region, or the reserved concurrency limit that you configured on the function, Lambda throttles additional invocation requests.
provisioned_concurrent_executions gauge   function_name, namespace, region max Number of function instances that are processing events on provisioned concurrency. For each invocation of an alias or version with provisioned concurrency, Lambda emits the current count.
provisioned_concurrency_utilization gauge   function_name, namespace, region max For a version or alias, the value of ProvisionedConcurrentExecutions divided by the total amount of provisioned concurrency allocated. For example, 0.5 indicates that 50 percent of allocated provisioned concurrency is in use.
description attribute   function_name, namespace, region   Function's description.
package_type attribute   function_name, namespace, region   Type of deployment package. Possible values: ZIP or IMAGE.
runtime attribute   function_name, namespace, region   Runtime environment for the Lambda function.
code_size attribute bytes function_name, namespace, region   Size of the function's deployment package, in bytes.
last_modified attribute   function_name, namespace, region   Date and time that the function was last updated, in ISO-8601 format (YYYY-MM-DDThh:mm:ss.sTZD).
state attribute   function_name, namespace, region   Current state of the function.
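
Two of the descriptions above involve simple arithmetic: provisioned_concurrency_utilization is a ratio, and the billed duration is the duration rounded up to the nearest millisecond. The following is a minimal sketch of both calculations. The function names are hypothetical and are not part of the plugin or of any AWS SDK.

```python
import math


def provisioned_concurrency_utilization(in_use: int, allocated: int) -> float:
    """Ratio of provisioned concurrent executions to allocated provisioned concurrency."""
    if allocated <= 0:
        raise ValueError("allocated provisioned concurrency must be positive")
    return in_use / allocated


def billed_duration_ms(duration_ms: float) -> int:
    """Round a duration up to the nearest whole millisecond."""
    return math.ceil(duration_ms)


# 5 of 10 provisioned instances busy gives 0.5, that is, 50 percent in use.
print(provisioned_concurrency_utilization(5, 10))

# A 12.3 ms invocation is billed as 13 ms.
print(billed_duration_ms(12.3))
```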
           

AWS/Logs

The AWS/Logs service collects metrics from log groups and their subscription filters.

Metric name Metric type Unit name Dimension Statistic Description
incoming_bytes gauge bytes log_group_name, namespace, region sum Volume of log events in uncompressed bytes uploaded to CloudWatch Logs.
incoming_log_events gauge   log_group_name, namespace, region sum Number of log events uploaded to CloudWatch Logs.
subscription_filter_count attribute   log_group_name, namespace, region   Number of subscription filters for this log group.
delivery_errors gauge   log_group_name, destination_type, filter_name, namespace, region sum Number of log events for which CloudWatch Logs received an error when forwarding data to the subscription destination.
delivery_throttling gauge   log_group_name, destination_type, filter_name, namespace, region sum Number of log events for which CloudWatch Logs was throttled when forwarding data to the subscription destination.
forwarded_bytes gauge bytes log_group_name, destination_type, filter_name, namespace, region sum Volume of log events in compressed bytes forwarded to the subscription destination.
forwarded_log_events gauge   log_group_name, destination_type, filter_name, namespace, region sum Number of log events forwarded to the subscription destination.
destination attribute   log_group_name, destination_type, filter_name, namespace, region   Destination set for this log group.
filter_pattern attribute   log_group_name, destination_type, filter_name, namespace, region   Filter pattern (the FilterPattern property) set for this subscription filter.
           

AWS/NATGateway

The AWS/NATGateway service collects metrics from non-deleted NAT Gateways.

Metric name Metric type Unit name Dimension Statistic Description
active_connection_count gauge   nat_gateway_id, namespace, region max Total number of concurrent active TCP connections through the NAT gateway.
bytes_in_from_destination gauge bytes nat_gateway_id, namespace, region sum Number of bytes received by the NAT gateway from the destination.
bytes_in_from_source gauge bytes nat_gateway_id, namespace, region sum Number of bytes received by the NAT gateway from clients in your VPC.
bytes_out_to_destination gauge bytes nat_gateway_id, namespace, region sum Number of bytes sent out through the NAT gateway to the destination.
bytes_out_to_source gauge bytes nat_gateway_id, namespace, region sum Number of bytes sent through the NAT gateway to the clients in your VPC.
connection_attempt_count gauge   nat_gateway_id, namespace, region sum Number of connection attempts made through the NAT gateway.
connection_established_count gauge   nat_gateway_id, namespace, region sum Number of connections established through the NAT gateway.
error_port_allocation gauge   nat_gateway_id, namespace, region sum Number of times the NAT gateway could not allocate a source port.
idle_timeout_count gauge   nat_gateway_id, namespace, region sum Number of connections that transitioned from the active state to the idle state.
packets_drop_count gauge   nat_gateway_id, namespace, region sum Number of packets dropped by the NAT gateway.
packets_in_from_destination gauge   nat_gateway_id, namespace, region sum Number of packets received by the NAT gateway from the destination.
packets_in_from_source gauge   nat_gateway_id, namespace, region sum Number of packets received by the NAT gateway from clients in your VPC.
packets_out_to_destination gauge   nat_gateway_id, namespace, region sum Number of packets sent out through the NAT gateway to the destination.
packets_out_to_source gauge   nat_gateway_id, namespace, region sum Number of packets sent through the NAT gateway to the clients in your VPC.
connectivity_type attribute   nat_gateway_id, namespace, region   Indicates whether the NAT gateway supports public or private connectivity.
state attribute   nat_gateway_id, namespace, region   State of the NAT gateway.
state_message attribute   nat_gateway_id, namespace, region   If the NAT gateway could not be created, this specifies the error message for the failure that corresponds to the error code.
elastic_ip_address attribute   nat_gateway_id, namespace, region   Public NAT gateway only: Elastic IP address associated with the NAT gateway.
private_ip_address attribute   nat_gateway_id, namespace, region   Private IP address associated with the NAT gateway.
           

AWS/NetworkELB

The AWS/NetworkELB service collects metrics from Network Load Balancers.

Metric name Metric type Unit name Dimension Statistic Description
active_flow_count gauge   load_balancer, namespace, region average Total number of concurrent flows (or connections) from clients to targets.
active_flow_count_tcp gauge   load_balancer, namespace, region average Total number of concurrent TCP flows (or connections) from clients to targets.
active_flow_count_tls gauge   load_balancer, namespace, region average Total number of concurrent TLS flows (or connections) from clients to targets.
active_flow_count_udp gauge   load_balancer, namespace, region average Total number of concurrent UDP flows (or connections) from clients to targets.
client_tls_negotiation_error_count gauge   load_balancer, namespace, region sum Total number of TLS handshakes that failed during negotiation between a client and a TLS listener.
consumed_lcus gauge   load_balancer, namespace, region average Number of load balancer capacity units (LCU) used by the load balancer.
consumed_lcus_tcp gauge   load_balancer, namespace, region average Number of load balancer capacity units (LCU) used by the load balancer for TCP.
consumed_lcus_tls gauge   load_balancer, namespace, region average Number of load balancer capacity units (LCU) used by the load balancer for TLS.
consumed_lcus_udp gauge   load_balancer, namespace, region average Number of load balancer capacity units (LCU) used by the load balancer for UDP.
healthy_host_count gauge   load_balancer, namespace, region max Number of targets that are considered healthy.
new_flow_count gauge   load_balancer, namespace, region sum Total number of new flows (or connections) established from clients to targets in the time period.
new_flow_count_tcp gauge   load_balancer, namespace, region sum Total number of new TCP flows (or connections) established from clients to targets in the time period.
new_flow_count_tls gauge   load_balancer, namespace, region sum Total number of new TLS flows (or connections) established from clients to targets in the time period.
new_flow_count_udp gauge   load_balancer, namespace, region sum Total number of new UDP flows (or connections) established from clients to targets in the time period.
peak_bytes_per_second gauge bytes per second load_balancer, namespace, region max Highest average throughput (bytes per second), calculated every 10 seconds during the sampling window.
peak_packets_per_second gauge   load_balancer, namespace, region max Highest average packet rate (packets processed per second), calculated every 10 seconds during the sampling window.
port_allocation_error_count gauge   load_balancer, namespace, region sum Total number of ephemeral port allocation errors during a client IP translation operation.
processed_bytes gauge bytes load_balancer, namespace, region sum Total number of bytes processed by the load balancer, including TCP/IP headers.
processed_bytes_tcp gauge bytes load_balancer, namespace, region sum Total number of bytes processed by TCP listeners.
processed_bytes_tls gauge bytes load_balancer, namespace, region sum Total number of bytes processed by TLS listeners.
processed_bytes_udp gauge bytes load_balancer, namespace, region sum Total number of bytes processed by UDP listeners.
processed_packets gauge   load_balancer, namespace, region sum Total number of packets processed by the load balancer.
target_tls_negotiation_error_count gauge   load_balancer, namespace, region sum Total number of TLS handshakes that failed during negotiation between a TLS listener and a target.
tcp_client_reset_count gauge   load_balancer, namespace, region sum Total number of reset (RST) packets sent from a client to a target.
tcp_elb_reset_count gauge   load_balancer, namespace, region sum Total number of reset (RST) packets generated by the load balancer.
tcp_target_reset_count gauge   load_balancer, namespace, region sum Total number of reset (RST) packets sent from a target to a client.
unhealthy_host_count gauge   load_balancer, namespace, region max Number of targets that are considered unhealthy.
state attribute   load_balancer, namespace, region   State of the load balancer.
vpc_id attribute   load_balancer, namespace, region   ID of the VPC for the load balancer.
availability_zones attribute   load_balancer, namespace, region   Subnets for the load balancer.
created_time attribute   load_balancer, namespace, region   Date and time the load balancer was created.
scheme attribute   load_balancer, namespace, region   Indicates whether the load balancer is Internet-facing or internal. The nodes of an Internet-facing load balancer have public IP addresses.
ip_address_type attribute   load_balancer, namespace, region   Type of IP addresses used by the subnets for your load balancer.
dns_name attribute   load_balancer, namespace, region   Public DNS name of the load balancer.
           

AWS/NetworkFirewall

The AWS/NetworkFirewall service collects metrics from VPC firewalls.

Metric name Metric type Dimension Statistic Description
dropped_packets gauge firewall_name, availability_zone, engine, namespace, region sum Number of packets dropped by the Network Firewall firewall.
packets gauge firewall_name, availability_zone, engine, namespace, region sum Number of packets inspected for a firewall policy or stateless rule group for which a custom action is defined.
passed_packets gauge firewall_name, availability_zone, engine, namespace, region sum Number of packets that the Network Firewall firewall allowed through to their destinations.
received_packets gauge firewall_name, availability_zone, engine, namespace, region sum Number of packets received by the Network Firewall firewall.
vpc_id attribute firewall_name, availability_zone, engine, namespace, region   Unique identifier of the VPC where the firewall is in use.
         

AWS/RDS

The AWS/RDS service collects metrics from non-failed Amazon relational databases.

Metric name Metric type Unit name Dimension Statistic Description
bin_log_disk_usage gauge bytes db_instance_identifier, namespace, region average Amount of disk space occupied by binary logs. If automatic backups are enabled for MySQL and MariaDB instances, including read replicas, binary logs are created.
burst_balance gauge percent db_instance_identifier, namespace, region average Percent of General Purpose SSD (gp2) burst-bucket I/O credits available.
cpu_utilization gauge percent db_instance_identifier, namespace, region average Percentage of CPU utilisation.
cpu_credit_balance gauge minutes db_instance_identifier, namespace, region average

T2 instances: Number of earned CPU credits that an instance has accrued since it was launched or started.

For T2 Standard, the CPUCreditBalance also includes the number of launch credits that have been accrued.

cpu_credit_usage gauge minutes db_instance_identifier, namespace, region average T2 instances: Number of CPU credits spent by the instance for CPU utilisation. One CPU credit equals one vCPU running at 100 percent utilisation for one minute, or an equivalent combination of vCPUs.
database_connections gauge   db_instance_identifier, namespace, region average Number of client network connections to the database instance.
disk_queue_depth gauge   db_instance_identifier, namespace, region average Number of outstanding I/Os (read and write requests) waiting to access the disk.
ebs_byte_balance_percent gauge percent db_instance_identifier, namespace, region average Percentage of throughput credits remaining in the burst bucket of your RDS database. This metric is available for basic monitoring only.
ebs_io_balance_percent gauge percent db_instance_identifier, namespace, region average Percentage of I/O credits remaining in the burst bucket of your RDS database. This metric is available for basic monitoring only.
failed_sql_server_agent_jobs_count gauge   db_instance_identifier, namespace, region average Number of failed Microsoft SQL Server Agent jobs during the last minute.
freeable_memory gauge bytes db_instance_identifier, namespace, region average Amount of available random access memory.
free_storage_space gauge bytes db_instance_identifier, namespace, region average Amount of available storage space.
maximum_used_transaction_ids gauge   db_instance_identifier, namespace, region average Maximum transaction IDs that have been used. This applies to PostgreSQL.
network_receive_throughput gauge bytes_per_second db_instance_identifier, namespace, region average Incoming (receive) network traffic on the DB instance.
network_transmit_throughput gauge bytes_per_second db_instance_identifier, namespace, region average Outgoing (transmit) network traffic on the DB instance.
oldest_replication_slot_lag gauge bytes db_instance_identifier, namespace, region average Lag, in bytes, of the replica that is furthest behind in terms of write-ahead log (WAL) data received. This applies to PostgreSQL.
read_iops gauge per_second db_instance_identifier, namespace, region average Average number of disk read I/O operations per second.
read_latency gauge seconds db_instance_identifier, namespace, region average Average amount of time taken per disk I/O operation.
read_throughput gauge bytes_per_second db_instance_identifier, namespace, region average Average number of bytes read from disk per second.
replica_lag gauge seconds db_instance_identifier, namespace, region average Amount of time a read replica DB instance lags behind the source DB instance. This applies to MySQL.
replication_slot_disk_usage gauge bytes db_instance_identifier, namespace, region average Disk space used by replication slot files. This applies to PostgreSQL.
swap_usage gauge bytes db_instance_identifier, namespace, region average Amount of swap space used on the DB instance. This metric is not available for SQL Server.
transaction_logs_disk_usage gauge bytes db_instance_identifier, namespace, region average Disk space used by transaction logs. This applies to PostgreSQL.
transaction_logs_generation gauge bytes_per_second db_instance_identifier, namespace, region average Size of transaction logs generated per second. This applies to PostgreSQL.
write_iops gauge per_second db_instance_identifier, namespace, region average Average number of disk write I/O operations per second.
write_latency gauge seconds db_instance_identifier, namespace, region average Average amount of time taken per disk I/O operation.
write_throughput gauge bytes_per_second db_instance_identifier, namespace, region average Average number of bytes written to disk per second.
db_load_cpu gauge   db_instance_identifier, namespace, region average Number of active sessions where the wait event type is CPU.
db_instance_status attribute   db_instance_identifier, namespace, region itrs_timestamp, data_kind Specifies the current state of this database.
db_name attribute   db_instance_identifier, namespace, region itrs_timestamp, data_kind The meaning of this parameter differs according to the database engine you use. This can be the database name or the database ID.
db_cluster_identifier attribute   db_instance_identifier, namespace, region itrs_timestamp, data_kind If the DB instance is a member of a DB cluster, the identifier of that DB cluster.
engine_name attribute   db_instance_identifier, namespace, region itrs_timestamp, data_kind Name of the database engine to be used for this DB instance.
database_class attribute   db_instance_identifier, namespace, region itrs_timestamp, data_kind Contains the name of the compute and memory capacity class of the DB instance.
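
The cpu_credit_usage description defines one CPU credit as one vCPU running at 100 percent utilisation for one minute, or an equivalent combination. The following is a minimal sketch of that arithmetic. The function name is hypothetical and the calculation is only an illustration of the definition, not how RDS or the plugin computes the metric.

```python
def cpu_credits_spent(vcpus: int, utilization_percent: float, minutes: float) -> float:
    """CPU credits consumed, where 1 credit = 1 vCPU at 100 percent for 1 minute."""
    return vcpus * (utilization_percent / 100.0) * minutes


# Each of these combinations consumes one credit:
print(cpu_credits_spent(1, 100, 1))  # one vCPU at 100 percent for one minute
print(cpu_credits_spent(2, 50, 1))   # two vCPUs at 50 percent for one minute
print(cpu_credits_spent(1, 25, 4))   # one vCPU at 25 percent for four minutes
```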
           

AWS/S3

The AWS/S3 service collects storage metrics and replication metrics (if any) from the S3 buckets.

Metric name Metric type Unit name Dimension Statistic Description
bucket_size_bytes gauge bytes bucket_name, storage_type, namespace, region average Amount of data in bytes stored in a bucket in the STANDARD storage class, INTELLIGENT_TIERING storage class, Standard-Infrequent Access (STANDARD_IA) storage class, OneZone-Infrequent Access (ONEZONE_IA) storage class, Reduced Redundancy Storage (RRS) class, S3 Glacier Instant Retrieval storage class, Deep Archive Storage (S3 Glacier Deep Archive) class, or S3 Glacier Flexible Retrieval (GLACIER) storage class.
number_of_objects gauge   bucket_name, storage_type, namespace, region average Total number of objects stored in a bucket for all storage classes.
all_requests gauge   bucket_name, filter_id, namespace, region sum Total number of HTTP requests made to an Amazon S3 bucket regardless of type.
get_requests gauge   bucket_name, filter_id, namespace, region sum Number of HTTP GET requests made for objects in an Amazon S3 bucket.
put_requests gauge   bucket_name, filter_id, namespace, region sum Number of HTTP PUT requests made for objects in an Amazon S3 bucket.
delete_requests gauge   bucket_name, filter_id, namespace, region sum Number of HTTP DELETE requests made for objects in an Amazon S3 bucket.
head_requests gauge   bucket_name, filter_id, namespace, region sum Number of HTTP HEAD requests made to an Amazon S3 bucket.
post_requests gauge   bucket_name, filter_id, namespace, region sum Number of HTTP POST requests made to an Amazon S3 bucket.
select_requests gauge   bucket_name, filter_id, namespace, region sum Number of Amazon S3 SELECT Object Content requests made for objects in an Amazon S3 bucket.
select_bytes_scanned gauge bytes bucket_name, filter_id, namespace, region sum Number of bytes of data scanned with Amazon S3 SELECT Object Content requests in an Amazon S3 bucket.
select_bytes_returned gauge bytes bucket_name, filter_id, namespace, region sum Number of bytes of data returned with Amazon S3 SELECT Object Content requests in an Amazon S3 bucket.
list_requests gauge   bucket_name, filter_id, namespace, region sum Number of HTTP requests that list the contents of a bucket.
bytes_downloaded gauge bytes bucket_name, filter_id, namespace, region average Number of bytes downloaded for requests made to an Amazon S3 bucket.
bytes_uploaded gauge bytes bucket_name, filter_id, namespace, region average Number of bytes uploaded that contain a request body.
4xx_errors gauge   bucket_name, filter_id, namespace, region average Number of HTTP 4xx client error status code requests made to an Amazon S3 bucket with a value of either 0 or 1.
5xx_errors gauge   bucket_name, filter_id, namespace, region average Number of HTTP 5xx server error status code requests made to an Amazon S3 bucket with a value of either 0 or 1.
first_byte_latency gauge milliseconds bucket_name, filter_id, namespace, region average Per-request time from the complete request being received by an Amazon S3 bucket to when the response starts to be returned.
total_request_latency gauge milliseconds bucket_name, namespace, region average Elapsed per-request time from the first byte received to the last byte sent to an Amazon S3 bucket.
creation_date attribute   bucket_name,namespace,region average Date the bucket was created.
replication_latency gauge seconds rule_id, namespace, region max Maximum number of seconds by which the replication destination Region is behind the source Region for a given replication rule.
bytes_pending_replication gauge bytes rule_id, namespace, region max Total number of bytes of objects pending replication for a given replication rule.
operations_pending_replication gauge   rule_id, namespace, region max Number of operations pending replication for a given replication rule.
status attribute   rule_id, namespace, region   Specifies if the rule is enabled.
           

AWS/SDKUsage

The AWS/SDKUsage service collects usage metrics for the AWS SDK API calls.

To get these metrics, you need to enable the AWS SDK Usage metrics collector configuration, AwsSdkUsageMetricsCollector, in the Collection Agent YAML file. See AWS SDK Usage metrics collector.

Metric name Metric type Unit name Dimension Statistic Description
successfulApiCallLast5Min gauge   namespace, service_id, operation_name, region sum Total number of successful API calls from the last 5-minute window.
failedApiCallLast5Min gauge   namespace, service_id, operation_name, region sum Total number of failed API calls from the last 5-minute window.
avgRetryCountLast5Min gauge   namespace, service_id, operation_name, region average Average retry count from the last 5-minute window.
avgApiCallDurationLast5Min gauge milliseconds namespace, service_id, operation_name, region average Average API call duration from the last 5-minute window.
           

AWS/SNS

The AWS/SNS service collects metrics from SNS Topics.

Metric name Metric type Unit name Dimension Statistic Description
number_of_messages_published gauge   topic_name, namespace, region sum Number of messages published to the Amazon SNS topics.
number_of_notifications_delivered gauge   topic_name, namespace, region sum Number of messages successfully delivered from the Amazon SNS topics to subscribing endpoints.
number_of_notifications_failed gauge   topic_name, namespace, region sum Number of messages that Amazon SNS failed to deliver.
number_of_notifications_filtered_out gauge   topic_name, namespace, region sum

Number of messages that were rejected by subscription filter policies.

A filter policy rejects a message when the message attributes do not match the policy attributes.

number_of_notifications_filtered_out_invalid_attributes gauge   topic_name, namespace, region sum Number of messages that were rejected by subscription filter policies because the messages' attributes are invalid. For example, the attribute JSON was formatted incorrectly.
number_of_notifications_filtered_out_no_message_attributes gauge   topic_name, namespace, region sum Number of messages that were rejected by subscription filter policies because the messages have no attributes.
number_of_notifications_redriven_to_dlq gauge   topic_name, namespace, region sum Number of messages that have been moved to a dead-letter queue.
number_of_notifications_failed_to_redrive_to_dlq gauge   topic_name, namespace, region sum Number of messages that could not be moved to a dead-letter queue.
publish_size gauge bytes topic_name, namespace, region average Size of messages being published.
display_name attribute   topic_name, namespace, region   Human-readable name used in the From field for notifications to email and email-json endpoints.
owner attribute   topic_name, namespace, region   Amazon Web Services account ID of the topic's owner.
           

AWS/TransitGateway

The AWS/TransitGateway service collects metrics from non-deleted Transit Gateways.

Metric name Metric type Unit name Dimension Statistic Description
bytes_drop_count_blackhole gauge bytes transit_gateway, namespace, region average Number of bytes dropped because they matched a blackhole route.
bytes_drop_count_no_route gauge bytes transit_gateway, namespace, region average Number of bytes dropped because they did not match a route.
bytes_in gauge bytes transit_gateway, namespace, region average Number of bytes received by the transit gateway.
bytes_out gauge bytes transit_gateway, namespace, region average Number of bytes sent from the transit gateway.
packets_in gauge   transit_gateway, namespace, region average Number of packets received by the transit gateway.
packets_out gauge   transit_gateway, namespace, region average Number of packets sent by the transit gateway.
packet_drop_count_blackhole gauge   transit_gateway, namespace, region average Number of packets dropped because they matched a blackhole route.
packet_drop_count_no_route gauge   transit_gateway, namespace, region average Number of packets dropped because they did not match a route.
bytes_drop_count_blackhole gauge bytes transit_gateway, transit_gateway_attachment, namespace, region average Number of bytes dropped because they matched a blackhole route on the transit gateway attachment.
bytes_drop_count_no_route gauge bytes transit_gateway, transit_gateway_attachment, namespace, region average Number of bytes dropped because they did not match a route on the transit gateway attachment.
bytes_in gauge bytes transit_gateway, transit_gateway_attachment, namespace, region average Number of bytes received by the transit gateway from the attachment.
bytes_out gauge bytes transit_gateway, transit_gateway_attachment, namespace, region average Number of bytes sent from the transit gateway to the attachment.
packets_in gauge   transit_gateway, transit_gateway_attachment, namespace, region average Number of packets received by the transit gateway from the attachment.
packets_out gauge   transit_gateway, transit_gateway_attachment, namespace, region average Number of packets sent by the transit gateway to the attachment.
packet_drop_count_blackhole gauge   transit_gateway, transit_gateway_attachment, namespace, region average Number of packets dropped because they matched a blackhole route on the transit gateway attachment.
packet_drop_count_no_route gauge   transit_gateway, transit_gateway_attachment, namespace, region average Number of packets dropped because they did not match a route on the transit gateway attachment.
description attribute   transit_gateway, namespace, region   Description of the transit gateway.
owner_id attribute   transit_gateway, namespace, region   ID of the Amazon Web Services account that owns the transit gateway.
state attribute   transit_gateway, namespace, region   State of the transit gateway.
           

AWS/VPN

The AWS/VPN service collects metrics from non-deleted Virtual Private Networks.

Metric name Metric type Unit name Dimension Statistic Description
tunnel_state gauge bytes vpn_id, namespace, region average State of the tunnels. For static VPNs, 0 indicates DOWN, while 1 indicates UP. For BGP VPNs, 1 indicates ESTABLISHED, while 0 is used for all other states. For both types of VPNs, values between 0 and 1 indicate at least one tunnel is not UP.
tunnel_data_in gauge bytes vpn_id, namespace, region sum Bytes received on the AWS side of the connection through the VPN tunnel from a customer gateway. Each metric data point represents the number of bytes received after the previous data point.
tunnel_data_out gauge bytes vpn_id, namespace, region sum Bytes sent from the AWS side of the connection through the VPN tunnel to the customer gateway. Each metric data point represents the number of bytes sent after the previous data point.

state attribute   vpn_id, namespace, region   Current state of the VPN connection.
category attribute   vpn_id, namespace, region   Category of the VPN connection.
type attribute   vpn_id, namespace, region   Type of VPN connection.
customer_gateway_id attribute   vpn_id, namespace, region   ID of the customer gateway at your end of the VPN connection.
transit_gateway_id attribute   vpn_id, namespace, region   ID of the transit gateway associated with the VPN connection.
vpn_gateway_id attribute   vpn_id, namespace, region   ID of the virtual private gateway at the Amazon Web Services side of the VPN connection.
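
The tunnel_state rules in the table above can be sketched as a small helper. This is only an illustration, not part of the AWS plugin: the function name describe_tunnel_state and the labels "up", "down", and "degraded" are hypothetical, and the sketch assumes the value passed in is the averaged tunnel_state reading for one VPN connection.

```python
def describe_tunnel_state(value: float) -> str:
    """Map an averaged tunnel_state value to a coarse status label.

    The labels are hypothetical; only the 0 / 1 / in-between rules
    come from the metric description.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"tunnel_state must be between 0 and 1, got {value}")
    if value == 1.0:
        # Static VPN: UP. BGP VPN: ESTABLISHED.
        return "up"
    if value == 0.0:
        # Static VPN: DOWN. BGP VPN: any state other than ESTABLISHED.
        return "down"
    # A value strictly between 0 and 1 means at least one tunnel is not UP.
    return "degraded"


if __name__ == "__main__":
    for reading in (1.0, 0.5, 0.0):
        print(reading, describe_tunnel_state(reading))
```

Treating the in-between case as its own label is a design choice for the sketch; an alerting rule might instead fold it into "down".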

Required AWS plugin permissions

Service Permission
All Services cloudwatch:GetMetricData
All Services cloudwatch:ListMetrics
AWS/ApplicationELB elasticloadbalancing:DescribeLoadBalancers
AWS/ApplicationELB elasticloadbalancing:DescribeTags
AWS/AutoScaling autoscaling:DescribeAutoScalingGroups
AWS/Billing budgets:ViewBudget
AWS/Billing Billing Services
AWS/CertificateManager acm:ListCertificates
AWS/CertificateManager acm:DescribeCertificate
AWS/CertificateManager acm:ListTagsForCertificate
AWS/DynamoDB dynamodb:DescribeTable
AWS/EBS ec2:DescribeVolumes
AWS/EC2 ec2:DescribeInstances
AWS/ECS ecs:DescribeClusters
AWS/ECS ecs:ListClusters
AWS/ECS ecs:ListTagsForResource
AWS/EFS elasticfilesystem:DescribeFileSystems
AWS/EKS eks:DescribeClusters
AWS/EKS eks:DescribeNodegroup
AWS/EKS eks:ListClusters
AWS/EKS eks:ListNodegroups
AWS/ElastiCache elasticache:DescribeCacheClusters
AWS/ElastiCache elasticache:ListTagsForResource
AWS/ELB elasticloadbalancing:DescribeLoadBalancers
AWS/ELB elasticloadbalancing:DescribeTags
AWS/Events events:ListEventBuses
AWS/Events events:ListRules
AWS/Events events:ListTagsForResource
AWS/GatewayELB elasticloadbalancing:DescribeLoadBalancers
AWS/GatewayELB elasticloadbalancing:DescribeTags
AWS/Kinesis kinesis:ListStreams
AWS/Kinesis kinesis:DescribeStream
AWS/Kinesis kinesis:ListTagsForStream
AWS/KMS kms:ListKeys
AWS/KMS kms:ListAliases
AWS/KMS kms:ListResourceTags
AWS/Lambda lambda:ListFunctions
AWS/Lambda lambda:ListTags
AWS/Logs logs:DescribeLogGroups
AWS/Logs logs:DescribeSubscriptionFilters
AWS/Logs logs:ListTagsLogGroup
AWS/NATGateway ec2:DescribeNatGateways
AWS/NetworkELB elasticloadbalancing:DescribeLoadBalancers
AWS/NetworkELB elasticloadbalancing:DescribeTags
AWS/NetworkFirewall network-firewall:ListFirewalls
AWS/NetworkFirewall network-firewall:DescribeFirewall
AWS/RDS rds:DescribeDBInstances
AWS/S3 s3:GetBucketTagging
AWS/S3 s3:ListBucket
AWS/S3 s3:ListAllMyBuckets
AWS/S3 s3:GetBucketLocation
AWS/SNS sns:ListTopics
AWS/SNS sns:GetTopicAttributes
AWS/SNS sns:ListTagsForResource
AWS/TransitGateway ec2:DescribeTransitGateways
AWS/VPN ec2:DescribeVpnConnections
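
As an illustration of how the table above might be turned into an IAM policy, the following sketch builds a policy document for a chosen set of monitored services. It is not an official policy for the AWS plugin. The helper name build_policy and the REQUIRED_PERMISSIONS mapping are hypothetical, only a few rows of the table are copied in, and the use of "Resource": "*" is an assumption that you may want to narrow for your environment.

```python
import json

# A subset of the permissions table; keys are the service names used in the table.
REQUIRED_PERMISSIONS = {
    "All Services": ["cloudwatch:GetMetricData", "cloudwatch:ListMetrics"],
    "AWS/SNS": ["sns:ListTopics", "sns:GetTopicAttributes", "sns:ListTagsForResource"],
    "AWS/TransitGateway": ["ec2:DescribeTransitGateways"],
    "AWS/VPN": ["ec2:DescribeVpnConnections"],
}


def build_policy(services):
    """Return an IAM policy document (as a dict) covering the given services.

    The "All Services" permissions are always included, and each action
    is listed only once.
    """
    actions = list(REQUIRED_PERMISSIONS["All Services"])
    for service in services:
        for action in REQUIRED_PERMISSIONS[service]:
            if action not in actions:
                actions.append(action)
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Assumption: all resources are allowed; restrict as needed.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(["AWS/SNS", "AWS/VPN"]), indent=2))
```

Running the script prints a JSON policy document that could be reviewed and adapted before being attached to the role or user the Collection Agent runs as.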