Notifications

The Notifications app closes the loop between data collection, signal generation, and alerting. When the condition of your IT estate needs attention, the app notifies you through its integrations with external systems such as Slack.

You can configure notifications in two ways: either for groups of entities or for individual entities. Grouped notifications are enabled if there is at least one grouping in the configuration.

Notification list

Grouped notification lifecycle

Grouped notifications are formed by bundling Obcerv entities that share common characteristics. For example, if you group by container, the app creates one group per Kubernetes container. Each group grows or shrinks automatically as entities such as pods or volumes match or cease to match the grouping parameters.

Grouped notifications are triggered when at least one entity in the group exceeds the configured warning or critical trigger interval. Once a notification is triggered, reminders are periodically sent following the reminder trigger interval.

The notification is cleared when all entities in the group exceed the clear trigger interval and no additional entities in the group have been triggered.

Entity notification lifecycle

Entity notifications are triggered when an individual entity exceeds the configured warning or critical trigger interval. Once a notification is sent, reminders are periodically sent following the reminder trigger interval. The notification is cleared when the entity exceeds the clear trigger interval.
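The entity lifecycle described above can be sketched as a small state machine. This is an illustrative model only, not the app's implementation: the function name `notification_events`, its parameters, and the one-sample-per-minute timeline are assumptions made for this example, and warning and critical severities are treated alike for simplicity.

```python
def notification_events(samples, trigger_after=1, clear_after=1, remind_every=1):
    """Return (minute, event) pairs for a single entity.

    samples[i] is the severity observed during minute i:
    None, "warning" or "critical". All intervals are in minutes.
    """
    events = []
    triggered = False
    bad_run = 0      # consecutive minutes with warning/critical severity
    ok_run = 0       # consecutive minutes with no severity
    since_last = 0   # minutes since the last notification while triggered

    for minute, severity in enumerate(samples):
        if severity in ("warning", "critical"):
            bad_run += 1
            ok_run = 0
        else:
            ok_run += 1
            bad_run = 0

        if not triggered:
            # Trigger once the severity has persisted for the trigger interval.
            if bad_run >= trigger_after:
                triggered = True
                since_last = 0
                events.append((minute, "TRIGGERED"))
        elif ok_run >= clear_after:
            # Clear once the entity has had no severity for the clear interval.
            triggered = False
            events.append((minute, "CLEARED"))
        else:
            # Still triggered: send a reminder every remind_every minutes.
            since_last += 1
            if since_last >= remind_every:
                since_last = 0
                events.append((minute, "REMINDER"))
    return events


# Hypothetical timeline: a short warning blip, then a sustained problem.
timeline = [None, "warning", None, "warning", "critical", "warning", None, None]
events = notification_events(timeline, trigger_after=2, clear_after=2, remind_every=1)
print(events)
```

With these made-up inputs, the one-minute warning at minute 1 never triggers, while the sustained problem starting at minute 3 does. A grouped notification would apply the same idea across all entities in the group.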

Configuration

To create a new configuration, follow these steps:

  1. From the Web Console, select Notifications.
  2. Click Add Notification.

Create notification

  1. Specify the Name for the new configuration. You can also add a description.
  2. In the Filter field, set whether to include or exclude entities when evaluating notifications for individual entities or groups.
  3. In the Group by field, select the attributes or dimensions by which entities will be grouped together. A single notification will be sent for each group.

A grouping acts as an implicit filter, so any entities that do not have the grouping attributes or dimensions are excluded. For example:

Entity (Illustrative name)  Environment  OS
Prod_Linux                  Prod         Linux
Prod_Mac                    Prod         MacOS
Prod_Win                    Prod         Windows
Test_Linux                  Test         Linux
Test_Mac                    Test         MacOS
Test_Win                    Test         Windows
Dev_Linux                   Dev          Linux
Dev_Mac                     Dev          MacOS
Dev_Win                     Dev          Windows
Dev_None                    Dev

Given the above entities and grouping configuration OS, the following groups will be created with these entities:

Group    Entities
Linux    Prod_Linux, Test_Linux, Dev_Linux
MacOS    Prod_Mac, Test_Mac, Dev_Mac
Windows  Prod_Win, Test_Win, Dev_Win

Only one notification per group will be sent and entity Dev_None will be disregarded since it doesn’t have the group attribute OS.
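The grouping behaviour in this example can be reproduced with a few lines of Python. This is a sketch for illustration, not the app's code: the `group_entities` helper and the dictionary layout are assumptions, and the entity names are the illustrative ones from the table.

```python
from collections import defaultdict

# Illustrative entities and their attributes, taken from the example table.
ENTITIES = {
    "Prod_Linux": {"Environment": "Prod", "OS": "Linux"},
    "Prod_Mac": {"Environment": "Prod", "OS": "MacOS"},
    "Prod_Win": {"Environment": "Prod", "OS": "Windows"},
    "Test_Linux": {"Environment": "Test", "OS": "Linux"},
    "Test_Mac": {"Environment": "Test", "OS": "MacOS"},
    "Test_Win": {"Environment": "Test", "OS": "Windows"},
    "Dev_Linux": {"Environment": "Dev", "OS": "Linux"},
    "Dev_Mac": {"Environment": "Dev", "OS": "MacOS"},
    "Dev_Win": {"Environment": "Dev", "OS": "Windows"},
    "Dev_None": {"Environment": "Dev"},  # no OS attribute
}


def group_entities(entities, group_by):
    """Bundle entity names by the values of the group_by attributes.

    Entities that lack any of the group_by attributes are skipped,
    which is the implicit filter described above.
    """
    groups = defaultdict(list)
    for name, attrs in entities.items():
        if not all(key in attrs for key in group_by):
            continue
        groups[tuple(attrs[key] for key in group_by)].append(name)
    return dict(groups)


groups = group_entities(ENTITIES, ["OS"])
for key, members in groups.items():
    print(key, members)
```

Grouping by `["Environment", "OS"]` instead would yield nine single-entity groups, and Dev_None would still be excluded.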

  4. Under Targets, select a target type, and then choose an option from the Targets drop-down list. Depending on the target type, more options may be available. For example, if you select SLACK, the Reply in Slack thread toggle appears.

Note

If you need to create a new target, go to Notifications > Targets, and then click the Add icon beside the target type. For more information about the available target types, see Integrations.
  5. Review the options under the Triggers and Messages sections. For more information, see Triggers and Messages.
  6. Click Save to add the new configuration.

Triggers

A notification is triggered and cleared based on an entity’s severity.

Triggers

Warning / Critical (triggered)

If an entity has had a critical or warning severity for longer than the supplied duration, then it is considered triggered and a notification is sent, either individually for the entity or as part of a group (if groups are configured). Where a severity is rapidly changing between warning and critical states, the warning trigger also considers entities with critical severity.

Cleared

If an entity has been previously triggered and has had no severity for the supplied duration, then it is considered cleared and a notification is sent indicating that the entity is healthy.

Reminders

Reminder notifications apply to triggered entities and groups. They are sent periodically to indicate that the entity or group is still in a triggered state.

Messages

A notification message must be configured for each enabled trigger. Each message consists of the main message body and an optional short title. How the title is used depends on the integration. For example, for Slack notifications, it is used as a header and as the text of push notifications. It is not used for Webhooks.

Messages may contain placeholders of the form ${placeholder} that will be interpolated by the app. To see the available placeholders, click the placeholder icon.

Placeholders

Some placeholders are only available for grouped notifications while others are only available for entity notifications. The supported placeholders can be seen below.

Placeholder Group Entity Description
${date} The current date in UTC (for example, 2011-12-03).
${time} The current time in UTC (for example, 15:14:11).
${dateTime} The current date-time in UTC (for example, 2007-12-03T10:15:30).
${url} The URL to the Notifications app.
${severity} The entity’s severity or the group’s maximum severity.
${entity} The entity object as a JSON object.
${dimensions} The entity dimensions as a JSON object.
${entity.attribute[<attribute>]} The entity attribute value. If the attribute is not found, an empty string is returned.
${entity.dimension[<dimension>]} The entity dimension value. If the dimension key is not found, an empty string is returned.
${triggeredCount} The number of entities in the group that have been triggered.
${criticalCount} The number of entities in the group that have critical severity.
${okCount} The number of entities in the group that are triggered but have no severity at the time the notification is sent. This is relevant for reminder notifications where an entity’s severity has recently been cleared but does not meet the configured cleared trigger criteria yet.
${warningCount} The number of entities in the group that have a warning severity.
${clearedCount} The number of entities in the group that are no longer triggered.
${group} The entity group as a list of key/value pairs (e.g. [Kind: Database, Environment: DevOps]).
${group[<group>]} The entity group value. If the group is not found, an empty string is returned.
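To make the interpolation rules concrete, the following sketch substitutes ${placeholder} tokens in a message. It is an approximation written for this guide, not the app's implementation: the `interpolate` function, the `values` and `lookups` dictionaries, and the choice to leave unknown simple placeholders untouched are all assumptions. Only the empty-string result for a missing attribute, dimension, or group key comes from the table above.

```python
import re

# Matches ${...} tokens and indexed names such as entity.attribute[os].
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")
INDEXED = re.compile(r"^(entity\.attribute|entity\.dimension|group)\[(.+)\]$")


def interpolate(message, values, lookups):
    """Replace ${name} and ${table[key]} placeholders in message.

    values  maps simple placeholder names (e.g. "severity") to values.
    lookups maps a table name (e.g. "entity.dimension") to a dict.
    """
    def replace(match):
        name = match.group(1)
        indexed = INDEXED.match(name)
        if indexed:
            table, key = indexed.groups()
            # A missing key yields an empty string, as documented.
            return str(lookups.get(table, {}).get(key, ""))
        # Assumption: unknown simple placeholders are left as written.
        return str(values.get(name, match.group(0)))

    return PLACEHOLDER.sub(replace, message)


message = ("${severity}: ${entity.dimension[pod]} on "
           "${entity.attribute[os]}${entity.attribute[missing]} at ${time}")
values = {"severity": "critical", "time": "15:14:11"}
lookups = {
    "entity.dimension": {"pod": "web-console-abc"},
    "entity.attribute": {"os": "Linux"},
}
rendered = interpolate(message, values, lookups)
print(rendered)
```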

Illustrative example

Severity Example

Severity Example Legend

Consider the above timeline of an entity’s severity, where the numbers represent minutes, and assume that every trigger is configured with an interval of 1 minute.

Minute(s)  Action
0-4        No notifications are sent because the entity has no severity.
4-5        No notifications are sent because the entity’s severity is not continuously warning or critical for one minute.
5-6        No notifications are sent because the entity has no severity.
7          A triggered notification is sent because the entity’s severity has been in warning state for at least one minute.
7-12       Reminders are sent every minute as the entity is still in warning or critical state.
13         A clear notification is sent because the entity has had no severity for at least one minute.

Integrations

The app integrates with the following third-party systems:

Instrumentation

The app leverages an in-house StatsD client to record metrics regarding its internal state. The following is a complete list of metrics collected by the app:

Metric                        Type       Unit                Dimensions                                    Description
Notifications Queued          gauge                          instance, system, notification_id             The total number of notifications currently queued to be sent.
Notifications Sent            gauge                          instance, system, notification_id             The total number of notifications sent.
Notification Count            counter                        instance, system, notification_id             The notification count accrued over regular intervals.
Notifications Rejected        gauge                          instance, system, notification_id             The total number of notifications that were rejected. This may occur if the queue is already full.
Notifications Failed          gauge                          instance, system, notification_id             The total number of notifications that were attempted to be sent but failed.
Notifications Evicted         gauge                          instance, system, notification_id             The total number of notifications evicted from the queue. This may occur if the configuration associated with the notification was removed.
Notification Failed At        gauge      epoch milliseconds  instance, system, notification_id, target_id  The timestamp of the most recent notification failure expressed in milliseconds since 01 January 1970 UTC.
Notification Succeeded At     gauge      epoch milliseconds  instance, system, notification_id, target_id  The timestamp of the most recent successful notification expressed in milliseconds since 01 January 1970 UTC.
Notification Failure Message  attribute                      instance, system, notification_id, target_id  The notification failure message corresponding to the most recent notification failure.
Notification Queue Size       gauge                          node, namespace, pod, container, notifier     The current notification queue occupancy.
Notification Queue Capacity   gauge                          node, namespace, pod, container, notifier     The maximum number of notifications that can be queued.
Response Time                 histogram  nanoseconds         node, namespace, pod, container, notifier     The response time of the remote call that sends the notification to an external system.
Number of Entries             gauge                          node, namespace, pod, container, cache        The total number of entries held in the cache.
Average Entry Size            gauge      bytes               node, namespace, pod, container, cache        The average cache entry size.
Average Chunk Size            gauge      bytes               node, namespace, pod, container, cache        The average chunk size.
Average Entries Per Chunk     gauge                          node, namespace, pod, container, cache        The average number of entries in each chunk.
Number of Chunks              gauge                          node, namespace, pod, container, cache        The total number of chunks.
Entity Updates                gauge                          node, namespace, pod, container, cache        The number of entity updates processed.
Entity Removals               gauge                          node, namespace, pod, container, cache        The number of entity evictions processed.
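The Notification Failed At and Notification Succeeded At metrics are reported as epoch milliseconds. When reading these values outside the Web Console, a conversion along the following lines may help. The helper name `epoch_millis_to_utc` and the sample value are made up for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_utc(millis):
    """Convert milliseconds since 01 January 1970 UTC to an aware datetime."""
    return EPOCH + timedelta(milliseconds=millis)


# Hypothetical metric value, not taken from a real system.
failed_at = 1196676930000
print(epoch_millis_to_utc(failed_at).isoformat())  # 2007-12-03T10:15:30+00:00
```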

Notification storage and retrieval

Notifications are currently recorded in logs and can be retrieved from the Logs screen of the Web Console.

From the Logs screen, set the dimension filter in the From field by inputting the string {container="obcerv-app-notifications-notifier"}|logfmt followed by additional filtering parameters for the information you would like to extract.

Notification examples

All triggered notifications that have been sent:

{container="obcerv-app-notifications-notifier"}|logfmt|state="SENT"|type="TRIGGERED"

All notifications that were attempted to be sent but failed:

{container="obcerv-app-notifications-notifier"}|logfmt|state="FAILED"

All triggered notifications for the configuration called “Obcerv License”:

{container="obcerv-app-notifications-notifier"}|logfmt|state="SENT"|type="TRIGGERED"|notificationName="Obcerv License"

The following filtering tags are available:

Tag               Value(s)
state             QUEUED, SENT, FAILED, EVICTED
type              TRIGGERED, REMINDER, CLEARED
notifier          SLACK, WEBHOOK
targetName        The name of the notification target.
notificationId    The ID of the notification.
notificationName  The name of the notification.
severity          The entity’s severity or the group’s maximum severity.
message           A message providing additional context regarding the notification state.
group             The triggered group (for example, [Kind: Database, Environment: DevOps]).
triggeredCount    The number of entities that have been triggered.
criticalCount     The number of critical entities.
warningCount      The number of entities that have a warning severity.
clearedCount      The number of entities that have been cleared.
dimensions        The entity dimensions (for example, {pod=web-console-abc, node=itrlab}).
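If you assemble these filter strings often, a small helper can reduce typing mistakes. The following sketch builds a filter in the same shape as the examples in this section. The `build_log_filter` function is hypothetical and is not part of Obcerv, and it does not escape quotation marks inside values.

```python
def build_log_filter(container, **tags):
    """Build a Logs screen filter string for the given container and tags.

    Tags are appended in the order given, each as tag="value".
    """
    parts = [f'{{container="{container}"}}', "logfmt"]
    parts += [f'{tag}="{value}"' for tag, value in tags.items()]
    return "|".join(parts)


query = build_log_filter(
    "obcerv-app-notifications-notifier",
    state="SENT",
    type="TRIGGERED",
    notificationName="Obcerv License",
)
print(query)
```

The resulting string matches the “Obcerv License” example and can be pasted into the From field. The same helper applies to audit logs by passing the obcerv-app-notifications container and a class="AUDIT" tag.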

Audit log storage and retrieval

Audit logs can be retrieved from the Logs screen of the Web Console.

From the Logs screen, set the dimension filter in the From field by inputting the string {container="obcerv-app-notifications"}|logfmt|class="AUDIT" followed by additional filtering parameters for the information you would like to extract.

Audit log examples

All notifications that have been created, updated, or deleted by user djohn:

{container="obcerv-app-notifications"}|logfmt|class="AUDIT"|resource="NOTIFICATION"|user="djohn"

All targets that were deleted by user admin:

{container="obcerv-app-notifications"}|logfmt|class="AUDIT"|resource="TARGET"|action="DELETE"|user="admin"

Notifications or targets that were updated by any user:

{container="obcerv-app-notifications"}|logfmt|class="AUDIT"|action="UPDATE"

The following filtering tags are available:

Tag       Value(s)
class     AUDIT
action    CREATE, UPDATE, DELETE
resource  NOTIFICATION, TARGET
type      SLACK, WEBHOOK, SERVICE_NOW, NONE
name      Name of the notification or target.
id        ID of the notification or target.
message   A human-readable audit message.
user      User that initiated the change.