Notifications

The Notifications app closes the loop between data collection, signal generation, and alerting. When the condition of your IT estate needs attention, the app notifies you through its integrations with external systems such as Slack.

You can configure notifications in two ways: either for groups of entities or for individual entities. Grouped notifications are enabled if there is at least one grouping in the configuration.

Grouped notification lifecycle

Grouped notifications are formed by bundling Obcerv entities that have common characteristics. For example, if you group by container, the app creates one group per Kubernetes container. Each group grows or shrinks automatically as entities such as pods or volumes match or cease to match the grouping parameters.

Grouped notifications are triggered when at least one entity in the group has had a warning or critical severity for longer than the configured trigger interval. Once a notification is triggered, reminders are sent periodically at the reminder trigger interval.

The notification is cleared once every triggered entity in the group has had no severity for longer than the clear trigger interval and no additional entities in the group have been triggered.
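
As a rough illustration of this lifecycle, the following Python sketch derives a group-level event from per-entity triggered flags. The function name, the flag dictionary, and the event strings are hypothetical and are not part of the app; reminders are left out for brevity.

```python
from typing import Dict, Optional


def group_event(was_triggered: bool, entity_triggered: Dict[str, bool]) -> Optional[str]:
    """Return the group-level event implied by the per-entity flags, if any.

    entity_triggered maps an entity name to True while that entity counts as
    triggered, that is, it exceeded the warning or critical trigger interval
    and has not yet exceeded the clear trigger interval.
    """
    now_triggered = any(entity_triggered.values())
    if now_triggered and not was_triggered:
        return "TRIGGERED"  # at least one entity in the group was triggered
    if was_triggered and not now_triggered:
        return "CLEARED"  # every entity has cleared and none is newly triggered
    return None  # no change in the group's state


print(group_event(False, {"pod-a": True, "pod-b": False}))  # TRIGGERED
print(group_event(True, {"pod-a": False, "pod-b": True}))   # None
print(group_event(True, {"pod-a": False, "pod-b": False}))  # CLEARED
```

In the second call the group stays triggered even though pod-a has cleared, because pod-b has been triggered in the meantime.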

Entity notification lifecycle

Entity notifications are triggered when an individual entity has had a warning or critical severity for longer than the configured trigger interval. Once a notification is sent, reminders are sent periodically at the reminder trigger interval. The notification is cleared when the entity has had no severity for longer than the clear trigger interval.

Configuration

To create a new configuration, follow these steps:

  1. From the Web Console, select Alerting > Notifications.
  2. Click Add New.

Create notification

  1. Specify the Name for the new configuration.
  2. In the Conditions field, specify conditions that include or exclude entities from evaluation. The conditions apply to both individual entity notifications and grouped notifications.
  3. In the Group by field, specify the attributes or dimensions whose common values determine how entities are grouped. A single notification is sent for each group.

A group is an implicit filter so any entities that do not have the group attributes or dimensions will be excluded. For example:

Entity (illustrative name)  Environment  OS
Prod_Linux                  Prod         Linux
Prod_Mac                    Prod         MacOS
Prod_Win                    Prod         Windows
Test_Linux                  Test         Linux
Test_Mac                    Test         MacOS
Test_Win                    Test         Windows
Dev_Linux                   Dev          Linux
Dev_Mac                     Dev          MacOS
Dev_Win                     Dev          Windows
Dev_None                    Dev          (none)

Given the above entities and a Group by setting of OS, the following groups are created:

Group    Entities
Linux    Prod_Linux, Test_Linux, Dev_Linux
MacOS    Prod_Mac, Test_Mac, Dev_Mac
Windows  Prod_Win, Test_Win, Dev_Win

Only one notification per group will be sent and entity Dev_None will be disregarded since it doesn’t have the group attribute OS.

  4. Click Add Target, choose a connection type, and specify the settings to use. For more information, see Triggers and Messages. When done, click OK.
  5. Click Save to create the new configuration.
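
The implicit filtering in the Group by example above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the app's implementation; the `group_by` function and the dictionary layout are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Illustrative entities from the example table; Dev_None has no OS attribute.
ENTITIES: Dict[str, Dict[str, str]] = {
    "Prod_Linux": {"Environment": "Prod", "OS": "Linux"},
    "Prod_Mac": {"Environment": "Prod", "OS": "MacOS"},
    "Prod_Win": {"Environment": "Prod", "OS": "Windows"},
    "Test_Linux": {"Environment": "Test", "OS": "Linux"},
    "Test_Mac": {"Environment": "Test", "OS": "MacOS"},
    "Test_Win": {"Environment": "Test", "OS": "Windows"},
    "Dev_Linux": {"Environment": "Dev", "OS": "Linux"},
    "Dev_Mac": {"Environment": "Dev", "OS": "MacOS"},
    "Dev_Win": {"Environment": "Dev", "OS": "Windows"},
    "Dev_None": {"Environment": "Dev"},
}


def group_by(entities: Dict[str, Dict[str, str]], keys: List[str]) -> Dict[Tuple[str, ...], List[str]]:
    """Bundle entities by the values of the given attributes (hypothetical helper)."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for name, attributes in entities.items():
        if not all(key in attributes for key in keys):
            continue  # implicit filter: entities lacking a group attribute are excluded
        groups[tuple(attributes[key] for key in keys)].append(name)
    return dict(groups)


for group, members in group_by(ENTITIES, ["OS"]).items():
    print(group, members)
```

Grouping by OS yields three groups, and Dev_None appears in none of them. Passing `["Environment", "OS"]` instead would yield one group per combination of environment and operating system.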

Triggers

A notification is triggered and cleared based on an entity’s severity.

Warning / Critical (triggered)

If an entity has had a critical or warning severity for longer than the supplied duration, then it is considered triggered and a notification is sent, either individually for the entity or as part of a group (if groups are configured). Where a severity is rapidly changing between warning and critical states, the warning trigger also considers entities with critical severity.

Cleared

If an entity has been previously triggered and has had no severity for the supplied duration, then it is considered cleared and a notification is sent indicating that the entity is healthy.

Reminders

Reminder notifications apply to triggered entities and groups, and are periodically sent as a reminder that the entity/group is still in a triggered state.

Messages

Notification messages must be configured for each enabled trigger. Each message consists of a main message body and an optional short title. How the title is used depends on the integration. For example, Slack notifications use it as a header and as the text of push notifications, whereas Webhooks do not use it.

Messages may contain placeholders of the form ${placeholder} that the app interpolates. To see the available placeholders, click the placeholder icon.

Placeholders

Some placeholders are only available for grouped notifications while others are only available for entity notifications. The supported placeholders can be seen below.

Placeholder        Group  Entity  Description
${date}            Yes    Yes     The current date in UTC (for example, 2011-12-03).
${time}            Yes    Yes     The current time in UTC (for example, 15:14:11).
${dateTime}        Yes    Yes     The current date-time in UTC (for example, 2007-12-03T10:15:30).
${url}             Yes    Yes     The URL to the Notifications app.
${severity}        Yes    Yes     The entity’s severity or the group’s maximum severity.
${entity}          No     Yes     The entity object as a JSON object.
${dimensions}      No     Yes     The entity dimensions as a JSON object.
${triggeredCount}  Yes    No      The number of entities in the group that have been triggered.
${criticalCount}   Yes    No      The number of entities in the group that have critical severity.
${okCount}         Yes    No      The number of entities in the group that are triggered but have no severity at the time the notification is sent. This is relevant for reminder notifications where an entity’s severity has recently been cleared but does not meet the configured cleared trigger criteria yet.
${warningCount}    Yes    No      The number of entities in the group that have a warning severity.
${clearedCount}    Yes    No      The number of entities in the group that are no longer triggered.
${group}           Yes    No      The entity group as a list of key/value pairs (e.g. [Kind: Database, Environment: DevOps]).
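
To get a feel for how a message with placeholders expands, the sketch below uses Python's standard `string.Template`, which happens to share the ${placeholder} syntax. It only mimics the syntax: the app's own interpolation may differ in details such as how unknown placeholders are handled, and the `render` helper and sample values are hypothetical.

```python
from string import Template
from typing import Mapping


def render(message: str, values: Mapping[str, object]) -> str:
    """Replace ${name} placeholders; unknown placeholders are left untouched."""
    return Template(message).safe_substitute(values)


# Sample values for a grouped notification (made up for this example).
values = {
    "severity": "CRITICAL",
    "triggeredCount": 3,
    "group": "[Kind: Database, Environment: DevOps]",
    "dateTime": "2007-12-03T10:15:30",
}

body = "[${severity}] ${triggeredCount} entities triggered in ${group} at ${dateTime}"
print(render(body, values))
```

With these values, the message renders as `[CRITICAL] 3 entities triggered in [Kind: Database, Environment: DevOps] at 2007-12-03T10:15:30`.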

Illustrative example

[Figure: Severity example timeline, with legend]

Consider the timeline above of an entity’s severity, where the numbers represent minutes, and assume that every trigger is configured with a duration of 1 minute.

Minute(s)  Action
0-4        No notifications are sent because the entity has no severity.
4-5        No notifications are sent because the entity’s severity is not continuously warning or critical for one minute.
5-6        No notifications are sent because the entity has no severity.
7          A triggered notification is sent because the entity’s severity has been in warning state for at least one minute.
7-12       Reminders are sent every minute as the entity is still in warning or critical state.
13         A clear notification is sent because the entity has had no severity for at least one minute.
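
The sequence in the table can be reproduced with a small simulation. The Python sketch below is a simplified model, not the app's evaluation logic: it checks the entity once per minute, treats a duration as met when the state has lasted at least that long, and uses assumed severity change points chosen to be consistent with the table. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Change = Tuple[float, Optional[str]]  # (minute, severity or None for no severity)

# Assumed change points, chosen to be consistent with the table above.
CHANGES: List[Change] = [
    (0, None), (4.5, "warning"), (5, None),
    (6, "warning"), (9, "critical"), (12, None),
]


def state_since(changes: List[Change], t: float) -> Tuple[bool, float]:
    """Return whether the entity has a severity at time t, and since when."""
    has_severity, since = False, 0.0
    for when, severity in changes:
        if when > t:
            break
        current = severity is not None  # warning and critical both count
        if current != has_severity:
            has_severity, since = current, when
    return has_severity, since


def simulate(changes: List[Change], end: int, trigger: float = 1.0,
             clear: float = 1.0, reminder: float = 1.0) -> List[Tuple[int, str]]:
    """Evaluate the entity once per minute and collect the notifications sent."""
    events: List[Tuple[int, str]] = []
    triggered, last_sent = False, 0.0
    for t in range(end + 1):
        has_severity, since = state_since(changes, t)
        if not triggered:
            if has_severity and t - since >= trigger:
                events.append((t, "TRIGGERED"))
                triggered, last_sent = True, t
        elif not has_severity and t - since >= clear:
            events.append((t, "CLEARED"))
            triggered = False
        elif t - last_sent >= reminder:
            events.append((t, "REMINDER"))
            last_sent = t
    return events


for minute, event in simulate(CHANGES, end=13):
    print(minute, event)
```

Under these assumptions the simulation emits a triggered notification at minute 7, reminders at minutes 8 through 12, and a cleared notification at minute 13. The short warning between minutes 4.5 and 5 never lasts long enough to trigger anything.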

Integrations

The app integrates with the following third-party systems: Slack and Webhooks.

Instrumentation

The app leverages an in-house StatsD client to record metrics regarding its internal state. The following is a complete list of metrics collected by the app:

Metric                       Type       Unit         Dimensions                                 Description
Notifications Queued         gauge      -            instance, system, notification_id          The total number of notifications currently queued to be sent.
Notifications Sent           gauge      -            instance, system, notification_id          The total number of notifications sent.
Notification Count           counter    -            instance, system, notification_id          The notification count accrued over regular intervals.
Notifications Rejected       gauge      -            instance, system, notification_id          The total number of notifications that were rejected. This may occur if the queue is already full.
Notifications Failed         gauge      -            instance, system, notification_id          The total number of notifications that were attempted to be sent but failed.
Notifications Evicted        gauge      -            instance, system, notification_id          The total number of notifications evicted from the queue. This may occur if the configuration associated with the notification was removed.
Notification Queue Size      gauge      -            node, namespace, pod, container, notifier  The current notification queue occupancy.
Notification Queue Capacity  gauge      -            node, namespace, pod, container, notifier  The maximum number of notifications that can be queued.
Response Time                histogram  nanoseconds  node, namespace, pod, container, notifier  The response time of the remote call that sends the notification to an external system.
Number of Entries            gauge      -            node, namespace, pod, container, cache     The total number of entries held in the cache.
Average Entry Size           gauge      bytes        node, namespace, pod, container, cache     The average cache entry size.
Average Chunk Size           gauge      bytes        node, namespace, pod, container, cache     The average chunk size.
Average Entries Per Chunk    gauge      -            node, namespace, pod, container, cache     The average number of entries in each chunk.
Number of Chunks             gauge      -            node, namespace, pod, container, cache     The total number of chunks.
Entity Updates               gauge      -            node, namespace, pod, container, cache     The number of entity updates processed.
Entity Removals              gauge      -            node, namespace, pod, container, cache     The number of entity evictions processed.

Notification storage and retrieval

Notifications are currently recorded in logs and can be retrieved from the Logs screen of the Web Console.

From the Logs screen, set the dimension filter by entering the string {container="obcerv-app-notifications-notifier"}|logfmt in the From field, followed by additional filters for the information you want to extract.

Examples

All triggered notifications that have been sent:

{container="obcerv-app-notifications-notifier"}|logfmt|state="SENT"|type="TRIGGERED"

All notifications that were attempted to be sent but failed:

{container="obcerv-app-notifications-notifier"}|logfmt|state="FAILED"

All triggered notifications for the configuration called “Obcerv License”:

{container="obcerv-app-notifications-notifier"}|logfmt|state="SENT"|type="TRIGGERED"|notificationName="Obcerv License"

The following filtering tags are available:

Tag               Value(s)
state             QUEUED, SENT, FAILED, EVICTED
type              TRIGGERED, REMINDER, CLEARED
notifier          SLACK, WEBHOOK
targetName        The name of the notification target.
notificationId    The ID of the notification.
notificationName  The name of the notification.
severity          The entity’s severity or the group’s maximum severity.
message           A message providing additional context regarding the notification state.
group             The triggered group (for example, [Kind: Database, Environment: DevOps]).
triggeredCount    The number of entities that have been triggered.
criticalCount     The number of critical entities.
warningCount      The number of entities that have a warning severity.
clearedCount      The number of entities that have been cleared.
dimensions        The entity dimensions (for example, {pod=web-console-abc, node=itrlab}).
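
If you build these filter strings often, a small helper can assemble them from the tags above. The Python sketch below follows the pattern of the example queries (container selector, then logfmt, then one equality filter per tag). The `build_query` function is hypothetical, and it rejects values containing double quotes rather than guessing at an escaping rule.

```python
CONTAINER = "obcerv-app-notifications-notifier"


def build_query(**tags: str) -> str:
    """Join the container selector, logfmt, and one filter per tag with '|'."""
    parts = ['{container="%s"}' % CONTAINER, "logfmt"]
    for tag, value in tags.items():
        if '"' in value:
            raise ValueError("tag values containing double quotes are not handled")
        parts.append('%s="%s"' % (tag, value))
    return "|".join(parts)


print(build_query(state="SENT", type="TRIGGERED", notificationName="Obcerv License"))
```

The call above produces the same string as the “Obcerv License” example query. Filters are emitted in the order in which the keyword arguments are given.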