Alert correlation is the process of grouping similar alerts into an inference, which reduces the load of processing multiple alerts individually.

Alert correlation is site-specific. Alerts from different sites must be managed separately, so they are not correlated with each other.

Prerequisites

  • OpsQ View and OpsQ Manage permissions are required to access alert correlation policies.
  • Partner Administrator or Client Administrator roles are required to create an alert correlation policy.

Create an alert correlation policy

  1. From All Clients, select a client.

  2. Go to Setup > Alerts > Alert Correlation and click Create New.

  3. From CREATE ALERT CORRELATION POLICY, enter a policy Name.

  4. Select the Client and Mode.

  5. In the Filter Criteria, toggle the Apply Filter Criteria button to ON.

  6. Choose ANY or ALL to specify rule-matching constraints.

  7. Select Native Attributes to filter resources based on predefined attributes.

  8. Select the rule conditions you want from the drop-down list, entering values as needed.

  9. Click the + symbol to add more rules.

  10. In the Policy Definition section, enter the Inference subject. You can use alert and resource tokens to configure the inference subject. If a subject is not entered, the subject of the first alert is used as the inference subject.

  11. To specify how alerts are correlated over time, select either Alert sequence recommended by the machine learning model or Within time window.

    • If you selected Alert sequence recommended by the machine learning model, you can upload a CSV file to configure topology.
    • If you selected the Within time window option, select the time from the drop-down list.
  12. Optionally, click +Add Similarity Rule, select the attribute, and specify the matching condition from the drop-down list.
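
The ANY and ALL options in steps 6 through 9 can be pictured as combining the individual rule results with a logical OR or a logical AND. The Python sketch below is illustrative only: the Rule class, the operator names, and the attribute keys are assumptions made for this example and are not the product's data model or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operators; the conditions offered in the product's
# drop-down list may differ.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts_with": lambda actual, expected: actual.startswith(expected),
}


@dataclass
class Rule:
    attribute: str  # illustrative attribute key, e.g. "resource_type"
    operator: str   # key into OPERATORS
    value: str

    def matches(self, resource: Dict[str, str]) -> bool:
        actual = resource.get(self.attribute)
        if actual is None:
            return False  # a missing attribute never matches
        return OPERATORS[self.operator](actual, self.value)


def passes_filter(resource: Dict[str, str], rules: List[Rule], mode: str) -> bool:
    """Return True if the resource satisfies the rules under ANY or ALL."""
    results = (rule.matches(resource) for rule in rules)
    if mode == "ALL":
        return all(results)
    if mode == "ANY":
        return any(results)
    raise ValueError(f"unknown mode: {mode}")


rules = [
    Rule("resource_type", "equals", "VMware"),
    Rule("name", "starts_with", "prod-"),
]
resource = {"resource_type": "VMware", "name": "test-vm-01"}

print(passes_filter(resource, rules, "ANY"))  # True: the first rule matches
print(passes_filter(resource, rules, "ALL"))  # False: the name rule fails
```

With ANY, one matching rule is enough for the resource to pass the filter; with ALL, every rule must match.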

Edit an alert correlation policy

  1. Select a client from the All Clients list.
  2. Go to Setup > Alerts > Alert Correlation. The ALERT CORRELATION POLICIES page is displayed.
  3. From the ALERT CORRELATION POLICIES page, click the required alert correlation policy name.
  4. Click Edit and configure the policy details.
  5. Click Save.

Change an alert correlation policy state

  1. Select a client from the All Clients list.
  2. Go to Setup > Alert Management > Alert Correlation. The Alert Correlation Policy page is displayed with the list of all Alert Correlation Policies created.
  3. From the Alert Correlation Policy page, select the desired mode from the Mode drop-down menu. The selected mode is displayed in the Mode column.

Delete an alert correlation policy

You can delete an alert correlation policy when it is no longer needed. After a policy is deleted, newly ingested alerts that match the deleted policy are no longer correlated. You might delete an alert correlation policy in the following situations:

  • The device/resource generating the alerts is unavailable.
  • You do not want to correlate the alerts.

To delete the alert correlation policy:

  1. Select a client from the All Clients list.
  2. Go to Setup > Alert Management > Alert Correlation.
  3. From ALERT CORRELATION POLICIES LIST, select the checkbox for the desired policy name and click Delete.
  4. From the confirmation popup, click Yes to delete. The selected alert correlation policy is deleted.

Define correlation precedence

Precedence determines the order of execution for an alert correlation policy. For example, if VMware is part of an agent status alert correlation policy and a network outage alert correlation policy, you can determine which alert correlation policy should execute first to correlate VMware alerts.

To determine the precedence:

  1. From the All Clients list, select a client.
  2. Go to Setup > Alerts > Alert Correlation.
  3. Drag and place the inference in the appropriate row to adjust the order. The number in the alert correlation policy Precedence column changes accordingly.
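
As a rough mental model of precedence, the sketch below evaluates policies in precedence order and returns the first one whose criteria match an alert. It is a simplified illustration, not the product's implementation: the Policy type, the policy names, the rule that a lower number runs first, and the "first match wins" behavior are all assumptions made for this example.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

Alert = Dict[str, str]


class Policy(NamedTuple):
    name: str
    precedence: int                   # assumption: lower number runs first
    matches: Callable[[Alert], bool]  # stand-in for the policy's filter criteria


def first_matching_policy(alert: Alert, policies: List[Policy]) -> Optional[str]:
    """Evaluate policies in precedence order and return the first that matches."""
    for policy in sorted(policies, key=lambda p: p.precedence):
        if policy.matches(alert):
            return policy.name
    return None


# Two hypothetical policies that both apply to VMware resources.
policies = [
    Policy("network-outage", 2, lambda a: a.get("resource_type") == "VMware"),
    Policy("agent-status", 1, lambda a: a.get("resource_type") == "VMware"),
]

alert = {"resource_type": "VMware", "metric": "agent.status"}
print(first_matching_policy(alert, policies))  # agent-status
```

Swapping the two precedence values makes the network-outage policy the first to evaluate the same alert.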

View alert sequences

The Alert Sequence Clusters window helps you to visualize the detected alert sequences in your environment. You can view the alert sequences detected from the existing alert data and sequences related to an inference.

These sequences are unmodified alert sequences fetched from the existing alert data.

Similar alert sequences are grouped, and a count is shown for each sequence, to help you visualize the alert sequences and the number of times alerts are triggered in each sequence.

The Alert Sequence Clusters window serves as a verification of ML correlation. For example, if ML correlates the alerts cpu.utilization and system.ping, you can use the Alert Sequence Clusters window to find the sequences that contain both cpu.utilization and system.ping.
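
The grouping and counting described above can be sketched in a few lines of Python. The sequences below are made-up sample data, and the helper names cluster_sequences and sequences_containing are hypothetical; the sketch only shows the idea of counting identical sequences and then looking for those that contain a given pair of metrics.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Sequence = Tuple[str, ...]

# Made-up sequences of alert metric names, in the order the alerts fired.
observed_sequences: List[Sequence] = [
    ("cpu.utilization", "system.ping"),
    ("cpu.utilization", "system.ping"),
    ("memory.utilization", "cpu.utilization", "system.ping"),
    ("disk.utilization",),
]


def cluster_sequences(sequences: Iterable[Sequence]) -> Counter:
    """Group identical sequences and count how often each one occurs."""
    return Counter(sequences)


def sequences_containing(clusters: Counter, metrics: Iterable[str]) -> List[Tuple[Sequence, int]]:
    """Return (sequence, count) pairs whose sequence includes every given metric."""
    wanted = set(metrics)
    return [
        (seq, count)
        for seq, count in clusters.most_common()
        if wanted.issubset(seq)
    ]


clusters = cluster_sequences(observed_sequences)
for seq, count in sequences_containing(clusters, ["cpu.utilization", "system.ping"]):
    print(count, " -> ".join(seq))
```

The disk.utilization sequence is filtered out because it contains neither of the two metrics being verified.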

To view the alert sequences detected from existing alert data:

  1. Select a client from the All Clients list.
  2. Go to Setup > Alert Management > Alert Correlation.
  3. Click an ML-based alert correlation policy. Note: You can easily identify an ML-based alert correlation policy. The ML Status against the policy contains a status, such as Training Started or Ready.
  4. From the Policy Definition field, click Detected alert sequence patterns in alert data.

To view alert sequences related to an inference:

  1. Click All Clients and select a client.
  2. Go to Alerts.
  3. Click the required inference name. The Alert Details page is displayed.
  4. Select the Correlated Alerts tab to display the list of correlated alerts.
  5. Click Show detected alert sequence patterns.

Remove alerts from an inference

You can remove alerts from an inference. The alerts can be removed from either the Quick view window or the Alert Details page.

For example, if you do not want an alert to be correlated, you can remove it from the inference. The removed alert appears in the Alerts Browser as an individual alert.

If an inference has two correlated alerts, removing one of the alerts creates individual alerts and the inference is automatically correlated.

To remove alerts from the quick view:

  1. On the Alerts Browser page, enter the alert ID in the search box. The alert is displayed on the Alerts Browser page along with the number of correlated alerts.
  2. Click the number adjacent to the alert subject.
    Number of Correlated Alerts

  3. Select the required alert and click Remove.
    Number of Correlated Alerts

The alert is removed from the inference. A comment appears in the Details tab, as shown in the following screenshot.

Alert Removed from an Inference

View inference statistics

The Inference Stats widget displays the statistics of inferences generated within a partner or client.

The widget has the following properties:

Property                 Description
Total Events             The total number of events generated.
Total Alerts             The total number of alerts created after ingestion.
Total Inferences         The total number of inferences generated.
Total Correlated Alerts  The total number of alerts correlated.
Volume Optimized         The percent reduction in alert volume from alert correlation.
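
This document does not state how the Volume Optimized percentage is computed. One plausible reading, assumed here purely for illustration, is that each group of correlated alerts is replaced by a single inference, so the reduction is the number of correlated alerts minus the number of inferences, divided by the total number of alerts. The function name and the sample numbers below are made up.

```python
def volume_optimized(total_alerts: int, correlated_alerts: int, inferences: int) -> float:
    """Percent reduction in alert volume under the assumed formula.

    Assumption: correlated alerts are hidden behind their inferences, so the
    number of items left to review is
    total_alerts - correlated_alerts + inferences.
    """
    if total_alerts <= 0:
        return 0.0
    reduced_by = correlated_alerts - inferences
    return round(100 * reduced_by / total_alerts, 2)


# Made-up numbers: 1000 alerts, 600 of them correlated into 50 inferences.
# Items left to review: 1000 - 600 + 50 = 450, a reduction of 550 out of 1000.
print(volume_optimized(total_alerts=1000, correlated_alerts=600, inferences=50))  # 55.0
```

If the widget reports a different value for the same counts, the product uses a different formula than the one assumed in this sketch.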

Create an Inference Stats widget

  1. Select a client from the All Clients list.

  2. Go to Dashboard > +Add Widget.

  3. From OTHER PREDEFINED WIDGET, click Inference Stats.

  4. Configure the following parameters:

    • Time Range: Filter for inferences triggered within a time interval. The default time interval is the last four hours.
    • Refresh every: The frequency at which the widget refreshes to display recent data. The default refresh interval is five minutes.
    • Inference Stats: The mode of inferences to include in the widget:
      • Select Enabled policies only to view the statistics of enabled (ON mode) inferences.
        • If this mode is selected, the total number of inferences and the total number of correlated alerts created from the enabled correlation policies appear on the widget.
        • In this widget, the volume optimization is based on inferences and correlated alerts created from the enabled correlation policies.
      • Select Enabled and Observed policies to view statistics of enabled and observed inferences.
        • If this mode is selected, the total number of inferences and the total number of correlated alerts created from both the enabled and observed correlation policies appear on the widget.
        • In this widget, the volume optimization is based on the inferences and correlated alerts created from both the enabled and observed correlation policies.
    • Widget Title: The name of the widget.

  5. Select the Chart Style and click Save.

The Inference Stats widget is created and appears on the dashboard.

Inference Stats Widget

Scenarios

Correlate alerts due to an unexpected cause

The DevOps team released a new code update for an application running on multiple servers. The update has a bug that causes memory utilization to exceed thresholds on each application instance, generating multiple critical and warning alerts. The volume of alerts makes it difficult for the DevOps team to diagnose the problem.

  1. Define an Alert Correlation Policy to correlate alerts that have similar content.
  2. Configure an alert condition on the Alert Source attribute to filter alerts that are generated from the same application name.
  3. Alerts that carry the application name and are generated within the specified time interval are correlated to form an inference.

Correlate alerts that share similar properties

A customer restarts the agent on VMware resources. As a result, multiple agent status alerts are generated, causing high alert noise. The customer wants the agent status alerts generated within one hour to display as a single alert to reduce the noise.

Define an Alert Correlation policy to correlate the agent status alerts generated on VMware resources within a span of one hour. Enter the metric that monitors the agent status of the resources.

  1. Filter for VMware resources on which the Alert Correlation Policy should be applied using Native/Custom attributes.
  2. From the Policy Definition section, select the time interval from the Within Time window drop-down.
  3. Click +Add Alert Similarity and select Alert Metric.
  4. From the attribute drop-down list, select the operator, and enter an Agent Status value. Alerts generated during the specified time interval are correlated to create an inference.
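
The effect of combining a time window with a similarity rule on the alert metric can be sketched as follows. This is a simplified illustration under stated assumptions, not the product's correlation logic: the Alert type, the metric name agent.status, and the choice to measure the window from the first alert in each group are all made up for this example.

```python
from datetime import datetime, timedelta
from typing import Dict, List, NamedTuple


class Alert(NamedTuple):
    alert_id: int
    metric: str
    created: datetime


def correlate(alerts: List[Alert], window: timedelta) -> List[List[Alert]]:
    """Group alerts that share a metric and fall within `window`
    of the first alert in their group."""
    open_groups: Dict[str, List[Alert]] = {}  # most recent group per metric
    groups: List[List[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a.created):
        group = open_groups.get(alert.metric)
        if group is not None and alert.created - group[0].created <= window:
            group.append(alert)  # same metric, still inside the window
        else:
            group = [alert]      # start a new group for this metric
            open_groups[alert.metric] = group
            groups.append(group)
    return groups


start = datetime(2024, 1, 1, 9, 0)
alerts = [
    Alert(1, "agent.status", start),
    Alert(2, "agent.status", start + timedelta(minutes=20)),
    Alert(3, "cpu.utilization", start + timedelta(minutes=25)),
    Alert(4, "agent.status", start + timedelta(minutes=90)),
]

for group in correlate(alerts, timedelta(hours=1)):
    print([a.alert_id for a in group])
```

In this sample, alerts 1 and 2 fall into one group, alert 3 stays separate because its metric differs, and alert 4 starts a new group because it arrives more than one hour after alert 1.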

What to do next

Review Managing Inferences