Training File

Describes how to use a training file to train learning-based alert management policies.


Introduction

The training file is a CSV (comma-separated values) file. It serves as the input data from which machine learning learns alert patterns and drives the following:

  • First response
  • Alert escalation

The data is saved in a tabular format: a CSV file is a plain-text file in which each line is a row and the values in a row are separated by commas. The training file plays a key role in learning-based alert management.

Creating a CSV file

Create a new CSV file or modify the sample CSV file (downloadable from OpsRamp).

  • Be sure to save new CSV files as CSV UTF-8 (Comma delimited).
  • Cells can be left empty (without providing any values) to indicate All Other use cases.
Empty Cell in CSV File


When the machine-learning algorithm makes a prediction, it uses the row that has the most exact matches. If two rows have the same number of exactly matching values, the algorithm uses the first row it encounters.
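The selection rule described above can be illustrated with a minimal Python sketch. This is not the product's actual algorithm. The function name pick_row and its parameters are hypothetical, and the sketch assumes that a row whose non-empty cell conflicts with the alert is skipped, which the text does not state explicitly but the empty-cell behavior suggests.

```python
def pick_row(rows, alert, input_columns):
    """Return the row with the most exact matches; ties keep the earliest row."""
    best_row, best_score = None, -1
    for row in rows:
        score = 0
        for column in input_columns:
            cell = row.get(column, "")
            if cell == "":
                continue  # empty cell: covers all other values, not an exact match
            if alert.get(column) != cell:
                score = -1  # assumption: a conflicting value rules the row out
                break
            score += 1
        # Strict ">" keeps the first row encountered when scores are equal.
        if score > best_score:
            best_row, best_score = row, score
    return best_row
```

Because the comparison is strict, a later row replaces an earlier one only when it has more exact matches, which mirrors the first-row tie-break.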

Example of empty cell usage

A user wants to route alerts in the following manner:

  • Alerts for the disk.utilization metric that are triggered on Windows resources go to the Windows Disk Management Support team.
  • Alerts for all other metrics on Windows resources go to the Windows Support team.

The following shows an empty metric cell in the second row, which indicates all other metrics.

Empty Cell in CSV File
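As a rough sketch, the routing example above could be generated with Python's csv module. The resource type value Windows and the reduced column set (the downloadable sample also includes clientUniqueId and component) are assumptions for illustration, and the function names are hypothetical.

```python
import csv
import io

HEADER = ["metric", "resource.generalInfo.resourceType", "incident.assigneeGroup.name"]
ROWS = [
    ["disk.utilization", "Windows", "Windows Disk Management Support"],
    ["", "Windows", "Windows Support"],  # empty metric cell: all other metrics
]

def render_training_csv(header, rows):
    """Return the training data as comma-delimited text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

def save_training_csv(path, header, rows, encoding="utf-8"):
    """Write the file as UTF-8; pass encoding="utf-8-sig" to add a byte-order mark."""
    with open(path, "w", newline="", encoding=encoding) as handle:
        handle.write(render_training_csv(header, rows))
```

The second data row starts with a comma, which is how an empty first cell appears in the saved file. Whether a byte-order mark is required is not stated here, so the encoding is left as a parameter.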


Sample CSV file for alert escalation

The sample CSV file (downloadable from OpsRamp) contains the fields clientUniqueId, metric, resource.generalInfo.resourceType, component, and incident.assigneeGroup.name.

Sample CSV file for first response policy

The sample CSV file (downloadable from OpsRamp) contains the fields clientUniqueId, metric, resource.generalInfo.resourceType, component, and suppressed.

Sample Alert Suppression


Key considerations

Alert, resource, and incident attribute names and metric names are case-sensitive.

The following table provides the metric name representation for a particular monitor.

Monitor Name and Agent

Monitor Name          G2 Agent
Disk Utilization      DISK
Memory Utilization    MEMORY
Windows Service       WINDOWS_SERVICES

Input column attributes

The types of input columns available are predefined attributes and custom attributes.

  • Predefined: Predefined attributes are the default attributes supported for a training file. The attributes are derived from alert and resource attributes. Use the Get Alert API to retrieve alert attributes and the Get Resource API to retrieve resource attributes.

Following are example column names for alert and alert resource attributes:

Alert attributes     Alert resource attributes
clientUniqueId       resource.state
metric               resource.generalInfo.resourceType
component            resource.generalInfo.make
alertType            resource.generalInfo.osName
currentState         resource.location.name
status               resource.deviceGroup.name
priority             resource.serviceGroup.name
elapsedTimeString    -
healedTimeString     -
repeatCount          -

  • Custom: Custom attributes are user-defined attributes that drive alert escalation and suppression. For example, a critical application, myapp, is associated with a custom attribute called myappName with a value of myapp1. Defining custom attributes in the training file allows machine learning to act on myapp alerts.

To use a custom attribute:

  1. Create a training file.
  2. Specify a column with resource.tag.<tag_name>. For example, resource.tag.myapp1.
  3. Add the values of the custom attributes.
    Note: Multiple tag names are allowed in the columns.
Example CSV File with Custom Attributes
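The column naming convention above can be sketched in Python as follows. The helper names tag_column and custom_tag_names are hypothetical, and the base header reuses fields from the sample files; this is only an illustration of the resource.tag.<tag_name> pattern.

```python
TAG_PREFIX = "resource.tag."

def tag_column(tag_name):
    """Build the column name for a custom attribute (tag)."""
    if not tag_name:
        raise ValueError("tag name must not be empty")
    return TAG_PREFIX + tag_name

def custom_tag_names(header):
    """List the tag names referenced by a training file header."""
    return [c[len(TAG_PREFIX):] for c in header if c.startswith(TAG_PREFIX)]

# Multiple tag columns can sit next to predefined columns.
header = ["metric", "resource.generalInfo.resourceType", tag_column("myapp1")]
```

The prefix check is case-sensitive, in line with the note that attribute names are case-sensitive.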


Considerations

The following are considerations for adding resource group or service group attributes:

  • To specify more than one device group or service group in the resource.deviceGroup.name and resource.serviceGroup.name columns, list each group in a separate row.
  • If a group has parents, use the full path to specify a child group. For example, to specify a grandChild group, provide the full path as Parent > Child > grandChild. Similarly, to specify a child group, provide the path as Parent > Child in the training file.
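The two considerations above can be sketched in Python. The helper names group_path and rows_for_groups are hypothetical, and the row is modeled as a plain dictionary keyed by column name.

```python
SEPARATOR = " > "

def group_path(groups):
    """Join a group and its ancestors, outermost first, into a full path."""
    return SEPARATOR.join(groups)

def rows_for_groups(base_row, column, paths):
    """Return one copy of base_row per group path, each in its own row."""
    return [{**base_row, column: path} for path in paths]

paths = [group_path(["Parent", "Child"]), group_path(["Parent", "Child", "grandChild"])]
rows = rows_for_groups({"metric": "disk.utilization"}, "resource.deviceGroup.name", paths)
```

Here rows holds two entries that differ only in the resource.deviceGroup.name value, one per group.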

Output column attributes

The following column names are used for learned configurations:

Alert Escalation:

  • incident.assigneeGroup.name
  • incident.category.name
  • incident.subCategory.name
  • incident.priority
  • incident.cc
  • incident.businessImpact.name
  • incident.urgency.name
  • incident.knowledgeArticleIds

First Response: suppressed
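A training file header could be checked against these output columns before upload. The following Python sketch is one way to do that; the function name output_columns and the policy labels "escalation" and "first_response" are hypothetical and not part of the product.

```python
ESCALATION_OUTPUTS = {
    "incident.assigneeGroup.name",
    "incident.category.name",
    "incident.subCategory.name",
    "incident.priority",
    "incident.cc",
    "incident.businessImpact.name",
    "incident.urgency.name",
    "incident.knowledgeArticleIds",
}
FIRST_RESPONSE_OUTPUTS = {"suppressed"}

def output_columns(header, policy):
    """Return the header columns that are valid output columns for the policy."""
    allowed = {
        "escalation": ESCALATION_OUTPUTS,
        "first_response": FIRST_RESPONSE_OUTPUTS,
    }.get(policy)
    if allowed is None:
        raise ValueError(f"unknown policy: {policy!r}")
    # Exact comparison, because attribute names are case-sensitive.
    return [column for column in header if column in allowed]
```

An empty result suggests the file has no output column for that policy type, or that a column name differs in case from the names listed above.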

FAQs

What are the agent versions?

  • Windows Agent: G2 Agent versions are in the 6 series. For example: 6.00.0011
  • Linux Agent: G2 Agent versions are in the 4 or 5 series. For example: 5.2.1-1
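If agent versions are collected in a script, the series check above could be sketched in Python as follows. The function name is_g2_agent and the platform labels are hypothetical, and the check relies only on the leading series number given in this FAQ.

```python
G2_SERIES = {
    "windows": {"6"},
    "linux": {"4", "5"},
}

def is_g2_agent(platform, version):
    """Check whether a version string falls in the G2 series for the platform."""
    series = version.strip().split(".", 1)[0]
    return series in G2_SERIES.get(platform.lower(), set())
```

For example, is_g2_agent("Windows", "6.00.0011") and is_g2_agent("Linux", "5.2.1-1") both return True under these assumptions.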

How do I check the agent version on a resource?

To check the agent version:

  1. From All Clients, select a client.
  2. Select Infrastructure, select the resource type, and click the resource name.
  3. From the left pane, click Attributes.
  4. Click More Information and view the OpsRamp Agent Version.