Metrics QL Engine

The Metrics QL Engine collects and stores monitoring data, and displays that data as time series graphs.

Accepted Metrics Notation

Allowed regex for metric names: [a-zA-Z_:][a-zA-Z0-9_:]*

Examples:

system_cpu_utilization

kubevirt_vmi_network_traffic_bytes_total

Limitations for Labels:

Description | Value
Accepted label notation | [a-zA-Z_][a-zA-Z0-9_]*
Maximum number of labels supported | 30
Label value length limit | No restriction
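
The naming rules above can be checked before a metric is sent or queried. The following is a minimal sketch, assuming the two regexes must match the whole name and that the 30-label limit applies per time series; the function name validate_series is hypothetical and not part of any OpsRamp API.

```python
import re

# Patterns copied from the notation rules above.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
MAX_LABELS = 30  # maximum number of labels from the table above


def validate_series(metric_name, labels):
    """Return a list of problems; an empty list means the series is acceptable."""
    problems = []
    if not METRIC_NAME_RE.fullmatch(metric_name):
        problems.append(f"invalid metric name: {metric_name!r}")
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for name in labels:
        if not LABEL_NAME_RE.fullmatch(name):
            problems.append(f"invalid label name: {name!r}")
    return problems


print(validate_series("system_cpu_utilization", {"instance": "cpu0"}))
print(validate_series("9lives", {"bad-label": "x"}))
```

Label values are not checked here because the table places no restriction on their length.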

MetricsQL API reference link: https://develop.opsramp.com/docs/v2/metricsql

OpsRamp hierarchy of Metrics

Client
  Hosts/Resources
    Number of Metrics
      Number of Instances

Limitations on Hosts or instances supported in MetricsQL query:

Example: Suppose a client has 10,000 resources that together report about 100,000 metrics.
Note: These numbers are not constant; they keep changing.
The overall time series response must not exceed the 16 MB limit; larger responses can cause the browser to crash.
Solution: Keep the query response below 5 MB. For API queries, add more labels to the PromQL query expression to narrow the result.

To avoid browser crashes, you can, for example, create separate dashboards that each query a different device type: one dashboard for Linux OS, another for Windows OS, and so on. Read further to understand more about querying.
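
The advice above can be sketched in code: split one broad query into several narrower ones, and check the size of each response against the advised limit. This is only an illustration. The label name os, the helper names, and the choice of 1 MB = 1024 × 1024 bytes are assumptions, not documented OpsRamp behavior.

```python
import json

ADVISED_LIMIT_BYTES = 5 * 1024 * 1024   # advised ceiling from the text above
HARD_LIMIT_BYTES = 16 * 1024 * 1024     # overall response limit from the text above


def classify_response_size(payload):
    """Classify a decoded JSON response by its serialized size in bytes."""
    size = len(json.dumps(payload).encode("utf-8"))
    if size > HARD_LIMIT_BYTES:
        return "over_limit"
    if size > ADVISED_LIMIT_BYTES:
        return "too_large"
    return "ok"


def queries_by_label_value(metric, label, values):
    """Build one narrowed selector per label value, e.g. one per dashboard."""
    return {value: f'{metric}{{{label}="{value}"}}' for value in values}


# "os" is a hypothetical label name used only for illustration.
for value, query in queries_by_label_value(
    "system_cpu_utilization", "os", ["linux", "windows"]
).items():
    print(value, "->", query)
```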

PromQL

Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets the user select and aggregate time series data in real time.

PromQL Expression Language Data Types:

Range Vector: A set of time series containing a range of data points over time for each time series.

Instant Vector: A set of time series containing a single sample for each time series, all sharing the same timestamp.

Scalar: A simple numeric floating point value.

String: A simple string value; currently unused.
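
To make the difference between a range vector and an instant vector concrete, the sketch below models both as plain Python structures and derives an instant vector from a range vector. The type names and the instant_at helper are illustrative only; a real engine also applies a lookback window, which is ignored here.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]               # (unix timestamp in seconds, value)
Labels = Tuple[Tuple[str, str], ...]       # (name, value) pairs identifying one series
InstantVector = Dict[Labels, Sample]       # one sample per series, shared timestamp
RangeVector = Dict[Labels, List[Sample]]   # a range of samples per series


def instant_at(rv: RangeVector, t: float) -> InstantVector:
    """Build an instant vector at time t from the newest sample at or before t."""
    out: InstantVector = {}
    for labels, samples in rv.items():
        earlier = [s for s in samples if s[0] <= t]
        if earlier:
            newest = max(earlier, key=lambda s: s[0])
            out[labels] = (t, newest[1])  # every series is stamped with t
    return out


cpu0 = (("instance", "cpu0"),)
cpu1 = (("instance", "cpu1"),)
rv: RangeVector = {cpu0: [(10.0, 1.0), (20.0, 2.0)], cpu1: [(15.0, 5.0)]}
print(instant_at(rv, 18.0))
```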

PromQL Expressions:

Query Use Case for Time Series Data | PromQL Query | Data Type | Result Type | Description
On metric level (__name__) | system_cpu_utilization | Range Vector | Matrix | Query data using the metric name (__name__) label
On metric + multiple label combination | system_cpu_utilization{type="RESOURCE",uuid="4530d51c-3b32-4a91-ae2e-160f50f50d94"} | Range Vector | Matrix | Query data using the combination of metric name, resource type, and resource unique ID
On resource type and resource unique ID combination without metric name | {type="RESOURCE",uuid="4530d51c-3b32-4a91-ae2e-160f50f50d94"} | Range Vector | Matrix | Query data using the combination of resource type and unique ID, without a metric name
Multiple label values selection using regex | system_cpu_utilization{instance=~"CPU"} | Range Vector | Matrix | Query data for multiple instance values selected using a regex
Multiple metric name selection using regex | {__name__=~"system_cpu_utilization|system_ping_pl"} | Range Vector | Matrix | Query data for multiple metric names
Based on text match regex | {__name__=~".*ping_.*"} | Range Vector | Matrix | Query data by regex matching
Count by instance based on metric name | count by (instance) (system_cpu_utilization) | Range Vector | Matrix | Query data counted per instance
Predict data based on the last samples | predict_linear(demo_disk_usage_bytes[4h], 3600) | | | Predict the value 1 hour ahead, based on the last 4 hours
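
The predict_linear() row relies on simple linear regression over the samples in the range. The sketch below shows the idea with an ordinary least-squares fit. It is not the engine's implementation, and the function signature is a simplification chosen for illustration.

```python
def predict_linear(samples, eval_time, seconds_ahead):
    """Fit value = intercept + slope * (ts - eval_time) and extrapolate.

    samples is a list of (timestamp, value) pairs taken from the range,
    for example the last 4 hours. Returns None if a line cannot be fitted.
    """
    n = len(samples)
    if n < 2:
        return None
    xs = [ts - eval_time for ts, _ in samples]
    ys = [value for _, value in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    if variance == 0:
        return None  # all samples share one timestamp
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept + slope * seconds_ahead


# Disk usage growing by one byte per second over 4 hours (made-up data).
usage = [(0, 0.0), (3600, 3600.0), (7200, 7200.0), (14400, 14400.0)]
print(predict_linear(usage, eval_time=14400, seconds_ahead=3600))
```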

Label Matchers:

= : Equality
!= : Non-equality
=~ : Regex match
!~ : Negative regex match
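
The four matchers can be emulated in a few lines. This sketch assumes the engine follows Prometheus, where a regex must match the entire label value (it is fully anchored). Python's re module is used as a stand-in and differs from Prometheus's RE2 syntax in some details.

```python
import re


def matches(label_value, op, pattern):
    """Evaluate one label matcher against one label value."""
    if op == "=":
        return label_value == pattern
    if op == "!=":
        return label_value != pattern
    if op == "=~":
        return re.fullmatch(pattern, label_value) is not None
    if op == "!~":
        return re.fullmatch(pattern, label_value) is None
    raise ValueError(f"unknown matcher: {op}")


print(matches("system_ping_pl", "=~", ".*ping_.*"))
print(matches("CPU0", "=~", "CPU"))
print(matches("CPU0", "=~", "CPU.*"))
```

Under the anchoring assumption, a pattern such as "CPU" selects only the value CPU itself; to select every value that starts with CPU, the pattern would need to be "CPU.*".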

Aggregation Operators:

Aggregation Operator | PromQL Query | For Label Data Along with Time Series | Description
avg | avg(system_cpu_utilization) | avg by (instance) (system_cpu_utilization) | Fetch time series data using avg()
bottomk | bottomk(5, system_cpu_utilization) | | Fetch the 5 lowest-valued time series
count | count(system_cpu_utilization) | | Fetch the count of time series
max | max(system_cpu_utilization) | max by (instance) (system_cpu_utilization) | Fetch time series data using max()
min | min(system_cpu_utilization) | min by (instance) (system_cpu_utilization) | Fetch time series data using min()
sum | sum(system_cpu_utilization) | sum by (instance) (system_cpu_utilization); sum without (instance) (system_cpu_utilization) | Fetch time series data using the sum() aggregation
topk | topk(5, system_cpu_utilization) | | Fetch the 5 highest-valued time series
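
As a rough illustration of what the operators in the table compute at a single point in time, the sketch below groups series by one label and applies an aggregate, and also selects the top or bottom k series. The sample data and helper names are made up for the example.

```python
from collections import defaultdict


def aggregate_by(series, label, fn):
    """Group (labels, value) pairs by one label and aggregate each group."""
    groups = defaultdict(list)
    for labels, value in series:
        groups[labels.get(label, "")].append(value)
    return {key: fn(values) for key, values in groups.items()}


def topk(series, k):
    """Return the k series with the largest values."""
    return sorted(series, key=lambda s: s[1], reverse=True)[:k]


def bottomk(series, k):
    """Return the k series with the smallest values."""
    return sorted(series, key=lambda s: s[1])[:k]


def average(values):
    return sum(values) / len(values)


cpu = [
    ({"instance": "cpu0", "uuid": "a"}, 10.0),
    ({"instance": "cpu0", "uuid": "b"}, 30.0),
    ({"instance": "cpu1", "uuid": "a"}, 50.0),
]
print(aggregate_by(cpu, "instance", average))  # like avg by (instance) (...)
print(aggregate_by(cpu, "instance", sum))      # like sum by (instance) (...)
print(topk(cpu, 1))                            # like topk(1, ...)
```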

Instant Queries:

Agg_over_time Function | PromQL Query | Type | Description
avg_over_time | avg_over_time(system_cpu_utilization[1h])[1h:1h] | Instant Vector | Fetch time series data over a period of 1h with a resolution of 1h
min_over_time | min_over_time(system_cpu_utilization{type="RESOURCE"}[1h])[1h:1h] | Instant Vector | Fetch time series data over a period of 1h with a resolution of 1h
max_over_time | max_over_time(system_cpu_utilization{type="RESOURCE"}[1h])[1h:1h] | Instant Vector | Fetch time series data over a period of 1h with a resolution of 1h
sum_over_time | sum_over_time(system_cpu_utilization{type="RESOURCE"}[1h])[1h:1h] | Instant Vector | Fetch time series data over a period of 1h with a resolution of 1h
count_over_time | count_over_time(system_cpu_utilization{type="RESOURCE"}[1h])[1h:1h] | Instant Vector | Fetch time series data over a period of 1h with a resolution of 1h
last_over_time | last_over_time(system_cpu_utilization{type="RESOURCE"}[1h])[1h:1h] | Instant Vector | Fetch time series data over a period of 1h with a resolution of 1h
topk | topk(5, avg_over_time(system_cpu_utilization{type="DEVICE"}[1h]))[1h:1h] | Instant Vector | Apply an aggregation operator to an agg_over_time function
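
The *_over_time functions reduce the samples inside a time window to a single value per series. The sketch below shows that reduction for one series. It treats the window as open on the left and closed on the right, which is an assumption about boundary handling, and it does not model the [1h:1h] subquery step.

```python
def over_time(samples, eval_time, window_seconds, fn):
    """Apply fn to the values whose timestamps fall inside the window."""
    start = eval_time - window_seconds
    window = [value for ts, value in sorted(samples) if start < ts <= eval_time]
    return fn(window) if window else None


FUNCTIONS = {
    "avg_over_time": lambda w: sum(w) / len(w),
    "min_over_time": min,
    "max_over_time": max,
    "sum_over_time": sum,
    "count_over_time": len,
    "last_over_time": lambda w: w[-1],
}

# One series sampled every 30 minutes (made-up data).
samples = [(0, 1.0), (1800, 2.0), (3600, 3.0), (5400, 5.0)]
for name, fn in FUNCTIONS.items():
    print(name, over_time(samples, eval_time=5400, window_seconds=3600, fn=fn))
```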

PromQL Operators

PromQL supports basic logical and arithmetic operators.

Arithmetic binary operators

Prometheus supports the following binary arithmetic operators:

  • + (add)
  • - (subtract)
  • * (multiply)
  • / (divide)
  • % (modulo)
  • ^ (power/exponentiation)
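
Note that in PromQL the % operator is modulo and ^ is exponentiation. A minimal sketch of the six operators on plain floats follows. It uses math.fmod for %, whose result takes the sign of the left operand, and it does not handle zero divisors, which PromQL resolves to Inf or NaN rather than raising an error.

```python
import math

ARITHMETIC = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "%": math.fmod,  # modulo; result takes the sign of the left operand
    "^": math.pow,   # exponentiation
}


def apply(op, a, b):
    """Apply one binary arithmetic operator to two scalar values."""
    return ARITHMETIC[op](float(a), float(b))


print(apply("%", 7, 4))
print(apply("^", 2, 10))
```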

Comparison Binary Operators

Prometheus supports the following binary comparison operators:

  • == (equal to)
  • != (not equal)
  • > (greater than)
  • < (less than)
  • >= (greater than or equal to)
  • <= (less than or equal to)
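
In standard PromQL, comparing an instant vector with a scalar acts as a filter: series that fail the comparison are dropped. With the bool modifier, every series is kept and its value becomes 1 or 0. The sketch below illustrates both modes; whether the engine supports the bool modifier is an assumption here, and the data is made up.

```python
import operator

COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def compare(vector, op, scalar, as_bool=False):
    """Compare each series value with a scalar, filtering unless as_bool is set."""
    cmp = COMPARATORS[op]
    if as_bool:
        return {k: 1.0 if cmp(v, scalar) else 0.0 for k, v in vector.items()}
    return {k: v for k, v in vector.items() if cmp(v, scalar)}


cpu_now = {"cpu0": 35.0, "cpu1": 92.5}
print(compare(cpu_now, ">", 80))                # like system_cpu_utilization > 80
print(compare(cpu_now, ">", 80, as_bool=True))  # like system_cpu_utilization > bool 80
```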

Aggregation Operators

Prometheus supports the following aggregation operators:

  • sum
  • avg
  • min
  • max
  • group
  • count
  • count_values
  • topk (select the k largest elements by sample value)
  • bottomk (like topk but for lowest values)
  • quantile (calculate a quantile over dimensions)
  • stddev (standard deviation over dimensions)
  • stdvar (standard variance over dimensions)

PromQL Functions

Prometheus supports the following functions:

  • abs()
  • absent()
  • absent_over_time()
  • ceil()
  • changes()
  • clamp_max()
  • clamp_min()
  • day_of_month()
  • day_of_week()
  • days_in_month()
  • delta()
  • deriv()
  • exp()
  • floor()
  • histogram_quantile()
  • holt_winters()
  • hour()
  • idelta()
  • increase()
  • irate()
  • label_join()
  • label_replace()
  • ln()
  • log2()
  • log10()
  • minute()
  • month()
  • predict_linear()
  • rate()
  • resets()
  • round()
  • scalar()
  • sgn()
  • sort()
  • sort_desc()
  • sqrt()
  • time()
  • timestamp()
  • vector()
  • year()
  • avg_over_time()
  • min_over_time()
  • max_over_time()
  • sum_over_time()
  • count_over_time()
  • quantile_over_time()
  • stddev_over_time()
  • stdvar_over_time()
  • last_over_time()
  • present_over_time()

How to use PromQL queries

PromQL provides the flexibility to query using metrics, functions, operators and labels and get the desired results in the form of graphs.

Query syntax: The syntax can vary based on the requirement.

PROMQL query box:
Use the autocomplete feature to show all available matching functions, labels (tags), and metrics. As soon as you start typing a function name or metric name in the box, the matching functions and metrics are displayed automatically. Hover over a function to display a tooltip with its definition and usage.

LEGEND box: Type double curly brackets “{{” to see the available values. These values are displayed as the legend.

Example 1: The following query returns time series data with a given metric name:

Metric name: “system_memory_usage_physical”. This metric shows the memory currently in use.

Type the metric name in the PROMQL query box.
Type a legend in the LEGEND box.

Time series data is represented graphically.

Example 2: You can use an “aggregation operator” and a “metric” to fetch time series data:

Type the aggregation operator followed by the metric name enclosed in round brackets in the PROMQL query box.
Type a legend in the LEGEND box.

For example, with the max operator, the query returns the maximum value of the time series data in the given range.

Example 3: The absent_over_time() function returns an empty vector if the range vector passed to it has any elements, and a one-element vector with the value 1 if the range vector has no elements. This is useful for alerting when no time series exist for a given metric name and label combination for a certain amount of time.

Enter the function name followed by the metric name, label(s), and the time period.

Because the resource has no data for the specified period, the query returns the value 1.
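
The behavior described in Example 3 matches the absent_over_time() function from the list above. A minimal sketch of its logic follows. It ignores the labels that the real function attaches to its result, and it represents a range vector as a plain dictionary of sample lists.

```python
def absent_over_time(range_vector):
    """Return [] if any series has samples in the range, else [1.0]."""
    has_samples = any(len(samples) > 0 for samples in range_vector.values())
    return [] if has_samples else [1.0]


print(absent_over_time({}))                     # no series matched in the period
print(absent_over_time({"cpu0": [(0, 42.0)]}))  # data exists in the period
```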

For detailed information on querying, see Querying Prometheus in the Prometheus documentation.