Introduction

An Azure Machine Learning (Azure ML) Workspace is a centralized platform within Azure Machine Learning Services that enables data scientists and developers to efficiently manage machine learning (ML) projects. It acts as a collaborative environment for building, training, deploying, and monitoring ML models while ensuring security and scalability.

Use the OpsRamp Azure Public Cloud Integration to discover Machine Learning Services Workspaces and collect metrics for them.

Setup

To set up the Azure integration and discover the Azure Machine Learning Services Workspaces resources, do the following:

  1. Create an Azure Integration if one is not already among your installed integrations. For more information on how to install the Azure Integration, refer to Install Azure Integration.
  2. Create a discovery profile. For more information on how to create a discovery profile, refer to Create Discovery Profile.
  3. Select Machine Learning Services Workspaces under the Filter Criteria in the Edit Discovery Profile page.
  4. Save the discovery profile to make it available in the list of Discovery Profiles.
  5. Run a scan to discover the resources. You can scan at any time, independent of the predefined schedule.
  6. Once the scan is completed, view the Machine Learning Services Workspaces resources under the Infrastructure > Resources > Microsoft Azure category.
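After a scan, it can help to compare the discovered resources with what actually exists in the subscription. The following is a minimal sketch, not part of the OpsRamp integration: it assumes you have exported a resource list from Azure yourself (for example with `az resource list --output json`) and filters it for the `Microsoft.MachineLearningServices/workspaces` resource type. The function name `ml_workspaces` and the sample records are hypothetical.

```python
import json

# Azure Resource Manager type for Azure ML workspaces.
ML_WORKSPACE_TYPE = "Microsoft.MachineLearningServices/workspaces"


def ml_workspaces(resources):
    """Return (name, resource group, location) for each Azure ML workspace."""
    found = []
    for res in resources:
        # Resource type strings are compared case-insensitively.
        if res.get("type", "").lower() == ML_WORKSPACE_TYPE.lower():
            found.append((res["name"], res.get("resourceGroup"), res.get("location")))
    return sorted(found, key=lambda item: item[0])


# Hypothetical sample in the shape printed by `az resource list --output json`.
sample = json.loads("""
[
  {"name": "ml-dev", "type": "Microsoft.MachineLearningServices/workspaces",
   "resourceGroup": "rg-ml", "location": "eastus"},
  {"name": "stor01", "type": "Microsoft.Storage/storageAccounts",
   "resourceGroup": "rg-ml", "location": "eastus"}
]
""")

for name, group, location in ml_workspaces(sample):
    print(f"{name} (resource group: {group}, location: {location})")
```

The names printed here should match the workspaces listed in OpsRamp after the scan. A missing workspace usually points at the filter criteria or the scope of the discovery profile.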

Event support

OpsRamp supports Azure events for Machine Learning Services Workspaces. Configure Azure Events in the OpsRamp Azure integration discovery profile.

See Process Azure Events for more information on how to configure Azure events.

Supported metrics

| OpsRamp Metric | Azure Metric | Metric Display Name | Unit | Description | Aggregation Type |
|---|---|---|---|---|---|
| azure_ml_services_workspaces_Active_Cores | Active Cores | Active Cores | Count | Number of active cores in the Azure ML workspace. | Average |
| azure_ml_services_workspaces_Active_Nodes | Active Nodes | Active Nodes | Count | Number of active nodes in the Azure ML workspace. | Average |
| azure_ml_services_workspaces_Cancel_Requested_Runs | Cancel Requested Runs | Cancel Requested Runs | Count | Number of runs where cancellation was requested. | Total |
| azure_ml_services_workspaces_Cancelled_Runs | Cancelled Runs | Cancelled Runs | Count | Number of runs canceled in the workspace. | Total |
| azure_ml_services_workspaces_Completed_Runs | Completed Runs | Completed Runs | Count | Number of successfully completed runs. | Total |
| azure_ml_services_workspaces_CpuUtilization | CpuUtilization | CPU Utilization | Percent | Percentage of utilization on a CPU node. | Average |
| azure_ml_services_workspaces_Errors | Errors | Errors | Count | Number of run errors in this workspace. | Total |
| azure_ml_services_workspaces_Failed_Runs | Failed Runs | Failed Runs | Count | Number of failed runs. | Total |
| azure_ml_services_workspaces_Finalizing_Runs | Finalizing Runs | Finalizing Runs | Count | Number of runs in the finalizing state. | Total |
| azure_ml_services_workspaces_GpuUtilization | GpuUtilization | GPU Utilization | Percent | Percentage of utilization on a GPU node. | Average |
| azure_ml_services_workspaces_Idle_Cores | Idle Cores | Idle Cores | Count | Number of idle cores. | Average |
| azure_ml_services_workspaces_Idle_Nodes | Idle Nodes | Idle Nodes | Count | Number of idle nodes. | Average |
| azure_ml_services_workspaces_Leaving_Cores | Leaving Cores | Leaving Cores | Count | Indicates the number of cores that are no longer in use. | Average |
| azure_ml_services_workspaces_Model_Deploy_Failed | Model Deploy Failed | Model Deploy Failed | Count | Number of failed model deployments. | Total |
| azure_ml_services_workspaces_Model_Deploy_Started | Model Deploy Started | Model Deploy Started | Count | Number of started model deployments. | Total |
| azure_ml_services_workspaces_Model_Deploy_Succeeded | Model Deploy Succeeded | Model Deploy Succeeded | Count | Number of successful model deployments. | Total |
| azure_ml_services_workspaces_Model_Register_Failed | Model Register Failed | Model Registration Failure | Count | Counts the total instances of model registration failures in this workspace. | Total |
| azure_ml_services_workspaces_Model_Register_Succeeded | Model Register Succeeded | Model Registration Success | Count | Counts the total instances of successful model registrations in this workspace. | Total |
| azure_ml_services_workspaces_Not_Responding_Runs | Not Responding Runs | Unresponsive Runs | Count | Indicates the total number of runs that are unresponsive for this workspace. | Total |
| azure_ml_services_workspaces_Not_Started_Runs | Not Started Runs | Pending Runs | Count | Counts the number of runs that are in a Not Started state for this workspace. | Total |
| azure_ml_services_workspaces_Preempted_Cores | Preempted Cores | Preempted Cores | Count | Indicates the number of cores that were preempted. | Average |
| azure_ml_services_workspaces_Preempted_Nodes | Preempted Nodes | Preempted Nodes | Count | Indicates the number of nodes that were preempted. | Average |
| azure_ml_services_workspaces_Preparing_Runs | Preparing Runs | Preparing Runs | Count | Counts the total number of runs currently in preparation for this workspace. | Total |
| azure_ml_services_workspaces_Provisioning_Runs | Provisioning Runs | Provisioning Runs | Count | Counts the total number of runs that are currently provisioning in this workspace. | Total |
| azure_ml_services_workspaces_Queued_Runs | Queued Runs | Queued Runs | Count | Number of runs in the queue. | Total |
| azure_ml_services_workspaces_Quota_Utilization_Percentage | Quota Utilization Percentage | Quota Utilization Percentage | Percent | Percentage of quota utilized in the workspace. | Average |
| azure_ml_services_workspaces_Started_Runs | Started Runs | Active Runs | Count | Counts the number of runs that are actively running for this workspace. | Total |
| azure_ml_services_workspaces_Starting_Runs | Starting Runs | Starting Runs | Count | Counts the total number of runs that have been initiated for this workspace. | Total |
| azure_ml_services_workspaces_Total_Cores | Total Cores | Total Cores | Count | Total number of cores available. | Average |
| azure_ml_services_workspaces_Total_Nodes | Total Nodes | Total Nodes | Count | Total number of nodes available. | Average |
| azure_ml_services_workspaces_Unusable_Cores | Unusable Cores | Unusable Cores | Count | Number of unusable cores in the workspace. | Average |
| azure_ml_services_workspaces_Unusable_Nodes | Unusable Nodes | Unusable Nodes | Count | Number of unusable nodes in the workspace. | Average |
| azure_ml_services_workspaces_Warnings | Warnings | Warnings | Count | Number of warnings related to runs in this workspace. | Total |
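
Every OpsRamp metric name listed above appears to be the prefix `azure_ml_services_workspaces_` followed by the Azure metric name with spaces replaced by underscores. The sketch below encodes that pattern and illustrates the two aggregation types. The naming rule is inferred from the listed metrics, not from a documented OpsRamp contract. The `aggregate` helper assumes that Total sums the samples in an interval and Average takes their mean. Both function names are hypothetical.

```python
from statistics import mean

# Prefix shared by every OpsRamp metric listed for this resource type.
PREFIX = "azure_ml_services_workspaces_"


def opsramp_metric_name(azure_metric):
    """Build the OpsRamp metric name from an Azure metric name (inferred pattern)."""
    return PREFIX + azure_metric.replace(" ", "_")


def aggregate(samples, aggregation_type):
    """Combine the samples of one interval (assumed semantics, see lead-in)."""
    if aggregation_type == "Total":
        return sum(samples)
    if aggregation_type == "Average":
        return mean(samples)
    raise ValueError(f"unsupported aggregation type: {aggregation_type}")


print(opsramp_metric_name("Cancel Requested Runs"))
print(opsramp_metric_name("CpuUtilization"))

# Run counters use Total; capacity and utilization gauges use Average.
print(aggregate([2, 0, 1], "Total"))
print(aggregate([4, 8, 6], "Average"))
```

The split matters when building alerts: a threshold on a Total metric such as Failed Runs applies to the sum over the interval, while a threshold on an Average metric such as Idle Nodes applies to the mean level during it.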