
Google Dataproc Cluster


Introduction

Cloud Dataproc is a managed Apache Spark and Apache Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning.

Cloud Dataproc automation helps you create clusters quickly, manage them easily, and save money by turning clusters off when you don’t need them. With less time and money spent on administration, you can focus on your jobs and your data.

Setup

To set up the OpsRamp Google integration and discover the Google service, go to Google Integration Discovery Profile and select GOOGLE/Dataproc Cluster.

Metrics

OpsRamp Metric | Metric Display Name | Unit | Aggregation Type | Description
google_dataproc_cluster_hdfs_datanodes | Cluster Hdfs Datanodes | Count | Average | Indicates the number of HDFS DataNodes that are running inside a cluster.
google_dataproc_cluster_hdfs_storage_capacity | Cluster Hdfs Storage Capacity | Count | Average | Indicates the capacity of the HDFS system running on the cluster, in GB.
google_dataproc_cluster_hdfs_storage_utilization | Cluster Hdfs Storage Utilization | Count | Average | The percentage of HDFS storage currently used.
google_dataproc_cluster_hdfs_unhealthy_blocks | Cluster Hdfs Unhealthy Blocks | Count | Average | Indicates the number of unhealthy blocks inside the cluster.
google_dataproc_cluster_job_completion_time | Cluster Job Completion Time | Count | Average | The time jobs took to complete, measured from when the user submits a job to when Dataproc reports it as completed.
google_dataproc_cluster_job_duration | Cluster Job Duration | Count | Average | The time jobs have spent in a given state.
google_dataproc_cluster_job_failed_count | Cluster Job Failed Count | Count | Average | Indicates the number of jobs that have failed on a cluster.
google_dataproc_cluster_job_running_count | Cluster Job Running Count | Count | Average | Indicates the number of jobs that are running on a cluster.
google_dataproc_cluster_job_submitted_count | Cluster Job Submitted Count | Count | Average | Indicates the number of jobs that have been submitted to a cluster.
google_dataproc_cluster_operation_completion_time | Cluster Operation Completion Time | Count | Average | The time operations took to complete, measured from when the user submits an operation to when Dataproc reports it as completed.
google_dataproc_cluster_operation_duration | Cluster Operation Duration | Count | Average | The time operations have spent in a given state.
google_dataproc_cluster_operation_failed_count | Cluster Operation Failed Count | Count | Average | Indicates the number of operations that have failed on a cluster.
google_dataproc_cluster_operation_running_count | Cluster Operation Running Count | Count | Average | Indicates the number of operations that are running on a cluster.
google_dataproc_cluster_operation_submitted_count | Cluster Operation Submitted Count | Count | Average | Indicates the number of operations that have been submitted to a cluster.
google_dataproc_cluster_yarn_allocated_memory_percentage | Cluster Yarn Allocated Memory Percentage | Count | Average | The percentage of YARN memory that is allocated.
google_dataproc_cluster_yarn_apps | Cluster Yarn Apps | Count | Average | Indicates the number of active YARN applications.
google_dataproc_cluster_yarn_containers | Cluster Yarn Containers | Count | Average | Indicates the number of YARN containers.
google_dataproc_cluster_yarn_memory_size | Cluster Yarn Memory Size | Count | Average | Indicates the YARN memory size, in GB.
google_dataproc_cluster_yarn_nodemanagers | Cluster Yarn Nodemanagers | Count | Average | Indicates the number of YARN NodeManagers running inside the cluster.
google_dataproc_cluster_yarn_pending_memory_size | Cluster Yarn Pending Memory Size | Count | Average | The current memory request, in GB, that is pending to be fulfilled by the scheduler.
google_dataproc_cluster_yarn_virtual_cores | Cluster Yarn Virtual Cores | Count | Average | Indicates the number of virtual cores in YARN.
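
As an illustration of how the metrics listed above might be combined into simple health checks, here is a minimal Python sketch. It is not part of OpsRamp or Dataproc. The function name evaluate_cluster, the threshold values, and the idea of passing the latest metric values as a plain dictionary are all assumptions made for this example. It also assumes that the percentage metrics are reported on a 0 to 100 scale, which you should verify against your own data before relying on it.

```python
from typing import Dict, List

# Hypothetical thresholds chosen for illustration only; tune them for your workloads.
HDFS_UTILIZATION_MAX = 80.0        # percent, assumes a 0-100 scale
YARN_MEMORY_ALLOCATED_MAX = 90.0   # percent, assumes a 0-100 scale


def evaluate_cluster(sample: Dict[str, float]) -> List[str]:
    """Return warnings for one set of metric values keyed by OpsRamp metric name.

    Metrics missing from the sample are treated as 0 and raise no warning.
    """
    warnings: List[str] = []
    if sample.get("google_dataproc_cluster_hdfs_storage_utilization", 0.0) > HDFS_UTILIZATION_MAX:
        warnings.append("HDFS storage utilization is high")
    if sample.get("google_dataproc_cluster_hdfs_unhealthy_blocks", 0.0) > 0:
        warnings.append("HDFS reports unhealthy blocks")
    if sample.get("google_dataproc_cluster_yarn_allocated_memory_percentage", 0.0) > YARN_MEMORY_ALLOCATED_MAX:
        warnings.append("YARN memory is nearly fully allocated")
    if sample.get("google_dataproc_cluster_job_failed_count", 0.0) > 0:
        warnings.append("one or more jobs failed")
    return warnings


if __name__ == "__main__":
    # Made-up sample values, not real measurements.
    latest = {
        "google_dataproc_cluster_hdfs_storage_utilization": 91.5,
        "google_dataproc_cluster_hdfs_unhealthy_blocks": 0,
        "google_dataproc_cluster_yarn_allocated_memory_percentage": 40.0,
        "google_dataproc_cluster_job_failed_count": 2,
    }
    for warning in evaluate_cluster(latest):
        print(warning)
```

In practice, checks like these would normally be configured as alert thresholds in the monitoring platform itself. The sketch only makes the relationship between the metrics and typical health questions concrete.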

Event support

  • Not supported

External reference