Introduction

Apache Spark is an open-source, distributed computing system designed for fast, large-scale data processing. It provides a unified platform for processing and analyzing big data across a variety of computing models, including batch processing, real-time streaming, machine learning, and graph processing.

Kubernetes 2.0 ConfigMap

The agent discovers the Spark workload through a ConfigMap such as the one below, which defines the metrics endpoint to scrape, the collection frequency, the authentication mode, and the pod selector used to locate Spark pods:

apiVersion: v1
kind: ConfigMap
metadata:
  name: workload-master
  namespace: opsramp-agent
data:
  workloads: |
    apachespark:
      - name: spark
        collectionFrequency: 2m          # how often the agent collects metrics
        auth: none                       # the Spark UI endpoint requires no authentication
        endpoint: http://localhost:4040  # default port of the Spark application UI
        targetPodSelector:
          matchLabels:                   # target pods labeled app=spark
            - key: app
              operator: ==
              value:
                - spark
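
Assuming the manifest above is saved locally (the filename spark-workload-configmap.yaml below is only an illustrative choice), it can be applied and verified with standard kubectl commands:

kubectl apply -f spark-workload-configmap.yaml
kubectl get configmap workload-master -n opsramp-agent -o yaml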

Supported Metrics

The following metrics are collected for this workload by the Kubernetes 2.0 Agent.

Metric Name | Description
spark.applications.active | The number of active Spark applications in the cluster.
spark.executor.heartbeat.duration | The time between heartbeats sent by the executor to the driver.
spark.executor.queued | The number of tasks waiting to be executed in the executor queue.
spark.executor.completed | The total number of tasks completed by the executor.
spark.executor.failed | The total number of tasks that failed in the executor.
spark.executor.taskTime | Total time taken to run tasks in the executor (in milliseconds).
spark.executor.memoryMetrics.storageMemoryUsed | The amount of memory used by storage in the executor (in bytes).
spark.executor.memoryMetrics.offHeapMemoryUsed | The amount of off-heap memory used by the executor (in bytes).
spark.executor.diskIO.read | Total amount of data read from disk by the executor (in bytes).
spark.executor.diskIO.write | Total amount of data written to disk by the executor (in bytes).
spark.task.time | Total time taken to execute a task across all stages (in milliseconds).
spark.task.failures | The total number of task failures in a Spark application.
spark.task.shuffle.read | The total amount of shuffle data read by tasks (in bytes).
spark.task.shuffle.write | The total amount of shuffle data written by tasks (in bytes).
spark.streaming.batchTime | The duration of each batch interval in a Spark Streaming application (in milliseconds).
spark.streaming.input.rate | The rate at which input data is received by a Spark Streaming application (in records per second).
spark.streaming.output.rate | The rate at which data is written out by a Spark Streaming application (in records per second).
spark.streaming.received.records | The total number of records received by a Spark Streaming application.
spark.streaming.processed.records | The total number of records processed by a Spark Streaming application.
spark.sql.cache.bytes | The amount of memory used by Spark SQL’s query cache (in bytes).
spark.sql.parquet.cache.bytes | The amount of memory used by cached Parquet data in Spark SQL (in bytes).
spark.sql.execution.time | The total time taken by Spark SQL queries (in milliseconds).
spark.sql.shuffle.sort.time | Time spent by Spark SQL in sorting shuffle data (in milliseconds).
spark.sql.shuffle.read.bytes | The total amount of shuffle data read by Spark SQL (in bytes).
spark.sql.shuffle.write.bytes | The total amount of shuffle data written by Spark SQL (in bytes).
spark.sql.jdbc.connections | The number of active JDBC connections in Spark SQL.
spark.sql.execution.caching | The number of times data is cached in Spark SQL queries.
spark.stage.completed | The total number of completed stages in a Spark job.
spark.stage.active | The number of stages currently running in a Spark job.
spark.stage.failed | The total number of failed stages in a Spark job.
spark.stage.taskTime | The total time taken for task execution in each stage (in milliseconds).
spark.executor.peakMemoryUsed | The peak memory usage by the executor (in bytes).
spark.executor.taskFailures | The total number of task failures in the executor.
spark.executor.pending | The number of tasks pending execution in the executor.
spark.executor.running | The number of tasks currently running in the executor.
spark.driver.memory | The memory allocated to the Spark driver (in bytes).
spark.driver.cores | The number of cores allocated to the Spark driver.
spark.application.startTime | The start time of a Spark application (in milliseconds).
spark.application.duration | The total duration of the Spark application (in milliseconds).
spark.application.endTime | The end time of a Spark application (in milliseconds).
spark.jobs.completed | The total number of completed jobs in the Spark application.
spark.jobs.failed | The total number of failed jobs in the Spark application.
spark.jobs.running | The total number of running jobs in the Spark application.
spark.jobs.pending | The total number of pending jobs in the Spark application.
spark.jobs.succeeded | The total number of successful jobs in the Spark application.
spark.rdd.cache.memoryUsed | The total amount of memory used by cached RDDs (in bytes).
spark.rdd.diskUsed | The total amount of disk space used by cached RDDs (in bytes).
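
The metric names above are agent-specific, but the underlying values come from the Spark driver, which serves its monitoring REST API on the same port as the application UI configured in the endpoint field. As a quick sanity check that the driver is reachable, the API can be queried directly from a pod that can reach it; <app-id> below is a placeholder for an application ID returned by the first call:

curl http://localhost:4040/api/v1/applications
curl http://localhost:4040/api/v1/applications/<app-id>/executors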