Introduction

Apache Spark is an open-source, distributed computing system designed for fast, large-scale data processing. It provides a unified platform for processing and analyzing big data across a variety of computing models, including batch processing, real-time streaming, machine learning, and graph processing.

Kubernetes 2.0 ConfigMap

The agent discovers the Spark workload through a ConfigMap such as the one below, which defines the metrics endpoint to scrape, the collection frequency, the authentication mode, and the pod selector used to locate Spark pods:

apiVersion: v1
kind: ConfigMap
metadata:
  name: workload-master
  namespace: opsramp-agent
data:
  workloads: |
    apachespark:
      - name: spark
        collectionFrequency: 2m          # how often the agent collects metrics
        auth: none                       # the Spark UI endpoint requires no authentication
        endpoint: http://localhost:4040  # default port of the Spark application UI
        targetPodSelector:
          matchLabels:                   # target pods labeled app=spark
            - key: app
              operator: ==
              value:
                - spark
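
Assuming the manifest above is saved locally (the filename spark-workload-configmap.yaml below is only an illustrative choice), it can be applied and verified with standard kubectl commands:

kubectl apply -f spark-workload-configmap.yaml
kubectl get configmap workload-master -n opsramp-agent -o yaml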

Supported Metrics

The following metrics are collected for this workload by the Kubernetes 2.0 Agent.

Metric Name | Description
spark.applications.active | The number of active Spark applications in the cluster.
spark.executor.heartbeat.duration | The time between heartbeats sent by the executor to the driver.
spark.executor.queued | The number of tasks waiting to be executed in the executor queue.
spark.executor.completed | The total number of tasks completed by the executor.
spark.executor.failed | The total number of tasks that failed in the executor.
spark.executor.taskTime | Total time taken to run tasks in the executor (in milliseconds).
spark.executor.memoryMetrics.storageMemoryUsed | The amount of memory used by storage in the executor (in bytes).
spark.executor.memoryMetrics.offHeapMemoryUsed | The amount of off-heap memory used by the executor (in bytes).
spark.executor.diskIO.read | Total amount of data read from disk by the executor (in bytes).
spark.executor.diskIO.write | Total amount of data written to disk by the executor (in bytes).
spark.task.time | Total time taken to execute a task across all stages (in milliseconds).
spark.task.failures | The total number of task failures in a Spark application.
spark.task.shuffle.read | The total amount of shuffle data read by tasks (in bytes).
spark.task.shuffle.write | The total amount of shuffle data written by tasks (in bytes).
spark.streaming.batchTime | The duration of each batch interval in a Spark Streaming application (in milliseconds).
spark.streaming.input.rate | The rate at which input data is received by a Spark Streaming application (in records per second).
spark.streaming.output.rate | The rate at which data is written out by a Spark Streaming application (in records per second).
spark.streaming.received.records | The total number of records received by a Spark Streaming application.
spark.streaming.processed.records | The total number of records processed by a Spark Streaming application.
spark.sql.cache.bytes | The amount of memory used by Spark SQL’s query cache (in bytes).
spark.sql.parquet.cache.bytes | The amount of memory used by cached Parquet data in Spark SQL (in bytes).
spark.sql.execution.time | The total time taken by Spark SQL queries (in milliseconds).
spark.sql.shuffle.sort.time | Time spent by Spark SQL in sorting shuffle data (in milliseconds).
spark.sql.shuffle.read.bytes | The total amount of shuffle data read by Spark SQL (in bytes).
spark.sql.shuffle.write.bytes | The total amount of shuffle data written by Spark SQL (in bytes).
spark.sql.jdbc.connections | The number of active JDBC connections in Spark SQL.
spark.sql.execution.caching | The number of times data is cached in Spark SQL queries.
spark.stage.completed | The total number of completed stages in a Spark job.
spark.stage.active | The number of stages currently running in a Spark job.
spark.stage.failed | The total number of failed stages in a Spark job.
spark.stage.taskTime | The total time taken for task execution in each stage (in milliseconds).
spark.executor.peakMemoryUsed | The peak memory usage by the executor (in bytes).
spark.executor.taskFailures | The total number of task failures in the executor.
spark.executor.pending | The number of tasks pending execution in the executor.
spark.executor.running | The number of tasks currently running in the executor.
spark.driver.memory | The memory allocated to the Spark driver (in bytes).
spark.driver.cores | The number of cores allocated to the Spark driver.
spark.application.startTime | The start time of a Spark application (in milliseconds).
spark.application.duration | The total duration of the Spark application (in milliseconds).
spark.application.endTime | The end time of a Spark application (in milliseconds).
spark.jobs.completed | The total number of completed jobs in the Spark application.
spark.jobs.failed | The total number of failed jobs in the Spark application.
spark.jobs.running | The total number of running jobs in the Spark application.
spark.jobs.pending | The total number of pending jobs in the Spark application.
spark.jobs.succeeded | The total number of successful jobs in the Spark application.
spark.rdd.cache.memoryUsed | The total amount of memory used by cached RDDs (in bytes).
spark.rdd.diskUsed | The total amount of disk space used by cached RDDs (in bytes).
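
The metric names above are agent-specific, but the underlying values come from the Spark driver, which serves its monitoring REST API on the same port as the application UI configured in the endpoint field. As a quick sanity check that the driver is reachable, the API can be queried directly from a pod that can reach it; <app-id> below is a placeholder for an application ID returned by the first call:

curl http://localhost:4040/api/v1/applications
curl http://localhost:4040/api/v1/applications/<app-id>/executors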