Overview
Syllabus
Intro
Data at the Large Hadron Collider
Analytics Platform @CERN
Hadoop and Spark Clusters at CERN
Performance Troubleshooting Goals
Performance Methodologies and Anti-Patterns (with a typical benchmark graph)
Workload and Performance Data
Measuring Spark
Spark Instrumentation - Metrics
How to Gather Spark Task Metrics
Spark Metrics in REST API
Task Metrics in the Event Log
SparkMeasure - Getting Started
SparkMeasure, Usage Modes
Instrument Code with SparkMeasure
Spark Metrics System: Spark is also instrumented using the Dropwizard/Codahale metrics library, with multiple sources (data providers)
Ingredients for a Spark Performance Dashboard
Assemble Dashboard Components
Spark Dashboard - Examples (graph: "number of active tasks" vs. time)
Dashboard - Memory
Dashboard - Executor CPU Utilization (graph: "CPU utilization by executors' JVM" vs. time)
Executor Plugins Extend Metrics: user-defined executor metrics, SPARK-28091, targeting Spark 3.0.0
Metrics from OS Monitoring
Data + Context = Insights
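
As a rough illustration of the "Spark Metrics in REST API" item above, the sketch below aggregates a few stage-level metrics of the kind Spark's monitoring REST API returns from its `/api/v1/applications/[app-id]/stages` endpoint. It is not taken from the talk. The function name `summarize_stage_metrics` and the sample records are hypothetical. The field names (`status`, `numTasks`, `executorRunTime` in milliseconds, `executorCpuTime` in nanoseconds) are assumed to match the JSON of your Spark version and should be checked against its monitoring documentation.

```python
from typing import Any, Dict, Iterable


def summarize_stage_metrics(stages: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Sum run time and CPU time over completed stages.

    Assumes each record looks like one element of the JSON list returned by
    the stages endpoint of the Spark REST API (field names are an assumption).
    """
    total_tasks = 0
    total_run_ms = 0
    total_cpu_ns = 0
    for stage in stages:
        # Skip stages that are still running, failed, or pending.
        if stage.get("status") != "COMPLETE":
            continue
        total_tasks += stage.get("numTasks", 0)
        total_run_ms += stage.get("executorRunTime", 0)
        total_cpu_ns += stage.get("executorCpuTime", 0)

    # Convert CPU time from nanoseconds to milliseconds for comparison.
    total_cpu_ms = total_cpu_ns / 1e6
    # Fraction of executor run time spent on CPU; a low value hints at
    # time spent elsewhere (I/O, GC, shuffle waits).
    cpu_fraction = total_cpu_ms / total_run_ms if total_run_ms else 0.0
    return {
        "tasks": total_tasks,
        "executor_run_ms": total_run_ms,
        "executor_cpu_ms": total_cpu_ms,
        "cpu_fraction": cpu_fraction,
    }


# Hypothetical sample records standing in for a real REST API response.
sample_stages = [
    {"stageId": 0, "status": "COMPLETE", "numTasks": 8,
     "executorRunTime": 4000, "executorCpuTime": 3_000_000_000},
    {"stageId": 1, "status": "COMPLETE", "numTasks": 4,
     "executorRunTime": 1000, "executorCpuTime": 500_000_000},
    {"stageId": 2, "status": "ACTIVE", "numTasks": 16,
     "executorRunTime": 250, "executorCpuTime": 100_000_000},
]

summary = summarize_stage_metrics(sample_stages)
print(summary)
```

In practice the list of records would come from an HTTP GET against a running application's web UI or the history server. The aggregation is kept separate from the fetch so it can be checked on saved responses.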
Taught by
Databricks