Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Performance Troubleshooting Using Apache Spark Metrics - Databricks Talk
- 1 Intro
- 2 Data at the Large Hadron Collider
- 3 Analytics Platform @CERN
- 4 Hadoop and Spark Clusters at CERN
- 5 Performance Troubleshooting Goals
- 6 Performance Methodologies and Anti-Patterns (typical benchmark graph)
- 7 Workload and Performance Data
- 8 Measuring Spark
- 9 Spark Instrumentation - Metrics
- 10 How to Gather Spark Task Metrics
- 11 Spark Metrics in REST API
- 12 Task Metrics in the Event Log
- 13 SparkMeasure - Getting Started
- 14 SparkMeasure, Usage Modes
- 15 Instrument Code with SparkMeasure
- 16 Spark Metrics System: Spark is also instrumented using the Dropwizard/Codahale metrics library; multiple sources (data providers)
- 17 Ingredients for a Spark Performance Dashboard
- 18 Assemble Dashboard Components
- 19 Spark Dashboard - Examples (graph: "number of active tasks" vs. time)
- 20 Dashboard - Memory
- 21 Dashboard - Executor CPU Utilization (graph: "CPU utilization by executors' JVM" vs. time)
- 22 Executor Plugins Extend Metrics: user-defined executor metrics, SPARK-28091, target Spark 3.0.0
- 23 Metrics from OS Monitoring
- 24 Data + Context = Insights