Tackling Scaling Challenges of Apache Spark at LinkedIn - Infrastructure Optimization and User Productivity
Databricks via YouTube
Syllabus
Intro
Challenges of Scaling Spark
Tackling Scaling Challenges
Typical Spark User Questions
Automatic Failure Root Cause Analysis
Platform Failure Reason Breakdown
Grid Bench - Performance Analysis
Tuning Heuristics & Recommendations
Scaling Spark History Server
A Low-Latency Solution
Issues with Spark Shuffle Service
Next-Gen Spark Shuffle Service
Push-Merge Shuffle
Fetch Merged Shuffle Data
Magnet Shuffle Service Recap
Takeaways
Taught by
Databricks