Classroom Contents
Downscaling Apache Spark Clusters - Challenges and Solutions
- 1 Intro
- 2 Autoscaling on cloud
- 3 Upscale easy, downscale difficult
- 4 How are nodes used?
- 5 Factors affecting node downscaling
- 6 Terminology: entities a cluster generally comprises (e.g. Resource Manager)
- 7 Current resource allocation strategy
- 8 Example revisited with new allocation strategy
- 9 Downscale issues with Min Executors
- 10 Min executors distribution without packing
- 11 Min executors distribution with packing
- 12 How Shuffle data is produced / consumed?
- 13 External Shuffle Service
- 14 ESS at Qubole
- 15 Recap
- 16 Shuffle Cleanup: shuffle data is deleted by the ESS at the end of the application
- 17 Issues with long running applications
- 18 Shuffle reuse in Spark
- 19 Downscaling a Node
- 20 Spark - Disaggregation of Compute and Storage: mount an NFS endpoint on all nodes of the cluster, then change Spark's shuffle manager to one that can read and write shuffle data from the NFS mount point
- 21 Summary and Future Work
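
Several chapters above (min executors, External Shuffle Service, shuffle cleanup) map onto settings that exist in open-source Apache Spark. The sketch below collects those property names and renders them as `spark-submit` arguments. The values, the dictionary name `DOWNSCALE_FRIENDLY_CONF`, and the helper `to_submit_args` are illustrative assumptions, not settings taken from the talk, and Qubole-specific behaviour is not covered.

```python
# Illustrative values only; this outline does not state the talk's actual settings.
DOWNSCALE_FRIENDLY_CONF = {
    # Let Spark add and release executors as the workload changes.
    "spark.dynamicAllocation.enabled": "true",
    # Serve shuffle files from the External Shuffle Service so executors can be released.
    "spark.shuffle.service.enabled": "true",
    # Lower bound on executors kept alive (assumed value).
    "spark.dynamicAllocation.minExecutors": "2",
    # Release an executor after it has been idle this long (assumed value).
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def to_submit_args(conf):
    """Render a config dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(DOWNSCALE_FRIENDLY_CONF)))
```

With the External Shuffle Service enabled, idle executors can be removed without losing their shuffle output, but the node hosting that output still cannot be released while the data is needed, which is the problem the later chapters address.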