How to Make Apache Spark on Kubernetes Run Reliably on Spot Instances

Databricks via YouTube

Overview

This 33-minute Databricks conference talk explains how to run Apache Spark on Kubernetes reliably on spot instances, which can cut compute costs by up to 90%. It offers concrete guidelines and code examples: run Spark executors on spot nodes, mix instance types and sizes to reduce the risk of interruption, and rely on cluster autoscaling. It covers graceful executor decommissioning (available since Spark 3.1), which preserves shuffle files when an executor shuts down, and executor PVC reuse on restart (available since Spark 3.2), which helps disaggregate compute from shuffle storage. The talk also traces the evolution of Spark on Kubernetes, including its architecture, its benefits, and how it compares to Spark on YARN. It presents experiments that measure the impact of spot interruptions and of graceful executor decommissioning, and it closes with what is new in Spark 3.3 for Spark on Kubernetes.
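
As a rough illustration of the two features named above, the sketch below collects the Spark properties that, to the best of our knowledge, enable graceful decommissioning and executor PVC reuse, and renders them as `spark-submit` flags. It is not taken from the talk. The helper names (`decommission_conf`, `pvc_reuse_conf`, `to_submit_args`) and the values for storage class, volume size, and mount path are hypothetical and must be adapted to your cluster.

```python
from typing import Dict, List


def decommission_conf() -> Dict[str, str]:
    """Properties for graceful executor decommissioning (Spark 3.1+)."""
    return {
        "spark.decommission.enabled": "true",
        "spark.storage.decommission.enabled": "true",
        # Migrate shuffle and cached RDD blocks off an executor before it exits.
        "spark.storage.decommission.shuffleBlocks.enabled": "true",
        "spark.storage.decommission.rddBlocks.enabled": "true",
    }


def pvc_reuse_conf(
    volume: str = "spark-local-dir-1",   # "spark-local-dir-" prefix marks it as Spark local storage
    storage_class: str = "standard",     # assumption: a storage class that exists in your cluster
    size: str = "100Gi",                 # assumption: illustrative size
    mount_path: str = "/data",           # assumption: illustrative mount path
) -> Dict[str, str]:
    """Properties for driver-owned, reusable executor PVCs (Spark 3.2+)."""
    prefix = f"spark.kubernetes.executor.volumes.persistentVolumeClaim.{volume}"
    return {
        "spark.kubernetes.driver.ownPersistentVolumeClaim": "true",
        "spark.kubernetes.driver.reusePersistentVolumeClaim": "true",
        # "OnDemand" asks Spark to create one claim per executor.
        f"{prefix}.options.claimName": "OnDemand",
        f"{prefix}.options.storageClass": storage_class,
        f"{prefix}.options.sizeLimit": size,
        f"{prefix}.mount.path": mount_path,
        f"{prefix}.mount.readOnly": "false",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a property map as a flat list of spark-submit arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    conf = {**decommission_conf(), **pvc_reuse_conf()}
    print(" ".join(to_submit_args(conf)))
```

Keeping the properties in plain dictionaries makes it easy to enable each feature separately and to check which flags a job was submitted with.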

Syllabus

Intro
Kubernetes is a new cluster manager for Spark
The Spark on Kubernetes Journey
Spark on YARN: architecture & pain points
Spark on Kubernetes: architecture & benefits
Our background - Ocean for Apache Spark
Spot instances
How does Spark cope with spot interruptions?
Best practice: run the driver on-demand (OD), executors on spot
This is what your cluster may look like
Limitation: avoid cross-AZ data transfer
We ran an experiment to measure the impact
Experiment results
Since Spark 3.1: Graceful Exec Decommissioning
Spark 3.1 - Graceful Exec Decommissioning
Graceful Exec Decommissioning - Experiment
Since Spark 3.2: Executor PVC Reuse
What's new in Spark 3.3 for Spark-on-k8s
DATA+AI SUMMIT 2022
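
The "driver on-demand, executors on spot" practice listed in the syllabus can be sketched with node selectors. This minimal example assumes Spark 3.3 or later, where separate driver and executor node selector properties are available, and it assumes your nodes carry a label that distinguishes capacity types. The label key `node-lifecycle` and its values are hypothetical, since each cloud provider and cluster names these labels differently.

```python
from typing import Dict


def placement_conf(
    label_key: str = "node-lifecycle",  # hypothetical node label key
    on_demand_value: str = "on-demand",  # hypothetical label value for on-demand nodes
    spot_value: str = "spot",            # hypothetical label value for spot nodes
) -> Dict[str, str]:
    """Pin the driver to on-demand nodes and the executors to spot nodes."""
    return {
        # Losing the driver fails the whole application, so keep it off spot capacity.
        f"spark.kubernetes.driver.node.selector.{label_key}": on_demand_value,
        # Executors can be replaced after an interruption, so they can use spot capacity.
        f"spark.kubernetes.executor.node.selector.{label_key}": spot_value,
    }


if __name__ == "__main__":
    for key, value in placement_conf().items():
        print(f"--conf {key}={value}")
```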

Taught by

Databricks
