Spark on Kubernetes - Best Practice and Performance

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

This 39-minute conference talk by Junjie Chen and Jerry Shao of Tencent covers best practices and performance optimization for running Apache Spark on Kubernetes. The speakers describe deploying Spark as a public cloud service on Kubernetes, including authorization, logging, and multi-tenancy management. They present tuning strategies for maximizing resource utilization, with configuration adjustments for both Kubernetes and Spark, explain how high availability is achieved through ZooKeeper integration, and use TPC-DS workload benchmarks to show the performance impact of various configurations. The talk also walks through the architecture, applications, storage services, and environments involved in Spark on Kubernetes deployments, and closes with practical advice from the speakers' experience running big data services on containerized platforms.

Syllabus

Introduction
What is Spark
Why do we need Kubernetes
Architecture
Spark Application
Spark on Kubernetes Status
Applications
Storage
Service
Structure
HDFS
Catalog
High Availability
Environments
Benchmark Configuration
Benchmark Results
Data Locality
Our Experience
Summary

Taught by

CNCF [Cloud Native Computing Foundation]
