Linux Foundation

HDFS CSI Plugin: Speeding Up Kubernetes in On-Premises Big Data Clusters

Linux Foundation via YouTube

Overview

Explore the integration of Kubernetes with on-premises big data clusters through this conference talk. Learn about the HDFS CSI Plugin design and architecture, addressing the challenge of consuming HDFS data with Kubernetes. Discover best practices for running Spark workloads on Kubernetes with HDFS access using the CSI plugin. Examine performance comparisons between Spark on Kubernetes with HDFS and Spark on YARN with HDFS using the TPC-DS benchmark suite. Gain insights into big data history, containerization benefits, Kubernetes architecture, CSI core services, volume lifecycle management, and Hadoop HDFS characteristics as persistent volumes. Understand the potential of Kubernetes as an alternative to Hadoop YARN for resource scheduling in on-premises big data environments.

Syllabus

Intro
Outline
Big Data History Cont.
Big Data Stack
Big Data Trend
Benefit of Containerization
Kubernetes Architecture
Challenges
CSI (Container Storage Interface)
CSI Core Services
CSI Advanced Features
Volume Lifecycle
Controller and Node Services
Kubernetes Storage
Kubernetes CSI Support
PV, PVC and Storage Class
Package and Deployment Suggestion
Hadoop HDFS
HDFS Cluster Scale
Apache Ozone
HDFS/Ozone as PV
HDFS Characteristics as PV
HDFS NFS Gateway CSI
Ozone CSI
Resources

Taught by

Linux Foundation
