Elephant on Wheels - Petabyte-scale AI @ LinkedIn
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore how LinkedIn runs petabyte-scale AI workloads on Kubernetes in this conference talk. Follow the journey from a proof of concept for Jupyter notebooks to key infrastructure for model training and serving. Learn about Kube2Hadoop, a scalable open-source integration that gives AI and non-AI workloads secure access to HDFS. See how LinkedIn's AI modelers use data securely during model exploration and training with Kubeflow components. Understand the prototype of a multilevel scheduler that sits on top of Kubernetes and YARN clusters in the cloud, routes jobs intelligently, and supports workflows that span different cluster types. Topics include Hadoop vs Kubernetes, Hadoop delegation tokens, Kube2Hadoop components and workflow, authentication mechanisms, and future improvements. The talk closes with a demo of these concepts applied in LinkedIn's AI infrastructure.
Syllabus
Intro
Hadoop vs Kubernetes
Hadoop delegation token
Kube2Hadoop
What is Kube2Hadoop
Kube2Hadoop components
Kube2Hadoop workflow
Authentication mechanism
IDDecorator
Multiple CRDs
Why Kube2Hadoop
Customer feedback
Future improvements
Demo
Taught by
CNCF [Cloud Native Computing Foundation]