CNCF [Cloud Native Computing Foundation]

Scaling Kubeflow for Multi-tenancy at Spotify

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore Spotify's journey in scaling Kubeflow for multi-tenancy in this conference talk. Learn how the company addressed challenges of increased adoption and complex machine learning experiments while ensuring cluster reliability and equitable resource access. Discover Spotify's streamlined tooling for maintaining, deploying, and monitoring their Kubeflow distribution. Gain insights into their multi-cluster approach, team-based multi-tenancy strategies, and implementation of infrastructure-as-code. Understand how they tackled new challenges using ArgoCD, improved observability, and expanded on-cluster compute capabilities. Delve into their focus on SLO tracking, telemetry, and metrics, as well as their efforts to enhance product identity and promote self-service. Get a glimpse of Spotify's future plans for their Kubeflow platform and their commitment to open-source contributions in the field of machine learning infrastructure.

Syllabus

Intro
Kubeflow Platform for ML
ML Workflow @ Spotify
Kubeflow clusters
Kubeflow platform stats
Our solutions
Team based multi-tenancy
Team profile management
Team profile example
Existing setup & problems
Multi-cluster strategy illustration
Benefits of multiple clusters
Multi-cluster implementation
Multi-cluster based Kubeflow Platform
New challenges
Solution: ArgoCD
Multi-cluster CD
Multi-cluster reliability
"Infrastructure"-as-code
SLO Tracking
Telemetry and Metrics
kubeflow-state-metrics
Infrastructure and Infrastructure
Product Identity
Expanded Observability
Expanded "On-Cluster" Compute
Self-Service
Open Source!

Taught by

CNCF [Cloud Native Computing Foundation]
