Is There a Place for Distributed Storage for AI/ML on Kubernetes?
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the potential of distributed storage for AI/ML workloads on Kubernetes in this 34-minute conference talk by Diane Feddema and Kyle Bader from Red Hat. Discover the benefits and challenges of containerized machine learning workloads, including portability, declarative configuration, and reduced administrative toil. Learn about the performance trade-offs between local and open-source distributed storage solutions, and gain insights into running MLPerf training jobs in Kubernetes environments. Examine the utility of machine learning formats like RecordIO and TFRecord for performance optimization and model validation flexibility. Dive into topics such as the machine learning lifecycle, object detection segmentation, COCO datasets, GPU utilization, Python, PyTorch, and benchmark preparation techniques.
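The talk's point about record formats like RecordIO and TFRecord is that packing many small training samples into one large, sequentially readable file is far friendlier to distributed storage than thousands of tiny files. As a hedged illustration of that idea, the sketch below implements a simplified length-prefixed record layout in plain Python (the real TFRecord format additionally stores CRC32C checksums per record; the function names here are hypothetical, not from any library):

```python
import io
import struct

def write_records(stream, records):
    """Pack byte records into one sequential stream: a simplified
    length-prefixed layout in the spirit of TFRecord/RecordIO.
    (The real TFRecord format also checksums the length and payload.)"""
    for rec in records:
        stream.write(struct.pack("<Q", len(rec)))  # 8-byte little-endian length
        stream.write(rec)                          # raw payload bytes

def read_records(stream):
    """Stream records back out one at a time, without loading
    the whole file into memory."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream
        (length,) = struct.unpack("<Q", header)
        yield stream.read(length)

# Demo: many small samples become one large sequential object, which
# distributed object stores serve much more efficiently than many tiny files.
samples = [f"sample-{i}".encode() for i in range(3)]
buf = io.BytesIO()
write_records(buf, samples)
buf.seek(0)
print([r.decode() for r in read_records(buf)])
```

A data-loading pipeline can then stream records straight off the distributed store, keeping GPUs fed with large sequential reads instead of metadata-heavy small-file lookups.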
Syllabus
Intro
Why Distributed Storage
Machine Learning Life Cycle
MLPerf
Object Detection Segmentation
COCO Dataset
COCO Explorer
Why GPUs
Python
PyTorch
Preparing Benchmarks
First Benchmark
Second Benchmark
Taught by
CNCF [Cloud Native Computing Foundation]