Production-Ready AI Platform on Kubernetes

Overview

Explore the challenges and best practices of building large-scale, efficient, and reliable AI/ML platforms with cloud-native technologies in this 39-minute conference talk by Yuan Tang from Red Hat. Examine what makes data science and machine learning applications hard to design, including the diversity of ML frameworks, hardware accelerators, and cloud vendors. Learn how to build inference systems for models of various sizes, including Large Language Models (LLMs). See how Kubernetes, Kubeflow, and KServe combine into a reference platform for modern cloud-native AI infrastructure. Discover how to overcome MLOps challenges and optimize AI/ML workflows for production environments.