Accelerating High-Performance Machine Learning at Scale in Kubernetes

Overview

Explore a hands-on guide to productionizing optimized machine learning models in cloud native ecosystems with production-ready open source frameworks in this 36-minute conference talk from KubeCon + CloudNativeCon North America 2022. Dive into a practical use case: deploying the GPT-2 NLP model in Kubernetes, served with ONNX Runtime through the Triton server in Seldon Core. Learn how to build a scalable production NLP microservice for intelligent text generation applications. Discover key challenges in the MLOps space and understand how the various tools interoperate across the production machine learning lifecycle. Gain insights from speakers Alejandro Saucedo and Elena Neroslavskaya on accelerating high-performance machine learning at scale in Kubernetes.
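To make the GPT-2 use case more concrete, here is a minimal sketch of the request a client might send to a Triton-served model. It is not taken from the talk. It assumes the model is exposed over the V2 inference protocol that Triton implements, that the exported ONNX graph has an input tensor named `input_ids` of type INT64, and that the model is registered as `gpt2`. All three names are assumptions, as are the helper functions `build_v2_infer_request` and `infer_path`, and the token IDs are placeholders rather than real tokenizer output.

```python
import json


def build_v2_infer_request(token_ids, input_name="input_ids"):
    """Build a V2 inference protocol request body for one token sequence.

    `input_name` is an assumption: it must match the input tensor name
    of the exported ONNX model, which depends on how GPT-2 was exported.
    """
    ids = [int(t) for t in token_ids]
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(ids)],  # batch of 1, sequence length len(ids)
                "datatype": "INT64",     # assumed dtype of the token ID tensor
                "data": ids,             # flattened, row-major tensor contents
            }
        ]
    }


def infer_path(model_name):
    """Return the V2 protocol inference path for a model name."""
    return f"/v2/models/{model_name}/infer"


if __name__ == "__main__":
    # Placeholder token IDs; a real client would get these from a GPT-2 tokenizer.
    body = build_v2_infer_request([101, 202, 303])
    print("POST", infer_path("gpt2"))  # "gpt2" is a hypothetical model name
    print(json.dumps(body, indent=2))
```

The sketch only builds the payload and leaves out the HTTP call, since the host and any ingress prefix depend on how the deployment is exposed in the cluster.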