Unlocking the Potential of Large Models in Production - Best Practices and Solutions

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Learn about the challenges and solutions for deploying large language models (LLMs) in production environments through this conference talk presented by Yuan Tang from Red Hat and Adam Tetelman from NVIDIA. Explore best practices for building scalable inference platforms using cloud native technologies like Kubernetes, Kubeflow, KServe, and Knative. Discover practical solutions for benchmarking LLMs, implementing efficient storage and caching mechanisms for quick auto-scaling, optimizing models for specialized accelerators, managing A/B testing with limited compute resources, and establishing effective monitoring systems. Using KServe as a case study, gain insights into addressing critical LLMOps challenges that arise during the transition from traditional machine learning to generative AI and large language models in production environments.
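The quick auto-scaling and accelerator topics covered in the talk can be illustrated with a minimal KServe `InferenceService` manifest. This is a hedged sketch, not material from the talk itself: the model name, storage URI, runtime format, and scaling targets are illustrative placeholders.

```yaml
# Sketch of a KServe InferenceService for serving an LLM, assuming the
# Hugging Face runtime and a single-GPU node pool are available.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-example            # placeholder name
spec:
  predictor:
    minReplicas: 1             # keep one warm replica to soften cold starts
    maxReplicas: 4             # Knative scales out up to this bound
    scaleMetric: concurrency   # scale on in-flight requests
    scaleTarget: 8             # target concurrent requests per replica
    model:
      modelFormat:
        name: huggingface
      storageUri: "hf://some-org/some-llm"   # placeholder model URI
      resources:
        limits:
          nvidia.com/gpu: "1"  # request one accelerator per replica
```

Pairing `minReplicas` with a concurrency-based scale target is one common way to balance the cold-start and caching concerns the talk raises against limited GPU capacity.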

Syllabus

Unlocking Potential of Large Models in Production - Yuan Tang, Red Hat & Adam Tetelman, NVIDIA

Taught by

CNCF [Cloud Native Computing Foundation]
