Self-Hosted LLMs on Kubernetes - A Practical Guide
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the practical aspects of deploying self-hosted large language models (LLMs) on Kubernetes in this 34-minute conference talk by Hema Veeradhi and Aakanksha Duggal of Red Hat. Learn how to manage the complexities of deploying and operating LLMs in production, presented in a way that makes it accessible for beginners starting their LLM journey. Discover how to select an appropriate open source LLM, containerize it, and create Kubernetes deployment manifests. Gain insight into provisioning the compute resources LLMs require. Understand the benefits of self-hosted LLMs, including enhanced data privacy, flexibility in model training, and reduced operational costs. By the end of this talk, you will have the skills and knowledge to self-host LLMs, empowering organizations to gain greater control over their AI infrastructure.
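The containerize-and-deploy workflow described above can be sketched as a Kubernetes Deployment manifest. This is a minimal illustration, not material from the talk: the image name, port, and resource figures are placeholder assumptions, and the GPU request assumes the NVIDIA device plugin is installed on the cluster.

```yaml
# Hypothetical Deployment for a containerized LLM inference server.
# Image, port, and resource values are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-server
  template:
    metadata:
      labels:
        app: llm-server
    spec:
      containers:
        - name: llm
          image: registry.example.com/llm-server:latest  # placeholder image
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "4"        # reserve CPU and memory for model loading
              memory: 16Gi
            limits:
              nvidia.com/gpu: 1  # requires a GPU device plugin on the node
```

Applied with `kubectl apply -f llm-server.yaml`, a manifest like this lets Kubernetes schedule the model server onto a node with sufficient resources and restart it on failure, which is the resource-provisioning concern the talk highlights.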
Syllabus
Self-Hosted LLMs on Kubernetes: A Practical Guide - Hema Veeradhi & Aakanksha Duggal, Red Hat
Taught by
CNCF [Cloud Native Computing Foundation]