Optimizing LLM Efficiency One Trace at a Time on Kubernetes

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

This 25-minute conference talk, published by CNCF, covers how to optimize Large Language Model (LLM) deployments on Kubernetes. It shows how to use OpenTelemetry's profiling capabilities to identify resource-intensive code segments, detect memory leaks, and prevent out-of-memory errors in LLM applications. Inspecting the application dynamically at runtime helps improve model performance, reduce latency, and meet service level agreements. The talk also addresses running efficient Kubernetes deployments while optimizing resource utilization and controlling costs, using code-level analysis to pinpoint performance bottlenecks and resource drains in LLM implementations.

Syllabus

Optimizing LLM Efficiency One Trace at a Time on Kubernetes - Aditya Soni, Forrester & Seema Saharan

Taught by

CNCF [Cloud Native Computing Foundation]
