Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai

MLOps.community via YouTube

Overview

Explore the challenges and solutions for deploying Large Language Models (LLMs) in production environments through this informative 12-minute talk by Dmytro Dzhulgakov, co-founder and CTO of Fireworks.ai. Learn how the Fireworks.ai GenAI Platform assists developers in navigating the complex journey from early experimentation to high-load production deployments while managing costs and latency. Gain insights into handling multiple model variants, scaling up usage, and optimizing cost-to-serve and latency concerns. Discover how Fireworks.ai's high-performance, low-cost LLM inference service can help you experiment with and productionize large models effectively. Benefit from Dzhulgakov's expertise as a PyTorch core maintainer and his experience in transitioning PyTorch from a research framework to numerous production applications across Meta's AI use cases and the broader industry.

Syllabus

Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai // Dmytro Dzhulgakov

Taught by

MLOps.community
