Running LLMs in the Cloud - Approaches and Best Practices
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the growing demand for running Large Language Models (LLMs) in cloud environments through this keynote presentation. Delve into the demand for open-source LLMs among developers and enterprises, and discover best practices for deploying these models in cloud-native settings. Examine three key approaches to LLM deployment: Python-based solutions, native runtimes such as llama.cpp or vLLM, and WebAssembly as an abstraction layer. Learn about the benefits and challenges of each method, with a focus on real-world applications, ease of integration, portability, and resource efficiency. Gain insights into the CNCF Cloud Native AI (CNAI) ecosystem landscape and receive practical advice for selecting the LLM deployment strategy best suited to your requirements. The presentation aims to demystify cloud-native AI, giving attendees a clear roadmap for deploying LLMs in the cloud and an understanding of the strengths and trade-offs of each approach.
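One practical reason the runtime choice can stay flexible: both llama.cpp's server and vLLM expose an OpenAI-compatible chat-completions API, so the same client payload works against either backend. The sketch below builds such a payload in Python; the model name and prompt are illustrative assumptions, not details from the talk.

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style /v1/chat/completions request body.

    The same payload can be POSTed to a llama.cpp server or a vLLM
    server, since both implement the OpenAI-compatible endpoint.
    """
    return {
        "model": model,  # hypothetical model name; depends on what the server loaded
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("llama-3-8b-instruct", "Summarize cloud-native AI.")
print(json.dumps(payload, indent=2))
```

Actually sending the request is then an ordinary HTTP POST to the server's `/v1/chat/completions` path, regardless of which native runtime is serving the model.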
Syllabus
Keynote: Running LLMs in the Cloud - Miley Fu, Developer Advocate, Second State
Taught by
CNCF [Cloud Native Computing Foundation]