Overview
Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore practical considerations for building private Retrieval-Augmented Generation (RAG) applications using open source and custom LLMs in this informative talk by Chaoyu Yang, Founder and CEO at BentoML. Discover the benefits of self-hosting open source LLMs or embedding models for RAG, learn common best practices for optimizing inference performance, and understand how BentoML can be used to build RAG as a service. Gain insights into seamlessly chaining language models with various components, including text and multi-modal embedding, OCR pipelines, semantic chunking, classification models, and reranking models. Additionally, learn about OpenLLM and its role in LLM deployments. This 51-minute session, presented by LLMOps Space, offers valuable knowledge for practitioners interested in deploying LLMs into production.
Syllabus
Private RAG with Open Source and Custom LLMs | BentoML | OpenLLM
Taught by
LLMOps Space