Improving Complex RAG Systems and Achieving No-Regret, Lightning-Fast Deployment Iterations of LLMs
Data Science Festival via YouTube
Overview
This 33-minute talk from the Data Science Festival covers techniques for improving complex Retrieval Augmented Generation (RAG) systems and for iterating quickly and safely on deployments of Large Language Models (LLMs). It addresses common LLM limitations: outdated training data, restricted context windows, latency, and API rate limits. It shows how AWS Lambda aliases can be used to deploy "shadow" versions and to feature-flag beta releases, so that LLM-based applications can be changed quickly with little risk. It also explains how the data gathered from these deployment techniques can be used to evaluate system performance in production. The talk is aimed at ML/software engineers and data scientists, with a focus on those who have practical coding experience with LLMs, RAG, and AWS Lambda or equivalent technologies.
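As a rough illustration of the shadow and feature-flag pattern described above, here is a minimal Python sketch. It is not taken from the talk. The alias names ("live", "beta", "shadow"), the hash-based rollout rule, and the `invoke` callable are all assumptions made for the example. The user-facing answer comes from either the live or the beta alias, while the shadow alias receives the same payload and its result is never shown to the user.

```python
import hashlib
import json
from typing import Callable

# Hypothetical alias names; the talk listing does not name any.
LIVE_ALIAS = "live"
BETA_ALIAS = "beta"
SHADOW_ALIAS = "shadow"

# Assumed signature: invoke(alias, payload, fire_and_forget) -> response text.
Invoke = Callable[[str, str, bool], str]


def in_beta(user_id: str, rollout_percent: int) -> bool:
    """Place a user in the beta cohort deterministically by hashing the id."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < rollout_percent


def handle_request(user_id: str, question: str, invoke: Invoke,
                   rollout_percent: int = 10) -> dict:
    """Answer from the live or beta alias, and mirror the request to shadow."""
    payload = json.dumps({"user_id": user_id, "question": question})

    # Feature flag: only users in the rollout bucket see the beta alias.
    alias = BETA_ALIAS if in_beta(user_id, rollout_percent) else LIVE_ALIAS
    answer = invoke(alias, payload, False)

    # Shadow call: same payload, result discarded, failures swallowed so the
    # user-facing path is never affected by the experimental version.
    try:
        invoke(SHADOW_ALIAS, payload, True)
    except Exception:
        pass

    return {"alias": alias, "answer": answer}
```

In a real deployment, `invoke` could wrap the boto3 Lambda client's `invoke` call, passing the alias as `Qualifier` and using `InvocationType="Event"` for the fire-and-forget shadow call and `"RequestResponse"` otherwise. Injecting the callable keeps the routing logic testable without AWS credentials.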
Syllabus
Improving complex RAG systems and achieving no-regret, lightning-fast deployment iterations of LLMs
Taught by
Data Science Festival