Building a Production Scale, Totally Private, OSS RAG Pipeline with DBRX, Spark, and LanceDB

Overview

This 22-minute conference talk shows how to build a production-scale, fully private, open-source RAG pipeline using DBRX, Spark, and LanceDB. It opens with the challenges enterprises face when putting AI into production, particularly data security and the reliance on external services for LLMs, embedding models, and vector databases. The talk presents the latest release of DBRX as a breakthrough in open-source model quality that gives enterprises a viable option for high-quality, self-hosted generative AI responses.

The talk then covers LanceDB, an open-source vector database that enables real-time serving for billion-scale embedding datasets with lower resource requirements than alternatives. LanceDB stores data in the Lance columnar format, so large-scale updates can be written quickly through Lance's Spark DataSource, and the same dataset can serve both offline analytics and online retrieval for RAG, agents, and more. LanceDB's embedding function registry can target custom embedding models served from MLflow without sending data off-premises. Combined, Spark, DBRX, and LanceDB make it possible to build a completely private generative AI pipeline within the lakehouse environment.

Syllabus

Building a Production Scale, Totally Private, OSS RAG Pipeline with DBRX, Spark, and LanceDB

Taught by

Databricks

Related Courses

Deploying and Maintaining RAG Systems

Open-source LLMs: Uncensored & secure AI locally with RAG

Building RAG-based LLM Applications for Production - LLMs III Talk

Azure AI Studio - Complete Guide to Azure AI Studio Copilot

Private RAG with Open Source and Custom LLMs - BentoML and OpenLLM

Customizable RAG Workflows with Your Own Data
