Overview
Learn how to detect and evaluate hallucinations in Large Language Models (LLMs) in this 13-minute technical video, which explores their root causes and mitigation strategies in Retrieval-Augmented Generation (RAG) systems. Dive into essential tools such as the RAGAS metrics for measuring faithfulness, answer relevance, and context relevance. Explore complementary techniques, including LLMs as evaluation judges and embedding models for hallucination detection. Gain practical insight into Lynx, a fine-tuned Llama-3 variant designed to detect hallucinations in RAG outputs. Follow detailed explanations of the evaluation methodologies, from embedding-based approaches (sketched below) to a full RAGAS implementation (a usage sketch follows the syllabus), complete with real-world examples for improving RAG pipeline accuracy. Supplementary materials are available through the provided PDF download for a deeper understanding of these concepts in LLM development and deployment.
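As a taste of the embedding-based approach covered in the video, the minimal sketch below scores each answer sentence by its best cosine similarity against the retrieved context chunks and flags poorly supported sentences. It assumes the sentence-transformers package, an arbitrary model choice (all-MiniLM-L6-v2), and an illustrative 0.6 threshold, none of which come from the video itself.

```python
from sentence_transformers import SentenceTransformer, util

# Assumption: any sentence-embedding model works here; this one is a common default.
model = SentenceTransformer("all-MiniLM-L6-v2")

def embedding_faithfulness(answer_sentences, context_chunks, threshold=0.6):
    """Score each answer sentence by its best similarity to any retrieved chunk.

    Sentences whose best score falls below the (illustrative) threshold are
    flagged as potentially unsupported, i.e. candidate hallucinations.
    """
    ans_emb = model.encode(answer_sentences, convert_to_tensor=True)
    ctx_emb = model.encode(context_chunks, convert_to_tensor=True)
    sims = util.cos_sim(ans_emb, ctx_emb)   # shape: (num_sentences, num_chunks)
    best = sims.max(dim=1).values           # best-supporting chunk per sentence
    flagged = [s for s, score in zip(answer_sentences, best) if score < threshold]
    return best.tolist(), flagged

# Invented example data, purely for illustration.
scores, unsupported = embedding_faithfulness(
    ["The 2023 report was written by the internal audit team.",
     "It was published in March 2024."],
    ["The 2023 compliance report was prepared by the internal audit team."],
)
print(scores, unsupported)
```

Embedding similarity is only a cheap proxy (it can miss negations or wrong numbers that stay lexically close to the context), which is one reason the video also covers LLM-as-judge approaches such as Lynx.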
Syllabus
- Overview of Contents
- Hallucinations Root Cause
- RAG Pipelines
- Faithfulness / Groundedness
- RAGAS Metrics
- Tools: Embeddings, LLM-as-Judge
- Evaluating Faithfulness with Embeddings
- Evaluating Faithfulness with LLM-as-Judge (Lynx)
- Evaluating Faithfulness with RAGAS
- Evaluating Answer Relevance
- Evaluating Context Relevance
- How to use these metrics?
- Summary
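As a preview of the RAGAS-based evaluation referenced in the overview and syllabus, here is a minimal, hedged sketch of how the ragas Python package is typically driven. The column names and metric imports follow the classic quickstart and may differ between ragas versions, the sample question/answer/context strings are invented for illustration, and evaluate() calls an LLM judge and embedding model under the hood (OpenAI by default unless configured otherwise).

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# Illustrative single-row evaluation set; a real pipeline would log many
# (question, answer, retrieved contexts) triples from the RAG system.
dataset = Dataset.from_dict({
    "question": ["Who prepared the 2023 compliance report?"],
    "answer": ["The 2023 compliance report was prepared by the internal audit team."],
    "contexts": [[
        "The 2023 compliance report was prepared by the internal audit team "
        "and reviewed by external counsel."
    ]],
})

# Faithfulness: are the answer's claims supported by the retrieved contexts?
# Answer relevancy: does the answer actually address the question?
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)
```

Scores close to 1.0 indicate answers that are grounded in the retrieved context and on-topic; the video's "How to use these metrics?" section discusses how to act on such scores in practice.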
Taught by
Donato Capitella