Overview
Explore the challenges and solutions surrounding the tendency of large language models (LLMs) to produce incorrect information in this 29-minute PyCon US talk. Discover methods to measure and compare hallucination rates across different models, focusing on the regurgitation of misinformation from training data. Learn to use Python tools such as Hugging Face's datasets and transformers packages, as well as the langchain package, to assess hallucinations with the TruthfulQA dataset. Gain insights into recent initiatives aimed at reducing hallucinations, including retrieval-augmented generation (RAG), and understand how these techniques can improve LLM reliability across a range of applications. Access the accompanying slides for a comprehensive overview of the presentation's key points and examples.
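To make the evaluation workflow concrete, here is a minimal sketch of scoring a model on TruthfulQA's multiple-choice (MC1) task using the datasets and transformers packages mentioned above. The model choice ("gpt2") and the log-likelihood scoring approach are illustrative assumptions, not necessarily the exact method used in the talk: each answer choice is scored by its log-probability given the question, and the model counts as correct when the true answer scores highest.

```python
# A minimal sketch of measuring hallucination via TruthfulQA's MC1 task.
# Assumption: "gpt2" stands in for whichever models you want to compare.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# TruthfulQA ships only a validation split on the Hugging Face Hub.
dataset = load_dataset("truthful_qa", "multiple_choice", split="validation")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def choice_logprob(question: str, choice: str) -> float:
    """Sum log-probabilities of the choice tokens, conditioned on the question."""
    prompt = f"Q: {question}\nA:"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + " " + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Logits at position i predict token i + 1, so shift by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    choice_positions = range(prompt_ids.shape[1] - 1, full_ids.shape[1] - 1)
    return sum(log_probs[pos, full_ids[0, pos + 1]].item() for pos in choice_positions)

correct = 0
sample = dataset.select(range(20))  # small sample to keep the sketch fast
for row in sample:
    scores = [choice_logprob(row["question"], c) for c in row["mc1_targets"]["choices"]]
    predicted = scores.index(max(scores))
    # In mc1_targets, labels mark the single correct choice with 1.
    correct += row["mc1_targets"]["labels"][predicted]

print(f"MC1 accuracy on sample: {correct / len(sample):.2%}")
```

The RAG approach discussed in the talk can likewise be sketched in a few lines. The toy corpus and word-overlap retriever below are hypothetical stand-ins for a real vector store and embedding model (the kind of pipeline langchain is typically used to wire up); the point is only to show the core idea of grounding answers in retrieved context rather than training-data recall.

```python
# A minimal sketch of the RAG idea: retrieve the most relevant reference
# passage for a question and prepend it to the prompt, so the model can
# ground its answer instead of relying on (possibly wrong) training data.
# Assumption: the corpus and retriever here are toy illustrations.

documents = [
    "The Great Wall of China is not visible from the Moon with the naked eye.",
    "Goldfish have memories that last months, not three seconds.",
    "Napoleon Bonaparte was of average height for his era, about 5'7\".",
]

def retrieve(question: str) -> str:
    """Pick the document sharing the most words with the question (toy retriever)."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str) -> str:
    """Prepend the retrieved passage as context for the model."""
    context = retrieve(question)
    return f"Context: {context}\n\nQ: {question}\nA:"

print(build_prompt("Is the Great Wall of China visible from the Moon?"))
```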
Syllabus
Talks - Jodie Burchell: Lies, damned lies and large language models
Taught by
PyCon US