Overview
Explore the challenges posed by large language models' (LLMs) tendency to produce incorrect information, and emerging solutions, in this 29-minute PyCon US talk. Discover methods for measuring and comparing hallucination rates across models, with a focus on the regurgitation of misinformation from training data. Learn to use Python tools such as Hugging Face's datasets and transformers packages, along with the langchain package, to assess hallucinations against the TruthfulQA dataset. Gain insights into recent initiatives to reduce LLM hallucinations, including retrieval-augmented generation (RAG), and understand how these approaches can improve the reliability and usability of LLMs across a range of contexts.
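As a rough illustration of the workflow the talk describes, the minimal sketch below loads the TruthfulQA benchmark with Hugging Face's datasets package and prompts a model via transformers. The model choice (gpt2) and generation settings are illustrative assumptions, not the talk's actual setup.

```python
# Minimal sketch: probe a model with TruthfulQA questions and compare
# its answers against the benchmark's reference answers.
from datasets import load_dataset
from transformers import pipeline

# TruthfulQA's "generation" config pairs each question with reference
# best/correct/incorrect answers (only a "validation" split is provided).
truthfulqa = load_dataset("truthful_qa", "generation", split="validation")

# Any causal LM works here; gpt2 is just a small, freely available stand-in.
generator = pipeline("text-generation", model="gpt2")

sample = truthfulqa[0]
prompt = f"Q: {sample['question']}\nA:"
output = generator(prompt, max_new_tokens=50, do_sample=False)[0]["generated_text"]

print("Question:        ", sample["question"])
print("Model answer:    ", output[len(prompt):].strip())
print("Reference answer:", sample["best_answer"])
```

Scoring a full hallucination rate would mean repeating this over the whole split and judging each answer against the dataset's correct and incorrect reference lists; the snippet shows only a single-question comparison.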
Syllabus
Talks - Jodie Burchell: Lies, damned lies and large language models
Taught by
PyCon US