Understanding Retrieval Heads in Large Language Models - From Discovery to Applications
Discover AI via YouTube
Syllabus
Intro: Green grasshoppers
What do attention heads focus on?
Long-context factuality by retrieval heads
Needle in a Haystack Benchmark
How many retrieval heads in an LLM?
What is a retrieval head?
Retrieval heatmap consistent with pre-trained base model
Retrieval heads and Chain-of-Thought Reasoning
Retrieval heads explain why LLMs hallucinate
How to generate more retrieval heads in LLMs?
Taught by
Discover AI