Overview
Explore an in-depth analysis of the REALM (Retrieval-Augmented Language Model Pre-Training) paper in this comprehensive video lecture. Delve into the innovative approach of combining language model pre-training with a latent knowledge retriever to capture world knowledge in a modular and interpretable way. Learn about masked language modeling for latent document retrieval, the knowledge retriever model using MIPS, and the question answering model architecture. Examine the loss gradient analysis, initialization techniques, and experimental results. Gain insights into open-domain question answering and how REALM outperforms previous state-of-the-art models in accuracy while offering greater modularity and interpretability.
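As a rough illustration of the retrieval mechanism discussed in the lecture, the sketch below shows how a REALM-style retriever scores candidate documents: the input and each document are embedded into dense vectors, relevance is an inner product, and the top candidates are found via Maximum Inner Product Search (MIPS). This is a minimal toy example with random vectors and a brute-force search; the variable names and setup are illustrative assumptions, not code from the paper.

```python
import numpy as np

# Toy sketch of REALM-style retrieval scoring (illustrative only).
# The retriever defines p(z | x) = softmax_z( Embed_input(x) . Embed_doc(z) )
# and uses MIPS to find the top-k highest-scoring documents z for an input x.

rng = np.random.default_rng(0)

d = 128            # embedding dimension (hypothetical)
num_docs = 10_000  # size of the toy knowledge corpus
k = 5              # number of documents to retrieve

doc_embeddings = rng.normal(size=(num_docs, d))  # stand-in for precomputed Embed_doc(z)
query_embedding = rng.normal(size=(d,))          # stand-in for Embed_input(x)

# Exact MIPS by brute force; real systems use an approximate MIPS index.
scores = doc_embeddings @ query_embedding        # relevance score f(x, z) per document
topk = np.argsort(scores)[-k:][::-1]             # indices of the k highest-scoring docs

# Retrieval distribution over the top-k candidates (softmax of their scores).
logits = scores[topk]
p_z_given_x = np.exp(logits - logits.max())
p_z_given_x /= p_z_given_x.sum()

print("top-k document ids:", topk)
print("p(z|x) over top-k:", np.round(p_z_given_x, 3))
```

In the full model, marginalizing the masked-token prediction over these retrieved documents is what lets gradients flow back into the retriever, a point the lecture covers in the loss gradient analysis.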
Syllabus
- Introduction & Overview
- World Knowledge in Language Models
- Masked Language Modeling for Latent Document Retrieval
- Problem Formulation
- Knowledge Retriever Model using MIPS
- Question Answering Model
- Architecture Recap
- Analysis of the Loss Gradient
- Initialization using the Inverse Cloze Task
- Prohibiting Trivial Retrievals
- Null Document
- Salient Span Masking
- My Idea on Salient Span Masking
- Experimental Results and Ablations
- Concrete Example from the Model
Taught by
Yannic Kilcher