Overview
Explore a 14-minute video explanation of Microsoft's research paper on improving how Large Language Models utilize long context through a data-driven solution, contrasted with Google's architectural approach in the infini-attention paper. Learn about the "Lost in the Middle" challenge, INformation-INtensive (IN2) training, and VArious Long-context (VAL) Probing. Dive into the mathematical representation, training settings, experimental results, and real-world performance data that demonstrate how LLMs can better process and utilize extended context. Presented by an experienced Machine Learning researcher with a 15-year software engineering background, the video breaks down complex concepts into digestible segments, complete with detailed timestamps for easy navigation to specific topics.
Syllabus
- Intro
- The Lost in the Middle Challenge
- Related work in Long Context LLMs
- Information-Intensive (IN2) Training
- Fine-grained Information Awareness
- Integration and Reasoning of Information
- Mathematical Representation
- Training Settings/Details
- VArious Long-context (VAL) Probing
- Needle in a Haystack for Long Context LLMs
- Experimental Results
- Quantitative Results
- Real-world data performance
- Summary and Outro
Taught by
AI Bites