Self-Extending LLM Context Windows Using Grouped Self-Attention

Discover AI via YouTube

Overview

Learn how to extend the context length of Large Language Models (LLMs) at inference time in this technical deep-dive video, which introduces grouped self-attention as an alternative to the classical transformer self-attention mechanism. Explore the out-of-distribution problems in positional encoding that arise when LLMs process text sequences longer than their pre-training context window. Examine implementation details, smooth-transition techniques, and benchmark data while following along with code demonstrations based on the research paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning." Master practical approaches for handling longer sequences in neural networks without model retraining or fine-tuning.
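The core idea behind Self-Extend is to keep ordinary attention for nearby tokens while mapping the positions of distant tokens through a floor division by a group size, so the relative positions the model sees never exceed its pre-training range. The snippet below is a minimal, illustrative sketch of that position mapping, not the video's or the paper's actual code; the names `group_size` and `neighbor_window` are stand-ins for the paper's group-size and neighbor-window parameters, and the exact boundary shift shown is one plausible way to make the two regimes join smoothly.

```python
import torch

def self_extend_rel_positions(seq_len: int,
                              group_size: int = 4,
                              neighbor_window: int = 512) -> torch.Tensor:
    """Illustrative relative-position map for Self-Extend-style grouped attention.

    Tokens within `neighbor_window` of the query keep their exact relative
    positions (normal attention); more distant tokens are mapped through a
    floor division by `group_size`, so the largest relative position the
    model ever sees stays close to its pre-training range.
    Causal masking and the attention computation itself are omitted.
    """
    q = torch.arange(seq_len).unsqueeze(1)   # query positions (column vector)
    k = torch.arange(seq_len).unsqueeze(0)   # key positions (row vector)
    rel = q - k                              # standard relative positions

    # Grouped relative positions for distant tokens, shifted so they are
    # approximately continuous with the normal positions at the window
    # boundary (the "smooth transition" discussed in the video).
    grouped = q // group_size - k // group_size
    grouped = grouped + neighbor_window - neighbor_window // group_size

    # Use normal positions inside the neighbor window, grouped ones outside.
    return torch.where(rel > neighbor_window, grouped, rel)
```

In a full implementation, these relative positions would index the model's positional encodings (for example RoPE), which is what lets a pre-trained model attend over sequences longer than its training context without fine-tuning.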

Syllabus

Introduction
Theory
Main idea
Implementation
SelfExtend LLM
Deep Dive
Smooth Transition
Benchmark Data
Publication
Code Implementation

Taught by

Discover AI
