Understanding Grokking - A New Performance Phase in Large Language Models

Discover AI via YouTube

Overview

Watch a 30-minute video on "grokking" in Large Language Models (LLMs) and what it implies for AI development. See how transformers can show unexpected gains in performance after extended training, which challenges traditional assumptions about overfitting. Examine analyses of embedding-space patterns in transformer models, including geometric structures such as circles and parallelograms that emerge when models learn arithmetic operations. Learn how this finding, supported by MIT research and multiple studies, suggests that prolonged training lets models internalize mathematical rules and generalize better without retrieval augmentation or complex prompting. Understand what these findings mean in practice for building more capable AI systems, with reference to papers including "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" and "Towards Understanding Grokking: An Effective Theory of Representation Learning."
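
For readers wondering why a circular arrangement of embeddings would help with arithmetic, here is a small toy sketch. It is not code from the video or the cited papers, and it does not train a model. It assumes modular addition as the arithmetic task, and the function names `circle_embedding` and `add_via_rotation` are hypothetical. It only illustrates that when each residue sits at an evenly spaced point on a circle, adding two numbers corresponds to adding their angles.

```python
import cmath
import math


def circle_embedding(k: int, p: int) -> complex:
    """Place residue k (mod p) on the unit circle at angle 2*pi*k/p.

    Hypothetical helper for illustration; a trained model's embeddings
    would be learned, not constructed like this.
    """
    return cmath.exp(2j * math.pi * k / p)


def add_via_rotation(a: int, b: int, p: int) -> int:
    """Recover (a + b) mod p from the two circle embeddings.

    Multiplying unit complex numbers adds their angles, so the product
    lands on the point that represents the sum modulo p.
    """
    z = circle_embedding(a, p) * circle_embedding(b, p)
    angle = cmath.phase(z) % (2 * math.pi)  # map the angle into [0, 2*pi)
    # Convert the angle back to the nearest residue; the final % p folds
    # a value that rounds to p back onto 0.
    return round(angle * p / (2 * math.pi)) % p


if __name__ == "__main__":
    p = 97  # modulus chosen arbitrarily for this sketch
    print(add_via_rotation(60, 50, p), (60 + 50) % p)
```

In this picture the wrap-around of modular addition comes for free from going around the circle, which gives one intuition for why such a structure could support generalization to unseen number pairs.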

Syllabus

New Discovery: LLMs have a Performance Phase

Taught by

Discover AI
