Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

YouTube

Scaling LLM Test-Time Compute Optimally for Improved Performance

Yannic Kilcher via YouTube

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore a comprehensive analysis of scaling inference-time computation in Large Language Models (LLMs) through this in-depth video presentation. Delve into the research paper that investigates how LLMs can improve their performance by utilizing additional test-time computation. Examine two primary mechanisms for scaling test-time computation: searching against dense, process-based verifier reward models and updating the model's distribution over a response adaptively. Discover how the effectiveness of different approaches varies depending on prompt difficulty, leading to the development of a "compute-optimal" scaling strategy. Learn how this strategy can improve test-time compute efficiency by more than 4x compared to a best-of-N baseline. Gain insights into the implications of these findings for LLM pretraining and the trade-offs between inference-time and pre-training compute. Understand how, in certain scenarios, test-time compute can be leveraged to outperform significantly larger models in a FLOPs-matched evaluation.

Syllabus

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)

Taught by

Yannic Kilcher

Reviews

Start your review of Scaling LLM Test-Time Compute Optimally for Improved Performance

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.