

MiniLLM: Knowledge Distillation of Large Language Models

Unify via YouTube

Overview

Explore a 52-minute presentation on knowledge distillation of large language models by Yuxian Gu, a PhD student at Tsinghua University. Delve into a method that replaces the forward Kullback-Leibler divergence (KLD) used in standard knowledge distillation with reverse KLD, which is better suited to distilling generative language models. Discover how this objective prevents student models from overestimating the low-probability regions of the teacher distribution, yielding MiniLLMs that generate more precise and higher-quality responses than models trained with traditional knowledge distillation baselines. Learn about the research paper, its authors, and related resources. Gain insights into AI optimization, language models, and current knowledge distillation techniques. Access additional materials on AI research trends and deployment strategies, and connect with the Unify community through various platforms.
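The central distinction covered in the talk is the direction of the KL divergence used as the distillation objective. The sketch below is a minimal, token-level PyTorch illustration of that contrast; the function names, tensor shapes, and per-token formulation are assumptions made here for clarity, and the actual MiniLLM method optimizes a sequence-level reverse KLD with a policy-gradient-style procedure rather than this simplified form.

```python
import torch
import torch.nn.functional as F

def forward_kl(student_logits, teacher_logits):
    """Standard KD loss: KL(teacher || student).
    Pushes the student to cover all of the teacher's probability mass,
    including low-probability regions it may not model well."""
    teacher_probs = F.softmax(teacher_logits, dim=-1)
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

def reverse_kl(student_logits, teacher_logits):
    """Reverse-direction loss: KL(student || teacher).
    Penalizes the student for placing mass where the teacher assigns
    little probability, so it concentrates on the teacher's major modes."""
    student_probs = F.softmax(student_logits, dim=-1)
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits, dim=-1)
    return (student_probs * (student_log_probs - teacher_log_probs)).sum(-1).mean()

# Toy example: a batch of 4 positions over a vocabulary of 8 tokens.
teacher_logits = torch.randn(4, 8)
student_logits = torch.randn(4, 8, requires_grad=True)
print(forward_kl(student_logits, teacher_logits).item())
print(reverse_kl(student_logits, teacher_logits).item())
```

Because the reverse direction weights the divergence by the student's own probabilities, mass the student puts on tokens the teacher deems unlikely is penalized heavily, which is the "mode-seeking" behavior the presentation highlights.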

Syllabus

MiniLLM: Knowledge Distillation of Large Language Models

Taught by

Unify

