How Do Transformers Work? - A Deep Dive into Neural Network Architecture

Simons Institute via YouTube

Overview

A technical lecture by MIT professor Ankur Moitra on the fundamental mechanisms behind transformer architectures, delivered as part of the Boot Camp for the Simons Institute's Special Year on Large Language Models and Transformers. The lecture examines the key components of transformer models, their architectural design principles, and the mathematical foundations behind their effectiveness on natural language processing tasks. Topics include attention mechanisms, positional encodings, and the overall structure that has made transformers the backbone of modern language models.

Syllabus

How Do Transformers Work?

Taught by

Simons Institute

