

Representational Strengths and Limitations of Transformers

Google TechTalks via YouTube

Overview

Explore the mathematical foundations of attention layers in transformers in this Google TechTalk presented by Clayton Sanford. Delve into both positive and negative results on the representational power of attention layers, with a focus on intrinsic complexity parameters such as width, depth, and embedding dimension. Discover how transformers outperform recurrent and feedforward networks on a sparse averaging task, with the required network size scaling logarithmically rather than polynomially in the input size. Examine the limitations of attention layers on a triple detection task, where the required complexity scales linearly with input size. Learn how communication complexity is applied to the analysis of transformers, and gain insight into the representational properties and inductive biases of neural networks. Clayton Sanford is a PhD student at Columbia studying machine learning theory; the talk also touches on his work on learning combinatorial algorithms with transformers and on climate modeling using machine learning.
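The sparse averaging and triple detection tasks are only named in this summary. As a rough sketch of how they might be formalized, assuming the formulations from the talk's accompanying paper (the function names, sizes, and the q and modulus parameters below are illustrative, not taken from the talk):

```python
import numpy as np

def sparse_averaging_targets(values, index_sets):
    """Sparse averaging task: output i is the average of the value
    vectors selected by the small index set attached to token i."""
    targets = np.zeros_like(values)
    for i, idx in enumerate(index_sets):
        targets[i] = values[np.asarray(idx)].mean(axis=0)
    return targets

def triple_detection(xs, modulus):
    """Triple detection task: does any triple of inputs sum to 0 mod modulus?"""
    n = len(xs)
    return any(
        (xs[i] + xs[j] + xs[k]) % modulus == 0
        for i in range(n)
        for j in range(i + 1, n)
        for k in range(j + 1, n)
    )

# Tiny usage example: n = 6 tokens, d = 4 dimensions, q = 2 indices per token.
rng = np.random.default_rng(0)
n, d, q = 6, 4, 2
values = rng.normal(size=(n, d))
index_sets = [rng.choice(n, size=q, replace=False) for _ in range(n)]
print(sparse_averaging_targets(values, index_sets).shape)  # (6, 4)
print(triple_detection([2, 3, 5, 7], modulus=10))          # True: 2 + 3 + 5 = 10
```

Per the summary above, the talk's results concern how the size a transformer needs to represent such targets grows with the number of input tokens: logarithmically for sparse averaging, but linearly for triple detection.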

Syllabus

Representational Strengths and Limitations of Transformers

Taught by

Google TechTalks

