
Transformers, Parallel Computation, and Logarithmic Depth

Simons Institute via YouTube

Overview

Explore the computational power of transformers in this 57-minute lecture by Daniel Hsu from Columbia University. Delve into the relationship between self-attention layers and communication rounds in Massively Parallel Computation. Discover how logarithmic depth enables transformers to efficiently solve computational tasks that challenge other neural sequence models and sub-quadratic transformer approximations. Gain insights into parallelism as a crucial distinguishing feature of transformers. Learn about collaborative research with Clayton Sanford of Google and Matus Telgarsky of NYU, showing that a constant number of self-attention layers can simulate, and be simulated by, a constant number of communication rounds of Massively Parallel Computation.
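
To give a flavor of the logarithmic-depth phenomenon the lecture describes, below is a minimal sketch, not taken from the talk itself, of pointer doubling, a textbook parallel-computation pattern: each round lets every position jump twice as far along a chain of successor pointers, so a chain of length n is resolved in roughly log2(n) rounds instead of n sequential steps. The function name and example data are purely illustrative.

import math

def pointer_doubling(successor):
    """successor[i] is the next index in a chain; return (end of each chain, rounds used)."""
    n = len(successor)
    reach = list(successor)  # reach[i]: where position i lands after the jumps taken so far
    rounds = 0
    for _ in range(math.ceil(math.log2(max(n, 2)))):
        # Every position composes its current jump with the jump of the position it reaches,
        # doubling the distance covered per round.
        reach = [reach[reach[i]] for i in range(n)]
        rounds += 1
    return reach, rounds

# Example: a chain 0 -> 1 -> 2 -> ... -> 7, with a self-loop at the end.
succ = [1, 2, 3, 4, 5, 6, 7, 7]
ends, rounds = pointer_doubling(succ)
print(ends)    # every position reaches the chain's end, index 7
print(rounds)  # about log2(8) = 3 rounds, not 7 sequential steps

The point of the sketch is only the round count: logarithmically many rounds of simple, highly parallel updates suffice for a task that looks inherently sequential, which is the kind of advantage the lecture attributes to logarithmic-depth transformers via their correspondence with Massively Parallel Computation.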

Syllabus

Transformers, parallel computation, and logarithmic depth

Taught by

Simons Institute

