
YouTube

Blockwise Parallel Decoding for Deep Autoregressive Models

Yannic Kilcher via YouTube

Overview

Explore a novel blockwise parallel decoding scheme for deep autoregressive sequence-to-sequence models in this video. Learn how this approach yields substantial improvements in generation speed when applied to architectures that can score output sequences in parallel. Discover how the method is verified empirically through experiments with state-of-the-art self-attention models for machine translation and image super-resolution. Understand how the proposed technique reduces the number of decoding iterations by up to 2x relative to a baseline greedy decoder with no loss in quality, or by up to 7x with a slight drop in performance. Examine the wall-clock speedups of up to 4x over standard greedy decoding. Gain insights into the trade-offs between computation needed per layer and critical path length at training time across architecture classes such as recurrent, convolutional, and self-attention networks.
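
At its core, the scheme is a predict-verify-accept loop: auxiliary heads propose a block of future tokens in one pass, the base model checks all of them at once, and the longest agreeing prefix (plus the base model's own token at the first mismatch, which the verification computes for free) is accepted. The sketch below is a minimal, non-authoritative illustration of that loop, assuming a hypothetical propose function standing in for the proposal heads and a hypothetical verify function standing in for one greedy step of the base model; the real method runs the verification as a single parallel pass rather than the serial loop shown here.

```python
from typing import Callable, List

def blockwise_parallel_decode(
    propose: Callable[[List[int]], List[int]],  # guesses the next k tokens at once
    verify: Callable[[List[int]], int],         # base model's greedy next token
    prefix: List[int],
    block_size: int,
    max_len: int,
) -> List[int]:
    out = list(prefix)
    while len(out) < max_len:
        # Predict: proposal heads guess the next block_size tokens in one pass.
        guesses = propose(out)[:block_size]
        # Verify + accept: keep the longest prefix of guesses the base model
        # agrees with, plus its own token at the first mismatch. (In the paper
        # this verification is a single parallel pass; it is serial here only
        # for clarity.)
        accepted: List[int] = []
        for g in guesses:
            t = verify(out + accepted)  # base model's greedy choice here
            accepted.append(t)          # matched or not, t is the greedy token
            if t != g:                  # first mismatch ends the block
                break
        out.extend(accepted)
    return out[:max_len]

# Toy demo: the "base model" counts upward; the proposer's 4th guess is wrong,
# so each loop iteration still accepts 4 tokens (3 matches + 1 correction).
verify = lambda seq: seq[-1] + 1
propose = lambda seq: [seq[-1] + 1, seq[-1] + 2, seq[-1] + 3, 0]
print(blockwise_parallel_decode(propose, verify, [0], block_size=4, max_len=12))
# -> [0, 1, 2, ..., 11]: identical to greedy decoding, in a quarter of the
#    iterations, mirroring the iteration reductions described above.
```

Because a proposed token is only kept when the base model would have produced it anyway, the accepted output is guaranteed to match greedy decoding exactly; the speedup comes from covering several positions per verification pass.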

Syllabus

Blockwise Parallel Decoding for Deep Autoregressive Models

Taught by

Yannic Kilcher

