Shaping the Future of AI from the History of Transformer Architectures - Stanford CS25

Overview

Explore the evolution of Transformer architectures and their impact on AI development in this lecture by OpenAI research scientist Hyung Won Chung. Gain a perspective on the driving forces behind AI advancements, focusing on exponentially cheaper compute and the scaling it enables. Examine the early history of Transformer architectures, understanding the motivation behind each design choice and how those choices became less relevant as computational power increased. Connect past and present AI trends to project future directions in the field. Delve into the differences between encoder-decoder and decoder-only models, and learn about the rationale for the encoder-decoder's additional structures from a scaling perspective. Benefit from Chung's extensive experience in Large Language Models, including work on pre-training, instruction fine-tuning, reinforcement learning from human feedback, reasoning, and multilinguality.