

Stanford Seminar - Transformers in Language: The Development of GPT Models Including GPT-3

Stanford University via YouTube

Overview

Explore the evolution of language models in this Stanford seminar, tracing the development from early 3-gram models to advanced GPT-3 systems. Delve into the architecture of Transformers, their applications in unsupervised learning, and their impact on various natural language processing tasks. Examine zero-shot and few-shot learning capabilities, and investigate the extension of GPT models to image processing and code generation. Gain insights into the HumanEval dataset, the pass@k metric, and the challenges faced in developing these powerful language models. Conclude with a discussion of the limitations and future directions of GPT technology.
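
For reference, the pass@k figures quoted in the syllabus below come from the unbiased estimator defined in the Codex paper (Chen et al 2021): generate n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k randomly drawn samples is correct. A minimal sketch in Python; the sample counts in the usage lines are illustrative, not taken from the talk.

    import numpy as np

    def pass_at_k(n: int, c: int, k: int) -> float:
        """Unbiased pass@k estimator (Chen et al 2021): the probability that
        at least one of k samples, drawn without replacement from n generated
        samples of which c are correct, passes the unit tests."""
        if n - c < k:
            return 1.0  # every size-k subset must contain a correct sample
        # 1 - C(n-c, k) / C(n, k), computed as a numerically stable product
        return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

    # Illustrative numbers: 200 samples per problem, 34 passing, so pass@1 = 34/200
    print(pass_at_k(200, 34, 1))   # 0.17
    print(pass_at_k(200, 34, 10))  # about 0.85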

Syllabus

Introduction.
3-Gram Model (Shannon 1951; see the sketch after this syllabus).
Recurrent Neural Nets (Sutskever et al 2011).
Big LSTM (Jozefowicz et al 2016).
Transformer (Liu and Saleh et al 2018).
GPT-2: Big Transformer (Radford et al 2019).
GPT-3: Very Big Transformer (Brown et al 2020).
GPT-3: Can Humans Detect Generated News Articles?
Why Unsupervised Learning?
Is There a Big Trove of Unlabeled Data?
Why Use Autoregressive Generative Models for Unsupervised Learning?
Unsupervised Sentiment Neuron (Radford et al 2017).
GPT-1 (Radford et al 2018).
Zero-Shot Reading Comprehension.
GPT-2: Zero-Shot Translation.
Language Model Metalearning.
GPT-3: Few Shot Arithmetic.
GPT-3: Few Shot Word Unscrambling.
GPT-3: General Few Shot Learning.
iGPT (Chen et al 2020): Can We Apply GPT to Images?
iGPT: Completions.
iGPT: Feature Learning.
Isn't Code Just Another Modality?
The HumanEval Dataset.
The Pass@k Metric.
Codex: Training Details.
An Easy HumanEval Problem (pass@1 ≈ 0.9).
A Medium HumanEval Problem (pass@1 ≈ 0.17).
A Hard HumanEval Problem (pass@1 ≈ 0.005).
Calibrating Sampling Temperature for Pass@k.
The Unreasonable Effectiveness of Sampling.
Can We Approximate Sampling Against an Oracle?
Main Figure.
Limitations.
Conclusion.
Acknowledgements.
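
The syllabus opens with Shannon-style n-gram modeling, which is simple enough to sketch directly: record which word followed each two-word context in a corpus, then sample continuations from those records. A minimal word-level trigram sampler in Python, with a toy corpus chosen purely for illustration:

    import random
    from collections import defaultdict

    def train_trigram(tokens):
        # Map each (w1, w2) context to the words that followed it; duplicates
        # in the list make sampling proportional to observed counts.
        followers = defaultdict(list)
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            followers[(w1, w2)].append(w3)
        return followers

    def generate(followers, w1, w2, max_words=20):
        out = [w1, w2]
        for _ in range(max_words):
            candidates = followers.get((w1, w2))
            if not candidates:
                break  # unseen context; a real model would smooth or back off
            w1, w2 = w2, random.choice(candidates)
            out.append(w2)
        return " ".join(out)

    corpus = "the cat sat on the mat and the dog sat on the rug".split()
    print(generate(train_trigram(corpus), "the", "cat"))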

Taught by

Stanford Online
