Overview
Explore BERT pruning through the lens of the Lottery Ticket Hypothesis in this 54-minute video lecture. See how the large BERT model remains functional even after many of its components are pruned away, and follow the experiments showing that even seemingly "bad" lottery tickets can be fine-tuned to achieve good accuracy. Topics include the reducibility of large Transformer-based models, the suggestion that most weights in pre-trained BERT are potentially useful, and how the "good" subnetworks vary across GLUE tasks. The lecture follows an outline covering BERT basics, the Lottery Ticket Hypothesis, the paper abstract, pruning BERT, experimental results, and conclusions.
Syllabus
- Overview
- BERT
- Lottery Ticket Hypothesis
- Paper Abstract
- Pruning BERT
- Experiments
- Conclusion
Taught by
Yannic Kilcher