Visual Question Answering: Grounded Systems and Transformer Capsules
- 1 Intro
- 2 Grounded Visual Question Answering
- 3 Limitations of Existing VQA Systems
- 4 Grounded VQA Systems
- 5 Problem Setup
- 6 Transformers with Capsules
- 7 Approach
- 8 Capsule-based Tokens
- 9 Input to Intermediate Transformer layers
- 10 Text-based Residual Connection
- 11 Pre-training Tasks
- 12 Masked Language Modeling (MLM)
- 13 Image Text Matching
- 14 Pre-training Datasets
- 15 Fine-tuning on Downstream Task
- 16 Qualitative comparison - GQA
- 17 Evaluation Metrics
- 18 Results - GQA
- 19 Conclusion and Future Work