Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Vision Transformers Explained + Fine-Tuning in Python
- 1 Intro
- 2 In this video
- 3 What are transformers and attention?
- 4 Attention explained simply
- 5 Attention used in CNNs
- 6 Transformers and attention
- 7 What the Vision Transformer (ViT) does differently
- 8 Images to patch embeddings
- 9 1. Building image patches
- 10 2. Linear projection
- 11 3. Learnable class embedding
- 12 4. Adding positional embeddings
- 13 ViT implementation in Python with Hugging Face
- 14 Packages, dataset, and Colab GPU
- 15 Initialize Hugging Face ViT Feature Extractor
- 16 Hugging Face Trainer setup
- 17 Training and CUDA device error
- 18 Evaluation and classification predictions with ViT
- 19 Final thoughts
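
As a rough illustration of the four patch-embedding steps named in chapters 9 to 12 (building image patches, linear projection, learnable class embedding, positional embeddings), here is a minimal pure-Python sketch. It is not taken from the video: the function names `image_to_patches`, `linear_projection`, and `embed_image` are hypothetical, the image is a tiny 8x8 placeholder, and the weights, class token, and positional embeddings are random stand-ins for values that a real ViT would learn during training.

```python
import random


def image_to_patches(image, patch_size):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    patch.extend(image[row][col])  # append every channel value
            patches.append(patch)
    return patches


def linear_projection(vectors, weights):
    """Multiply each input vector (length d_in) by a d_in x d_out weight matrix."""
    d_out = len(weights[0])
    return [
        [sum(v[i] * weights[i][j] for i in range(len(v))) for j in range(d_out)]
        for v in vectors
    ]


def embed_image(image, patch_size, embed_dim, seed=0):
    """Hypothetical helper: run the four steps with random placeholder parameters."""
    rng = random.Random(seed)

    # Step 1: build flattened image patches.
    patches = image_to_patches(image, patch_size)

    # Step 2: project each patch to the embedding dimension.
    d_in = len(patches[0])
    weights = [[rng.gauss(0, 0.02) for _ in range(embed_dim)] for _ in range(d_in)]
    tokens = linear_projection(patches, weights)

    # Step 3: prepend a class embedding (learned in a real model, random here).
    class_token = [rng.gauss(0, 0.02) for _ in range(embed_dim)]
    tokens = [class_token] + tokens

    # Step 4: add one positional embedding per token (also random here).
    positions = [[rng.gauss(0, 0.02) for _ in range(embed_dim)] for _ in tokens]
    return [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, positions)]


# An 8x8 RGB placeholder image split into 4x4 patches gives 4 patches,
# plus 1 class token, so 5 tokens of dimension 6.
image = [[[0.5, 0.5, 0.5] for _ in range(8)] for _ in range(8)]
embeddings = embed_image(image, patch_size=4, embed_dim=6)
print(len(embeddings), len(embeddings[0]))  # prints "5 6"
```

In practice a library model such as the Hugging Face ViT covered in chapter 13 performs these steps internally with trained parameters; the sketch only shows how the token count and shapes come about.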