Vision Transformers Explained + Fine-Tuning in Python

James Briggs via YouTube

Classroom Contents

  1. Intro
  2. In this video
  3. What are transformers and attention?
  4. Attention explained simply
  5. Attention used in CNNs
  6. Transformers and attention
  7. What vision transformer (ViT) does differently
  8. Images to patch embeddings (see the patch-embedding sketch after this list)
  9. 1. Building image patches
  10. 2. Linear projection
  11. 3. Learnable class embedding
  12. 4. Adding positional embeddings
  13. ViT implementation in Python with Hugging Face (see the fine-tuning sketch after this list)
  14. Packages, dataset, and Colab GPU
  15. Initialize Hugging Face ViT Feature Extractor
  16. Hugging Face Trainer setup
  17. Training and CUDA device error
  18. Evaluation and classification predictions with ViT
  19. Final thoughts
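
To make chapters 8 through 12 concrete, here is a minimal PyTorch sketch of the four patch-embedding steps. It assumes a 224x224 RGB image, 16x16 patches, and a 768-dimensional hidden size as in ViT-Base; the tensor names and shapes are illustrative assumptions, not code taken from the video.

```python
import torch
import torch.nn as nn

image_size, patch_size, channels, hidden_dim = 224, 16, 3, 768
num_patches = (image_size // patch_size) ** 2  # 14 * 14 = 196 patches

image = torch.randn(1, channels, image_size, image_size)  # dummy batch of one image

# 1. Building image patches: cut the image into non-overlapping 16x16 patches
#    and flatten each one into a vector.
patches = image.unfold(2, patch_size, patch_size).unfold(3, patch_size, patch_size)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(1, num_patches, -1)  # (1, 196, 768)

# 2. Linear projection: map each flattened patch to the transformer hidden size.
to_embedding = nn.Linear(channels * patch_size * patch_size, hidden_dim)
patch_embeddings = to_embedding(patches)  # (1, 196, 768)

# 3. Learnable [class] embedding: prepend one extra token (batch size 1 here),
#    whose final hidden state is later used for classification.
cls_token = nn.Parameter(torch.zeros(1, 1, hidden_dim))
embeddings = torch.cat([cls_token, patch_embeddings], dim=1)  # (1, 197, 768)

# 4. Adding positional embeddings: one learned vector per token (196 patches + [class]).
position_embeddings = nn.Parameter(torch.zeros(1, num_patches + 1, hidden_dim))
embeddings = embeddings + position_embeddings

print(embeddings.shape)  # torch.Size([1, 197, 768]) -> input to the transformer encoder
```

The [class] token is prepended before the positional embeddings are added so that it, too, receives a position vector; its final hidden state is what the classification head reads.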
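
For chapters 13 through 18, the sketch below shows the general shape of the Hugging Face workflow: a ViT feature extractor, a `ViTForImageClassification` model, and the `Trainer`. The checkpoint name, dummy dataset, label count, and hyperparameters are assumptions for illustration, not the exact values used in the video; newer `transformers` releases expose the feature extractor as `ViTImageProcessor`.

```python
import torch
from transformers import (Trainer, TrainingArguments, ViTFeatureExtractor,
                          ViTForImageClassification)

model_name = "google/vit-base-patch16-224-in21k"   # common ViT-Base checkpoint (assumption)
feature_extractor = ViTFeatureExtractor.from_pretrained(model_name)
model = ViTForImageClassification.from_pretrained(model_name, num_labels=10)

# Tiny dummy dataset so the sketch runs end to end; in practice the images
# would come from a real dataset and be preprocessed with `feature_extractor`.
data = [{"pixel_values": torch.randn(3, 224, 224), "label": i % 10} for i in range(32)]
train_ds, val_ds = data[:24], data[24:]

def collate_fn(batch):
    # Stack per-example tensors into the batched inputs the model expects.
    return {
        "pixel_values": torch.stack([x["pixel_values"] for x in batch]),
        "labels": torch.tensor([x["label"] for x in batch]),
    }

training_args = TrainingArguments(
    output_dir="./vit-finetuned",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    evaluation_strategy="epoch",
    fp16=torch.cuda.is_available(),  # mixed precision when a (Colab) GPU is present
    report_to="none",                # skip experiment-tracking integrations
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=collate_fn,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    tokenizer=feature_extractor,     # saved alongside model checkpoints
)

trainer.train()                        # fine-tune
predictions = trainer.predict(val_ds)  # evaluation / classification predictions
```

Passing the feature extractor as `tokenizer` keeps the preprocessing configuration saved with each checkpoint, and `fp16` mixed precision only activates when a CUDA device such as a Colab GPU is available.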
