Overview
A 1 hour 43 minute video tutorial on Facebook's DETR (DEtection TRansformer), a model for end-to-end object detection with transformers. The video walks through the model's architecture and implementation using the source code and visualizations. It covers the DETR demo notebook, the attention visualization notebook, the training script, backbone and model construction, data loading with nested tensors, the forward pass through the ResNet backbone and the transformer, the Hungarian matching algorithm, and the loss calculation. The focus throughout is on how a transformer-based object detector works in practice.
Syllabus
DETR model recap
DETR demo notebook
Visualizing attention notebook
Visualizing encoder attention
Going through the training script
Backbone construction
DETR construction
Data loading and nested tensors
Forward pass through ResNet backbone
Forward pass through the transformer
Hungarian matching algorithm
Loss calculation
Outro
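As a rough illustration of the Hungarian matching item in the syllabus: DETR emits a fixed-size set of predictions and, before computing the loss, assigns each ground-truth object to a distinct prediction so that the total matching cost is minimal. The sketch below is not the code from the video or the DETR repository. It uses a brute-force search over assignments, which finds the same optimum as the Hungarian algorithm on tiny inputs but does not scale. The function name `match_brute_force` and the cost values are hypothetical; in DETR the cost combines class probability and box terms.

```python
from itertools import permutations


def match_brute_force(cost):
    """Find the minimum-cost one-to-one assignment of targets to predictions.

    cost[i][j] is the (hypothetical) cost of pairing prediction i with
    ground-truth object j. Assumes at least as many predictions as targets.
    Returns (total_cost, pairs) with pairs as (prediction_idx, target_idx).
    """
    num_preds = len(cost)
    num_targets = len(cost[0]) if cost else 0
    best_total, best_pairs = float("inf"), []
    # Try every way of picking a distinct prediction for each target.
    for preds in permutations(range(num_preds), num_targets):
        total = sum(cost[p][t] for t, p in enumerate(preds))
        if total < best_total:
            best_total = total
            best_pairs = sorted((p, t) for t, p in enumerate(preds))
    return best_total, best_pairs


# Made-up cost matrix: 3 predictions (rows) and 2 ground-truth objects (columns).
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.5, 0.6],
]
total, pairs = match_brute_force(cost)

# Predictions left without a target; DETR trains these toward "no object".
unmatched = sorted(set(range(len(cost))) - {p for p, _ in pairs})

print(pairs)      # matched (prediction, target) index pairs
print(unmatched)  # prediction indices with no target
```

For realistic sizes a polynomial-time solver is the usual choice, for example `scipy.optimize.linear_sum_assignment`, which takes the same kind of cost matrix and returns the matched row and column indices.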
Taught by
Aleksa Gordić - The AI Epiphany