Computer Vision Architecture Evolution: ConvNets to Transformers - Lecture 21
University of Central Florida via YouTube
Overview
Syllabus
Introduction
Evolution of Vision Architectures
Hierarchy of Swin vs. CNNs
Modernizing ConvNets
Modernizing ResNet
Macro Design Changes
Changing stage compute ratio
Changing stem to "Patch-ify"
Depthwise Conv. vs. Self-Attention
Improvements
Inverted Bottleneck
Larger Kernel Sizes
Micro Designs (mD)
Replace ReLU with GELU
Fewer Activation functions
Fewer Normalization Layers
Substituting BN with LN
Visualization
mD4 - Improvement
Separate Downsampling Layer
Final ConvNeXt block
Networks for Evaluation
Training Settings
Machine Performance Comparison
Taught by
UCF CRCV