Classroom Contents
Inference and Quantization for AI - Session 3
- 1 Intro
- 2 OUTLINE
- 3 4-BIT QUANTIZATION
- 4 QUANTIZATION FOR INFERENCE
- 5 BINARY NEURAL NETWORKS
- 6 USING TENSOR CORES
- 7 QUANTIZED NETWORK ACCURACY
- 8 MAINTAINING SPEED AT BEST ACCURACY
- 9 SCALE-ONLY QUANTIZATION
- 10 PER-CHANNEL SCALING
- 11 TRAINING FOR QUANTIZATION
- 12 CONCLUSION
- 13 POST-TRAINING CALIBRATION
- 14 MIXED PRECISION NETWORKS
- 15 THE ROOT CAUSE
- 16 BRING YOUR OWN CALIBRATION
- 17 SUMMARY
- 18 INT PERFORMANCE
- 19 ALSO IN TENSORRT
- 20 TF-TRT RELATIVE PERFORMANCE
- 21 OBJECT DETECTION - NMS
- 22 USING THE NEW NMS OP
- 23 NOW AVAILABLE ON GITHUB
- 24 TENSORRT HYPERSCALE INFERENCE PLATFORM
- 25 INEFFICIENCY LIMITS INNOVATION
- 26 NVIDIA TENSORRT INFERENCE SERVER
- 27 CURRENT FEATURES
- 28 AVAILABLE METRICS
- 29 DYNAMIC BATCHING
- 30 CONCURRENT MODEL EXECUTION - RESNET 50
- 31 NVIDIA RESEARCH AI PLAYGROUND
- 32 LEARN MORE AND DOWNLOAD TO USE
- 33 ADDITIONAL RESOURCES