Inference and Quantization for AI - Session 3

NVIDIA Developer via YouTube

Classroom Contents

  1. Intro
  2. OUTLINE
  3. 4-BIT QUANTIZATION
  4. QUANTIZATION FOR INFERENCE
  5. BINARY NEURAL NETWORKS
  6. USING TENSOR CORES
  7. QUANTIZED NETWORK ACCURACY
  8. MAINTAINING SPEED AT BEST ACCURACY
  9. SCALE-ONLY QUANTIZATION
  10. PER-CHANNEL SCALING
  11. TRAINING FOR QUANTIZATION
  12. CONCLUSION
  13. POST-TRAINING CALIBRATION
  14. MIXED PRECISION NETWORKS
  15. THE ROOT CAUSE
  16. BRING YOUR OWN CALIBRATION
  17. SUMMARY
  18. INT PERFORMANCE
  19. ALSO IN TensorRT
  20. TF-TRT RELATIVE PERFORMANCE
  21. OBJECT DETECTION - NMS
  22. USING THE NEW NMS OP
  23. NOW AVAILABLE ON GITHUB
  24. TENSORRT HYPERSCALE INFERENCE PLATFORM
  25. INEFFICIENCY LIMITS INNOVATION
  26. NVIDIA TENSORRT INFERENCE SERVER
  27. CURRENT FEATURES
  28. AVAILABLE METRICS
  29. DYNAMIC BATCHING
  30. CONCURRENT MODEL EXECUTION - RESNET 50
  31. NVIDIA RESEARCH AI PLAYGROUND
  32. NV LEARN MORE AND DOWNLOAD TO USE
  33. ADDITIONAL RESOURCES
