Overview
Discover how to deploy fast and accurate YOLOv8 object detection models on CPUs in this 48-minute webinar recording from Neural Magic. Learn about state-of-the-art sparsification techniques, including pruning and quantization, that result in 10x smaller and 8x faster models with minimal accuracy loss. Explore topics such as GPU performance, baseline results, the DeepSparse Engine, sparsity and quantization, and sparse transfer learning. Gain insights into performance comparisons, open-source repositories, and upcoming features. Get practical guidance on implementing these optimizations for computer vision use cases to achieve best-in-class inference performance on existing CPUs. Understand the benefits of sparse ML and how to get started with free resources available on the Neural Magic website.
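
As a rough illustration of the kind of workflow the webinar covers, the sketch below shows how a sparsified YOLOv8 model might be run on CPU through DeepSparse's Python Pipeline API; the task identifier, model path, and image file are assumed placeholders, not details taken from the recording.

from deepsparse import Pipeline

# Create a detection pipeline backed by the DeepSparse Engine.
# The model path can point to a local sparsified ONNX export or to a
# pre-sparsified model stub from Neural Magic's SparseZoo (assumed here).
yolo_pipeline = Pipeline.create(
    task="yolov8",                             # assumed task identifier
    model_path="path/to/sparse-yolov8.onnx",   # placeholder path
)

# Run CPU inference on a sample image; the returned object carries the
# detected boxes, scores, and class labels.
results = yolo_pipeline(images=["sample.jpg"])
print(results)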
Syllabus
Introduction
Why YOLOv8
GPU Performance
Baseline Results
DeepSparse Engine
Sparsity Quantization
Sparsity Profile Generation
Quantization
Results
Should you deploy YOLOv8
Performance comparison
Getting started
Open source repository
Sparse transfer
Sparse transfer learning
SparseML
Upcoming features
Questions
SparseML Recipe
Server ARM
YOLOv8 vs OpenVINO
Licensing
Conclusion
Taught by
Neural Magic