Overview
Explore a 30-minute talk on optimizing machine learning deployment with Apache TVM and the OctoML Platform. Learn how graph- and operator-level optimizations deliver performance portability across diverse hardware backends, and how TVM's learning-based approach rapidly explores the optimization space, saving engineering time while delivering strong performance for both edge and server use cases. Gain insight into TVM's broad model coverage and efficient use of hardware resources, and get a preview of OctoML's Octomizer, a SaaS platform for continuous model optimization, benchmarking, and packaging. Understand why ML deployment is hard across diverse hardware environments, and how TVM and OctoML address the exploding ecosystem of ML workloads and hardware capabilities.
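The "learning-based approach" mentioned above refers to TVM's auto-tuning, which searches a space of candidate schedules (loop tilings, orderings, etc.) and uses measurements to steer the search. The toy sketch below illustrates the search idea only — the candidate space, the synthetic `measure` function, and the greedy sampling loop are all illustrative stand-ins, not TVM's actual API (real AutoTVM/auto-scheduler runs use a learned cost model over hardware measurements):

```python
import random

# Toy search space: tile sizes for a matrix-multiply loop nest.
# (Illustrative only; not TVM's real schedule space.)
CANDIDATES = [(i, j) for i in (1, 2, 4, 8, 16, 32)
                     for j in (1, 2, 4, 8, 16, 32)]

def measure(tile):
    """Stand-in for a hardware measurement: a synthetic cost surface
    whose minimum sits at tile = (8, 16)."""
    ti, tj = tile
    return abs(ti - 8) + abs(tj - 16)

def tune(trials=12, seed=0):
    """Sample candidate schedules, measure each, keep the best.
    TVM's auto-tuning replaces blind sampling with a learned cost
    model that prioritizes promising candidates, which is what makes
    the search fast enough to be practical."""
    rng = random.Random(seed)
    best_tile, best_cost = None, float("inf")
    for tile in rng.sample(CANDIDATES, trials):
        cost = measure(tile)
        if cost < best_cost:
            best_tile, best_cost = tile, cost
    return best_tile, best_cost
```

The payoff of the learned cost model is that far fewer real hardware measurements are needed to find a near-optimal schedule, which is the engineering-time saving the talk highlights.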
Syllabus
Intro
Machine Learning is hard and costly to deploy
Trend: ML workload diversity is exploding
Trend: ML hardware capabilities exploding
An exploding ecosystem makes ML deployment difficult
ML-based optimizations
Why use Apache TVM?
TVM: Getting Optimal Performance
OctoML's Broad HW & Model Architecture Coverage — Octomizer supports any model architecture with standard operators
Thank you Apache TVM community! 615+!
Taught by
Open Data Science