AI Inference Workloads - Solving MLOps Challenges in Production

Toronto Machine Learning Series (TMLS) via YouTube

Classroom Contents

  1. Intro
  2. Agenda
  3. The Machine Learning Process
  4. Deployment Types for Inference Workloads
  5. Machine Learning is Different than Traditional Software Engineering
  6. Low Latency
  7. High Throughput
  8. Maximize GPU Utilization
  9. Embedding ML Models into Web Servers
  10. Decouple Web Serving and Model Serving
  11. Model Serving System on Kubernetes
  12. Multi-Instance GPU (MIG)
  13. Run:AI's Dynamic MIG Allocations
  14. Run 3 instances of type 2g.10gb
  15. Valid Profiles & Configurations
  16. Serving on Fractional GPUs
  17. A Game Changer for Model Inferencing