Overview
Learn how to successfully deploy AI and machine learning models to production in this conference talk from ODSC East 2018. Explore best practices, necessary tech stacks, and organizational rhythms for achieving ROI on AI investments. Discover solutions to infrastructure and scaling challenges when dealing with thousands of model versions across various frameworks. Gain insights into maintaining low latency, managing GPU memory, load balancing, and orchestrating complex AI systems. Understand the evolution of online learning systems and the importance of discoverability layers in AI deployment. Ideal for engineers and leadership looking to bridge the gap between AI research and practical implementation.
Syllabus
Introduction
The problem
The solution
Operating systems
General-purpose computing
AI evolution
Use case
Characteristics
Dev tool chain
Training vs Inference
Containers
Paradigms
Cost Efficiency
Design for maximum load
Hourly model
Concurrency
Latency
GPU Memory Management
Load Balancing
Composability
Orchestration
Runtime abstractions
Cloud abstraction
Runtime characteristics
App store model
Summary
Evolution of online learning systems (OLS)
Discoverability layer
Code stubs
Model development
Conclusion
Taught by
Open Data Science