Overview
Learn how to use design patterns for scalable architecture, along with tools such as services and containers, to deploy machine learning models at scale.
Syllabus
Introduction
- Scaling ML models
- What you should know
- Building and running ML models for data scientists
- Building and deploying ML models for production use
- Definition of scaling ML for production
- Overview of tools and techniques for scalable ML
- Horizontal vs. vertical scaling
Running models as services
- APIs for ML model services
- Load balancing and clusters of servers
- Scaling horizontally with containers
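By way of preview, the load-balancing idea in this chapter can be sketched in a few lines: requests are spread across a pool of identical model-service replicas in round-robin order. The replica URLs below are hypothetical placeholders.

```python
import itertools

# Hypothetical pool of identical model-service replicas behind a load balancer.
replicas = [
    "http://10.0.0.1:5000",
    "http://10.0.0.2:5000",
    "http://10.0.0.3:5000",
]

# Round-robin scheduling: each incoming request goes to the next replica in turn.
_round_robin = itertools.cycle(replicas)

def pick_replica():
    """Return the replica that should handle the next request."""
    return next(_round_robin)
```

Real load balancers (and Kubernetes Services, covered later) add health checks and connection handling on top of this basic idea.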
Services encapsulate ML models
- Using Plumber to create APIs for R programs
- Using Flask to create APIs for Python programs
- Best practices for API design for ML services
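As a taste of the Flask approach covered in this chapter, a minimal prediction API might look like the sketch below. The `predict` function here is a hypothetical stand-in for a trained model; a real service would load a serialized model instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for a trained model: scores a feature vector.
def predict(features):
    return 0.5 * features[0] + 0.25 * features[1]

@app.route("/predict", methods=["POST"])
def predict_endpoint():
    # Expect a JSON body like {"features": [2.0, 4.0]}.
    payload = request.get_json()
    score = predict(payload["features"])
    return jsonify({"score": score})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Plumber plays the analogous role for R programs: annotated R functions are exposed as HTTP endpoints.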
Containers bundle ML model components
- Introduction to Docker
- Building Docker images with Dockerfiles
- Example Docker build process
- Using Docker registries to manage images
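As a hedged sketch of the Docker build process covered in this chapter, a Dockerfile for a Python model service might look like the following. The filenames (`app.py`, `model.pkl`, `requirements.txt`) are hypothetical placeholders for a service script, a serialized model, and its dependency list.

```dockerfile
# Hypothetical layout: app.py holds the service, model.pkl the trained model.
FROM python:3.11-slim
WORKDIR /app
# Install dependencies first so this layer is cached between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the service code and the serialized model into the image.
COPY app.py model.pkl ./
EXPOSE 5000
CMD ["python", "app.py"]
```

Building with `docker build -t ml-model:1.0 .` and pushing the tagged image to a registry makes the same bundle runnable on any cluster node.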
Running services in clusters
- Introduction to Kubernetes
- Creating a Kubernetes cluster
- Deploying containers in a Kubernetes cluster
- Scaling up a Kubernetes cluster
- Autoscaling a Kubernetes cluster
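A minimal Kubernetes Deployment manifest for a containerized model service might look like this sketch; the deployment name, image reference, and port are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-model
spec:
  replicas: 3              # three identical copies of the model service
  selector:
    matchLabels:
      app: ml-model
  template:
    metadata:
      labels:
        app: ml-model
    spec:
      containers:
      - name: ml-model
        image: registry.example.com/ml-model:1.0   # hypothetical image
        ports:
        - containerPort: 5000
```

Scaling up can then be a one-liner, e.g. `kubectl scale deployment ml-model --replicas=5`, while `kubectl autoscale deployment ml-model --min=2 --max=10 --cpu-percent=80` lets the cluster adjust the replica count based on CPU load.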
Monitoring service performance
- Service performance data
- Docker container monitoring
- Kubernetes monitoring
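Service performance data often reduces to response-time measurements, and tail latency matters more than the average for user-facing model services. A small sketch of summarizing both (the sample latencies are invented):

```python
import statistics

# Hypothetical response times (ms) collected from a model-serving endpoint.
latencies_ms = [12, 15, 11, 210, 14, 13, 16, 12, 190, 15]

def percentile(values, pct):
    """Nearest-rank percentile: a simple summary of tail latency."""
    ordered = sorted(values)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

mean_ms = statistics.mean(latencies_ms)   # hides the slow outliers
p95_ms = percentile(latencies_ms, 95)     # exposes the tail
```

In this sample the mean (about 51 ms) hides the two ~200 ms outliers that the 95th percentile reveals, which is why monitoring dashboards for Docker and Kubernetes workloads typically track percentiles rather than averages.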
- Best practices for scaling ML
- Next steps
Taught by
Dan Sullivan