How to Reduce ML Computing Costs: Building Efficient Multi-Cloud Infrastructure
MLOps World: Machine Learning in Production via YouTube
Syllabus
Intro
From notebooks to training jobs
Experiment dashboard
Cluster Dashboard
Multi-Cloud ML Infrastructure
Build hybrid cluster with Kubernetes & Terraform
Terraforming AWS Infrastructure
Test AWS Infrastructure
Terraforming GCP Infrastructure
Cloud Troubleshooting
Dataset Mounting
Cluster Management & Monitoring
Common Interface
Fractional GPUs
multicluster-scheduler
Recap: Step-by-Step Guide
Taught by
MLOps World: Machine Learning in Production