Scaling MLOps to Retrain 50k Weekly Models in Parallel Using UDFs

Databricks via YouTube

Overview

Discover how data.ai's machine learning team uses the Databricks Platform to apply MLOps best practices to high-frequency retraining in this 32-minute conference talk. Learn about the framework the team built to bring MLOps into the weekly retraining of approximately 50,000 sklearn models in parallel. Explore how Pandas UDFs can apply arbitrary code to each group of a dataset, enabling MLflow logging and model registration at scale for any grouped data. Gain insight into the challenges of parallelizing model training across multiple categories and countries, and understand the limitations of this approach. Consider how the methodology could be adapted for more time-sensitive use cases. Presented by Kaleb Lowe, Staff Machine Learning Engineer at data.ai, this talk is aimed at data scientists and machine learning engineers working on large-scale model retraining projects.
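As a rough illustration of the grouped-training pattern described above, here is a minimal sketch, not the framework from the talk. It defines a per-group training function of the shape that Spark's `applyInPandas` expects (one pandas DataFrame in, one pandas DataFrame out) and runs it locally with plain pandas so it works without a cluster. The column names (`category`, `country`, `x`, `y`), the toy data, and the helper names are all hypothetical, and a NumPy least-squares line stands in for an sklearn estimator.

```python
import numpy as np
import pandas as pd

# Hypothetical grouping columns; the talk's actual schema is not given here.
GROUP_COLS = ["category", "country"]


def train_group(pdf: pd.DataFrame) -> pd.DataFrame:
    """Fit one small model on a single (category, country) group.

    On Spark, each group would arrive here as a pandas DataFrame.
    """
    # Stand-in for an sklearn estimator: least-squares line y = slope * x + intercept.
    slope, intercept = np.polyfit(pdf["x"].to_numpy(), pdf["y"].to_numpy(), deg=1)
    preds = slope * pdf["x"] + intercept
    rmse = float(np.sqrt(((pdf["y"] - preds) ** 2).mean()))
    # Per-group experiment tracking (for example an MLflow run that logs
    # the metric and the fitted model) would be placed at this point.
    return pd.DataFrame(
        {
            "category": [pdf["category"].iloc[0]],
            "country": [pdf["country"].iloc[0]],
            "slope": [float(slope)],
            "intercept": [float(intercept)],
            "rmse": [rmse],
            "n_rows": [len(pdf)],
        }
    )


def make_toy_data() -> pd.DataFrame:
    """Build three tiny groups with known slopes (illustrative data only)."""
    rows = []
    for category, country, slope in [
        ("games", "US", 2.0),
        ("games", "JP", -1.0),
        ("tools", "US", 0.5),
    ]:
        for x in range(5):
            rows.append(
                {"category": category, "country": country,
                 "x": float(x), "y": slope * x + 1.0}
            )
    return pd.DataFrame(rows)


def train_all_local(df: pd.DataFrame) -> pd.DataFrame:
    """Local emulation of the grouped apply: one train_group call per group."""
    parts = [train_group(group) for _, group in df.groupby(GROUP_COLS)]
    return pd.concat(parts, ignore_index=True)


results = train_all_local(make_toy_data())
print(results)

# On a Spark cluster the same function could be distributed with, e.g.:
#   spark_df.groupBy("category", "country").applyInPandas(
#       train_group,
#       schema="category string, country string, slope double, "
#              "intercept double, rmse double, n_rows long",
#   )
```

Returning one summary row per group keeps the output small no matter how many models are trained. Each group is processed independently, which is what allows the work to be spread across executors.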

Syllabus

Scaling MLOps to Retrain 50k Weekly Models in Parallel Using UDFs

Taught by

Databricks
