Linux Foundation

Building Reproducible ML Processes with an Open Source Stack

Linux Foundation via YouTube

Overview

Explore the essential components of a reproducible machine learning experiment in this 33-minute conference talk. Learn how to combine Code (Kubeflow and Git), Data (MinIO + lakeFS), and Environment (infrastructure-as-code) so that a past run can be reproduced exactly. Watch a hands-on demonstration that reproduces an experiment using the same input data, code, and processing environment as a previous run. See how to tie these pieces together programmatically: creating commits for data snapshots, tagging them, and traversing the history of code and data together. Gain insight into working around the limitations of MLflow Projects in ensuring data reproducibility across the full machine learning process.
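As a rough illustration of pinning code, data, and environment together, the following minimal Python sketch records one identifier per component and stores the record under a tag. It is not taken from the talk and does not call Git, lakeFS, or MLflow; the names `RunRecord`, `RunHistory`, and `digest`, along with the placeholder commit IDs, are hypothetical and stand in for values a real pipeline would obtain from those tools.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def digest(payload: dict) -> str:
    """Return a SHA-256 hash of a JSON-serializable dict, independent of key order."""
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record pinning the three inputs of one experiment run."""
    code_commit: str  # assumed to be a Git commit SHA
    data_commit: str  # assumed to be a commit ID from a data versioning tool
    env_digest: str   # hash of the environment definition

    def run_id(self) -> str:
        # Short identifier derived from all three pinned components.
        return digest(asdict(self))[:12]


class RunHistory:
    """In-memory stand-in for a tag store mapping tag names to run records."""

    def __init__(self) -> None:
        self._tags = {}

    def tag(self, name: str, record: RunRecord) -> None:
        # Tags are treated as immutable so a name always resolves to the same run.
        if name in self._tags:
            raise ValueError(f"tag {name!r} already exists")
        self._tags[name] = record

    def resolve(self, name: str) -> RunRecord:
        return self._tags[name]


# Placeholder values; a real pipeline would read these from its tooling.
env = {"python": "3.11", "packages": {"numpy": "1.26.4"}}
record = RunRecord(
    code_commit="placeholder-code-sha",
    data_commit="placeholder-data-commit",
    env_digest=digest(env),
)

history = RunHistory()
history.tag("experiment-1", record)
restored = history.resolve("experiment-1")
```

Resolving the tag later returns the same three identifiers, which is the information needed to check out the matching code, data snapshot, and environment before rerunning the experiment.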

Syllabus

Building Reproducible ML Processes with an Open Source Stack - Einat Orr, Treeverse

Taught by

Linux Foundation
