ML Data Version Control and Reproducibility at Scale

Linux Foundation via YouTube

Overview

Explore data version control and reproducibility techniques for large-scale machine learning in this 38-minute talk by Einat Orr of Treeverse. Learn how to overcome common challenges in ML data management, including reproducibility constraints and inefficient data transfer. Discover open-source tools for versioning data locally, along with best practices for working with data in the cloud without copying it. Gain insight into training models at scale with an open-source stack that includes LangChain, TensorFlow, PyTorch, and Keras. Take away practical methods for managing data while developing and iterating on ML models, with a focus on modern computer vision research.

Syllabus

ML Data Version Control and Reproducibility at Scale - Einat Orr, Treeverse

Taught by

Linux Foundation
