Overview
Explore experiment management for machine learning in this one-hour conference talk. Learn about the challenges data scientists face when designing and running experiments, including issues with reproducibility, documentation, and code dependencies. Discover best practices for documenting experiments to improve reproducibility, and learn about tools and startups addressing these challenges. Gain insights into the typical processes followed by ML practitioners and data scientists, using Python and scikit-learn as examples. Understand the importance of robust, reproducible code, modular design, automated testing, and proper documentation in machine learning workflows. Presented by Dr. Rutu Mulkar, founder of Hunchera and former contributor to IBM's Watson system, this talk offers valuable insights for improving experiment management in machine learning projects.
Syllabus
Introduction
About me
Agenda
Machine Learning Pipeline
Machine Learning as a Human
Record Experiments
Tracking Experiments
Reproducibility
Large File Storage
Issues with File Storage
Common Problems
Reproducible
Machine Learning vs Software Engineering
Machine Learning vs Feature Engineering
Write Robust Reproducible Code
Make Code Modular
Automated Testing
Python Tests
Virtual Environment
Documentation
What's next
Experiment Management
Incumbents
if I go
Best Practices
Taught by
Data Science Dojo