Overview
This 28-minute conference talk from EuroPython 2018 shows how to build scalable, distributed machine learning data pipelines with Python and Airflow. Alejandro Saucedo draws on his career experience deploying machine learning systems in critical environments across various sectors. The talk covers Airflow's fundamentals and compares it with alternatives such as Luigi, Chronos, and Pinball. It also addresses the challenges of scaling machine learning systems, using a manager-worker-queue architecture with Celery for distributed processing. Practical examples, caveats, and workarounds for complex use cases show how to construct industry-ready machine learning pipelines for large-scale data processing. Topics include early crypto machine learning, linear regression pipelines, Celery integration, and Airflow implementation.
Syllabus
Intro
Early Crypto ML
ML Pipelines
Linear Regression Pipeline
Celery
Data Pipelines
Airflow
Taught by
EuroPython Conference