Overview
Discover how to create scalable, reproducible, and deployable data science code in this EuroPython Conference talk. Learn to bridge the gap between rapid iteration and engineering reliability by leveraging three powerful open-source tools: Kedro, Apache Airflow, and Great Expectations. Explore techniques for developing modular, maintainable data science code with Kedro, orchestrating workflows with Apache Airflow and Astronomer, and ensuring data quality with Great Expectations. Gain insights into overcoming the challenges of Jupyter notebooks, implementing declarative data management, and managing parameters and configuration. Delve into MLOps, the AI lifecycle for IT production, examine different strategies for taking advantage of Kedro's extensibility, and understand the benefits of developing with Kedro while orchestrating with Airflow. By the end of this 30-minute talk, you will know how to build consensus among data scientists, data engineers, and machine learning engineers, enabling efficient collaboration on data science projects.
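To make the Kedro portion concrete, here is a minimal sketch of the style the talk advocates: small pure functions wired into a pipeline, with data locations and parameters declared in configuration rather than hard-coded. The dataset names (companies, model_input, model) and the model_options parameter group are illustrative assumptions, not taken from the talk, and exact APIs vary across Kedro versions.

```python
import pandas as pd
from kedro.pipeline import node, pipeline


def preprocess_companies(companies: pd.DataFrame) -> pd.DataFrame:
    # A pure function: trivially unit-testable outside any notebook.
    companies["is_active"] = companies["status"] == "active"
    return companies


def train_model(model_input: pd.DataFrame, model_options: dict) -> dict:
    # "params:model_options" below resolves to conf/base/parameters.yml,
    # so tuning values live in config files, not in code.
    return {"features": model_options["features"], "rows": len(model_input)}


# Inputs and outputs are names, not paths: Kedro resolves them against
# conf/base/catalog.yml (declarative data management), so swapping a CSV
# for a database table requires no code change.
data_science = pipeline(
    [
        node(preprocess_companies, inputs="companies", outputs="model_input"),
        node(
            train_model,
            inputs=["model_input", "params:model_options"],
            outputs="model",
        ),
    ]
)
```

A second sketch after the syllabus shows how a pipeline like this can be handed to Airflow and guarded with Great Expectations.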
Syllabus
Intro
The Scenario
Different perspectives
The challenges of Jupyter notebooks
Challenges with Data Management in Jupyter Notebook
Declarative Data Management in Kedro
Parameters & Configuration Management
Tradeoffs
Challenges with managing code in Jupyter Notebook
Development experience in Jupyter Notebook
Development experience in Kedro
MLOps: The AI Lifecycle for IT Production
Taking advantage of Kedro's extensibility
Different strategies
Why develop with Kedro and orchestrate with Airflow?
Beyond a single project
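To illustrate the "develop with Kedro, orchestrate with Airflow" idea and the Great Expectations checks covered in the talk, here is a hedged sketch of an Airflow DAG that validates raw data and then runs the pipeline from the earlier sketch. The project path, data file, and column names are hypothetical; Great Expectations' classic pandas API is shown (newer releases use a DataContext-based workflow), and in practice the kedro-airflow plugin (`kedro airflow create`) can generate a DAG like this automatically.

```python
from datetime import datetime
from pathlib import Path

import great_expectations as ge
import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

PROJECT_PATH = Path("/opt/airflow/my_kedro_project")  # hypothetical location


def validate_raw_data() -> None:
    # Fail the DAG early if the raw extract violates basic expectations.
    df = ge.from_pandas(pd.read_csv(PROJECT_PATH / "data/01_raw/companies.csv"))
    checks = [
        df.expect_column_values_to_not_be_null("id"),
        df.expect_column_values_to_be_between("employees", min_value=0, max_value=1_000_000),
    ]
    failed = [check for check in checks if not check.success]
    if failed:
        raise ValueError(f"{len(failed)} data quality check(s) failed")


def run_kedro_pipeline() -> None:
    # Reuse the pipeline exactly as developed locally; Airflow only schedules it.
    bootstrap_project(PROJECT_PATH)
    with KedroSession.create(project_path=PROJECT_PATH) as session:
        session.run(pipeline_name="data_science")


with DAG(
    dag_id="kedro_data_science",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,  # trigger manually
    catchup=False,
) as dag:
    validate = PythonOperator(task_id="validate_raw_data", python_callable=validate_raw_data)
    train = PythonOperator(task_id="run_kedro_pipeline", python_callable=run_kedro_pipeline)
    validate >> train
```

Keeping validation as a separate upstream task surfaces data issues in the Airflow UI before any compute is spent on training, which is one way to build the consensus between data scientists and engineers that the talk describes.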
Taught by
EuroPython Conference