
Linux Foundation

Lessons Learned from the Migration to Apache Airflow

Linux Foundation via YouTube

Overview

Discover key insights from migrating machine learning and big data processing pipelines to Apache Airflow in this 38-minute conference talk. Explore how Skimlinks leverages Airflow to power their big data infrastructure, analyzing hundreds of terabytes of data. Learn about building ETL pipelines and managing machine learning Spark pipeline workflows using Airflow. Gain understanding of basic Airflow concepts and see real-life examples of defining workflows in Python code. Delve into advanced topics such as custom task operators, sensors, and plugins. Examine best practices, pros and cons of the tool, and implementation in Docker and Kubernetes environments. Understand the concept of Directed Acyclic Graphs (DAGs) and their importance in creating idempotent workflows.
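As a rough illustration of the two ideas named above, the sketch below models a workflow as a DAG and shows an idempotent load step. It uses only the Python standard library (`graphlib`, Python 3.9+) rather than Airflow itself, and the task names, the `run_order` and `load_partition` helpers, and the sample date are hypothetical, not taken from the talk.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train_model": {"transform"},
    "load": {"transform"},
}


def run_order(deps):
    """Return one valid execution order for the DAG.

    graphlib raises CycleError if the dependencies contain a cycle,
    which is why the graph must be acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


def load_partition(store, run_date, rows):
    """Idempotent write: overwrite the partition for run_date instead of appending."""
    store[run_date] = list(rows)
    return store


order = run_order(dependencies)

store = {}
load_partition(store, "2024-01-01", [1, 2, 3])
load_partition(store, "2024-01-01", [1, 2, 3])  # a rerun leaves the same state
```

Because the load step overwrites its partition, rerunning a task for the same date does not duplicate data, which is the property that makes retries and backfills safe.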

Syllabus

Intro
Lessons learned from the migration to Apache Airflow
Agenda
Skimlinks: What we do
Why Airflow?
Data Architecture Overview
Airflow and Spark
DAG: Directed Acyclic Graph
Operator
Advanced Features
Sample code
Idempotent DAGs
Best practices: Docker and Kubernetes environments
Airflow: The Good, the Bad and the Ugly

Taught by

Linux Foundation
