Data Engineering Pipeline Management with Apache Airflow

via LinkedIn Learning

Overview

Explore ways to work with role-based access control, manage SLAs, schedule DAGs with datasets, work with Airflow Plugins, and scale Airflow.

Syllabus

Introduction
  • Features for data engineering pipeline management
1. Working with Role-Based Access Control
  • Prerequisites
  • Quick install overview
  • Creating an admin user and exploring roles
  • Creating users with different roles
  • Executing a simple branching DAG
  • Executing a simple SQL DAG
  • The public and viewer roles
  • The user role
  • The op role
  • Actions, resources, and permissions
  • Adding permissions to the public role
  • Creating and configuring a custom role
2. Managing SLAs
  • Configuring emails for SLA management
  • Configuring task-level SLAs
  • Triggering and viewing SLA misses
  • Configuring DAG-level SLAs
  • Configuring DAG failed action
3. Scheduling DAGs with Datasets
  • Dataset producer pipeline
  • Dataset consumer pipeline
  • Data-aware scheduling
  • Purchases producer pipeline and join pipeline
  • Data-aware scheduling with multiple datasets
4. Working with Airflow Plugins
  • Introducing plugins
  • Adding menu items using plugins
  • Exploring the CSV reader plugin
  • Implementing the CSV reader plugin
5. Scaling Airflow
  • Scaling Apache Airflow
  • Basic setup for the transformation pipeline
  • DAG for the transformation pipeline
  • Install RabbitMQ on macOS and Linux
  • Set up an admin user for RabbitMQ
  • Configuring the CeleryExecutor for Airflow
  • Executing tasks on a single Celery worker
  • Executing tasks on multiple Celery workers
  • Assigning tasks to queues
Conclusion
  • Summary and next steps

Taught by

Janani Ravi

Reviews

Rated 4.9 on LinkedIn Learning, based on 22 ratings.
