Building a Robust Data Pipeline with the DAG Stack - dbt, Airflow, and Great Expectations

Open Data Science via YouTube

Overview

Explore the "DAG Stack", a robust data pipeline solution combining dbt, Airflow, and Great Expectations. Learn how to build a transformation layer with dbt, validate source data and add complex tests using Great Expectations, and orchestrate the entire pipeline with Apache Airflow. Discover practical examples of how these tools complement each other to ensure data quality, prevent "garbage in, garbage out" scenarios, and create comprehensive data documentation. Gain insights into automatic profiling, data testing, and validation techniques. Follow along with sample code demonstrations and technical pointers to implement this stack in your own data engineering projects.
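
The ordering described above (validate the source data, transform it, then test the result) can be sketched in plain Python. This is a minimal illustration using only the standard library, not code from the talk: the function names, the sample rows, and the checks are all hypothetical stand-ins for what Great Expectations, dbt, and Airflow would do in a real project.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical source rows; in a real stack these would sit in a warehouse table.
SOURCE_ROWS = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 35.5},
]


def validate_source(rows):
    """Stand-in for a Great Expectations validation of the source data."""
    failures = []
    for row in rows:
        if row.get("order_id") is None:
            failures.append("order_id must not be null")
        if row.get("amount") is None or row["amount"] < 0:
            failures.append("amount must be non-negative")
    return failures


def transform(rows):
    """Stand-in for a dbt model: aggregate the validated rows."""
    return {
        "order_count": len(rows),
        "total_amount": sum(row["amount"] for row in rows),
    }


def run_pipeline(rows):
    """Run the tasks in dependency order, stopping if validation fails."""
    # Each task maps to the set of tasks it depends on, as in an Airflow DAG.
    dag = {
        "validate_source": set(),
        "transform": {"validate_source"},
        "test_output": {"transform"},
    }
    state = {}
    executed = []
    for task in TopologicalSorter(dag).static_order():
        if task == "validate_source":
            failures = validate_source(rows)
            if failures:
                # Bad input never reaches the transformation layer.
                raise ValueError(f"source validation failed: {failures}")
        elif task == "transform":
            state["summary"] = transform(rows)
        elif task == "test_output":
            # Stand-in for a test on the transformed output.
            assert state["summary"]["order_count"] == len(rows)
        executed.append(task)
    return executed, state["summary"]


if __name__ == "__main__":
    print(run_pipeline(SOURCE_ROWS))
```

Raising on a failed validation before the transform step is what keeps bad input out of downstream models, which is the "garbage in, garbage out" scenario the talk aims to prevent.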

Syllabus

Intro
Who am I
Overview
dbt
sample code
dbt run
What dbt doesn't have
Apache Airflow
dbt in Airflow
Airflow DAG file
What is Great Expectations
What is Great Expectations Statement
Typical Great Expectations Workflow
Automatic Profiling
Data Docs
Great Expectations Operator
Recap
Test your data
Where do we start
Technical pointers
Data testing
Data validation
Putting it all together
Airflow DAG
Source data load validation
Running tests during development
Test integrity
Wrap up
Q&A

Taught by

Open Data Science
