Cross-Platform Data Lineage with OpenLineage - Tracing Data Across Apache Spark and Airflow

DataGalaxy via YouTube

Overview

Learn about cross-platform data lineage tracking in this technical conference talk from the DataGalaxy Tech Summit 2023. Explore how OpenLineage provides a standardized approach to lineage collection across multiple platforms, including Apache Airflow, Apache Spark, Apache Flink, and dbt. Discover how data lineage maps the relationships between datasets across distributed organizational environments, enabling teams to identify and resolve data quality and efficiency issues in real time. A live demonstration shows how to implement lineage tracking between Apache Spark and Apache Airflow, and the talk also covers the OpenLineage architecture and its practical applications in modern data environments. The talk is aimed at data engineers and architects who want to better understand and manage complex data relationships across their technology stack.

Syllabus

Cross-Platform Data Lineage with OpenLineage | DataGalaxy Tech Summit 2023

Taught by

DataGalaxy
