Overview
Learn about building reliable Data Lakehouses in this conference talk from Pulsar Summit SF 2022 that explores the integration of Apache Pulsar and Delta Lake. Discover how Delta Lake provides a transactional layer on top of data lakes, combining the ACID protection of databases with the flexibility and scalability of data lakes. Explore key features enabling Lakehouse Architecture and dive into the expanding ecosystem supporting multiple programming languages like Python, Rust, and Java. Understand how various data processing systems including Apache Pulsar, Apache Flink, Apache Hive, PrestoDB, TrinoDB, and Apache Spark™ integrate with Delta Lake. Follow along through topics covering introduction, agenda, summary statistics, connectors, Delta Connector implementation, Delta Lake fundamentals, Delta to Pulsar integration, and change data capture mechanisms.
Syllabus
Introduction
Agenda
Summary Statistics
Connectors
Delta Connector
Delta Lake
Delta to Pulsar
Change Data Capture, Change Data Feed
Taught by
StreamNative