Overview
Learn about the emerging open data lakehouse architecture and its key enabling technologies in this 26-minute conference talk from OSACon. Discover how this alternative to the traditional data warehouse aims to improve performance, lower cost, and simplify data pipelines. Explore the core components of the architecture: Apache Arrow for in-memory processing, Apache Iceberg as the table format, and Project Nessie for version control of data. See how these open-source technologies work together to form a modern data architecture that combines the best aspects of data lakes and data warehouses while addressing common frustrations with complexity and cost.
Syllabus
Open Source and the Data Lakehouse
Taught by
OSACon