Overview
Explore lakehouse table formats in this 36-minute conference talk presented by Dipankar Mazumdar and Kyle Weller from Onehouse. The talk examines the challenge of choosing among the leading open source projects, Apache Hudi, Delta Lake, and Apache Iceberg, each of which adds transaction and metadata layer primitives on top of decoupled storage and offers its own distinct features. Learn about XTable, an open-source project that provides omnidirectional interoperability between table formats without introducing a new format. See how XTable's metadata translation abstractions let you write data in any one format and convert it to targets that different compute engines can consume, easing format selection and interoperability for lakehouse workloads. Learn how these formats store data in open columnar file formats such as Parquet, alongside metadata for schema, commit history, partitions, and column stats. After the talk, explore additional resources on data lakehouse concepts and fundamentals to deepen your understanding of this evolving field.
Syllabus
Apache XTable (incubating): Interoperability Among Lakehouse Table Formats
Taught by
Databricks