Efficient, Low Latency Ingestion to Large Tables via Apache Flink and Apache Iceberg

Overview

Explore the challenges and solutions for efficient, low-latency data ingestion to large tables using Apache Flink and Apache Iceberg in this 24-minute conference talk. Learn about the tradeoffs between data availability latency and optimization for efficient reading, and discover how the integration of these two Apache projects addresses these challenges. Examine the ongoing projects aimed at balancing frequent commits with optimal file management, including balanced writes and periodic compaction. Gain insights into the development process, coordination between Apache communities, and implementation details. Compare this approach with alternative solutions like Apache Hudi and Apache Paimon, understanding their pros and cons. Witness a brief demo showcasing the possibilities of this integration, presented by Marton Balassi, a Flink PMC member and Engineering Manager at Apple, and Peter Vary, an Apache Iceberg committer and Staff Engineer at Apple.

Syllabus

Efficient, Low Latency Ingestion to Large Tables via Apache Flink and Apache Iceberg

Taught by

The ASF

Related Courses

Challenges and Solutions for Building Real-time Data Warehousing with Apache Flink, Apache Hive and Apache Iceberg

Apache Flink Meets Apache Mesos and DC/OS

Introducing Apache Pinot: Real-time Analytics for Large-Scale Data
