Accelerating Iceberg Queries for CDC Using MoR and Equality Deletes

Presto Foundation via YouTube

Overview

Learn how Apache Iceberg manages deleted rows and how that affects Change Data Capture (CDC) performance in this lightning talk. Explore the challenges of ingesting and maintaining CDC streams from transactional databases in an Iceberg lakehouse, with a focus on how query performance degrades as the frequency and volume of changes increase. Discover the differences between position delete files and equality delete files, and understand how recent Presto enhancements optimize Merge on Read (MoR) by applying equality deletes through join operations, improving query performance by up to 400x. Gain insights into the trade-offs between Copy on Write (CoW) and MoR, file size considerations, and table refresh timing strategies.
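As a rough illustration of the two delete styles mentioned above, the following minimal Python sketch filters the rows of one data file first by position and then by equality on a key column. It is not Presto's or Iceberg's implementation: the function names, the in-memory row dictionaries, and the `id` key column are all hypothetical, and the sketch ignores details such as sequence numbers that decide which delete files apply to which data files. It only shows why an equality delete can be treated as an anti-join between data rows and delete rows.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]  # hypothetical in-memory stand-in for a table row


def apply_position_deletes(rows: List[Row], deleted_positions: Iterable[int]) -> List[Row]:
    """Drop rows whose ordinal position within the data file is marked deleted."""
    deleted = set(deleted_positions)
    return [row for pos, row in enumerate(rows) if pos not in deleted]


def apply_equality_deletes(rows: List[Row], delete_rows: List[Row],
                           key_columns: List[str]) -> List[Row]:
    """Anti-join: drop rows whose key column values match any delete row."""
    # Build a hash set of deleted keys once, then probe it for every data row.
    delete_keys = {tuple(d[c] for c in key_columns) for d in delete_rows}
    return [r for r in rows if tuple(r[c] for c in key_columns) not in delete_keys]


# Hypothetical data file contents and delete records.
data_file = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "paid"},
    {"id": 3, "status": "shipped"},
]

after_position = apply_position_deletes(data_file, deleted_positions=[1])
after_equality = apply_equality_deletes(data_file, [{"id": 2}, {"id": 3}], ["id"])
```

In this sketch a position delete needs to know exactly where a row sits in a file, while an equality delete only needs the key value, which is the information a CDC stream naturally carries.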

Syllabus

How we accelerated our Iceberg queries for CDC with MoR and Equality Deletes

Taught by

Presto Foundation
