Patterns and Operational Insights for Large-Scale Delta Lake Workloads

Databricks via YouTube

Overview

This 42-minute conference talk presents patterns and operational insights from early adopters of Delta Lake who run demanding workloads over large volumes of log and telemetry data for cyber threat detection and response. It covers streaming ETL, data enrichment, analytic workloads, and large materialized aggregates that provide fast answers. The talk examines Z-ordering, including schema design considerations and the 32-column default limit, and the implications of date partitioning in the presence of long-tail distributions and unsynchronized clocks. It also discusses optimization strategies, including when to use auto-optimize, upsert patterns that simplify important jobs, and how to tune Delta Lake for very large tables and low-latency access. Drawing on real-world experience operating large-scale workloads on Databricks and Delta Lake, the talk addresses the Parse Framework, merge operations, stateful processing, scaling, schema ordering, partitioning, and conflicting transactions.

Syllabus

Introduction
Parse Framework
Merge
Stateful Processing
Merged Tables
Scaling
Schema Ordering
Partitioning
Conflicting Transactions
Metadata

Taught by

Databricks
