Data Mesh in Practice - From Data Lake to Distributed Architecture at Zalando

Databricks via YouTube

Overview

Explore how Zalando, Europe's leading online fashion platform, transitioned from a centralized Data Lake to a distributed Data Mesh architecture in this 30-minute talk. Learn about the challenges of the Data Lake paradigm, including unclear responsibilities, lack of data ownership, and poor data availability. Discover how Zalando addressed these issues with a decentralized, domain-focused approach that empowers data owners and promotes the concept of Data Products. Gain insights into the journey of building a Data Mesh architecture backed by Spark and Delta Lake, and understand ongoing efforts to simplify data product creation. Examine topics such as domain-driven distributed architecture, self-service data infrastructure, and the "Bring Your Own Bucket" concept. Learn strategies for ensuring data quality through consumer-producer contracts, and see how central services provide global interoperability, in this presentation from Databricks.

Syllabus

Intro
Legacy Analytics
Legacy Evolving
Zalando's Data Lake
Centralization Challenges
A Recurring Pattern
What is Data Mesh?
Domain-Driven Distributed Architecture... applied to Data
backed by domain-agnostic self-service data infrastructure
It's a mindset shift
Bring Your Own Bucket (BYOB)
Central Processing Platform
Simplify Data Sharing
Central Services with Global Interoperability
How to Ensure Data Quality?
Data Quality - A Contract between Consumer and Producer

Taught by

Databricks
