Overview
Explore how Zalando, Europe's leading online fashion platform, transitioned from a centralized Data Lake to a distributed Data Mesh architecture in this 30-minute conference talk. Learn about the challenges of the Data Lake paradigm, including unclear responsibilities, lack of data ownership, and poor data availability. Discover how Zalando addressed these issues by implementing a decentralized approach that empowers data owners with domain knowledge. Delve into the concept of Data Products, which go beyond simple file sharing to ensure quality and acknowledge ownership. Follow Zalando's journey as they built their Data Mesh using Spark and Delta Lake, and gain insights into their ongoing efforts to simplify data product creation through templating. Understand the benefits of this architectural shift in terms of scalability, accessibility, and data democratization for large-scale organizations.
Syllabus
Intro
Introductions
Arif's background
Agenda
Data Sources
Data Pipeline
The bottleneck
The pain points
What is Data Mesh
Product Thinking
Data Product Manager
Bring Your Own Bucket
Central Provision of Infrastructure
Decentral Ownership
Data Mesh Adoption
What's Next
Closing
Taught by
Databricks