

Git-like Repository for Data Lake Management and Quality Control

Presto Foundation via YouTube

Overview

This 24-minute conference talk from Presto Foundation covers data lake management and version control. It shows how to apply Git-like operations to files in object storage to make data lake management more efficient and reliable. The talk addresses common challenges in data-intensive applications, including data quality assurance, experimentation, and recovery from corrupted data. It introduces chaos engineering principles for distributed data systems and explains how modern tools can make data workloads more resilient. It also traces the shift from basic data processing challenges to current manageability issues, and describes how technologies like Kafka, Spark, Presto, and Snowflake have transformed big data operations. Practical demonstrations of open-source tooling show how to develop data-intensive applications faster while maintaining high data quality.

Syllabus

A Git-like Repository for your Data Lake - Vinodhini Sivakami Duraisamy, Treeverse

Taught by

Presto Foundation
