Building a Cloud Data Lake with Databricks and AWS - Best Practices and Implementation
Databricks via YouTube
Overview
Syllabus
Intro
What is a data lake?
A data lake architecture enables data science
Data lakes and analytics from AWS
Amazon Simple Storage Service (S3): secure, highly scalable, durable object storage with millisecond latency for data access
Most ways to transfer data into the data lake: open and comprehensive
Most comprehensive and open
Cloud data lakes are great for data storage: a data lake is a file system that supports …
Organizations want to operationalize: to operationalize data lakes, you need the features you expect from a database, such as transactions
A new standard for building data lakes
Data reliability challenges with data lakes
Performance challenges with data lakes
Delta Lake: Adds Reliability & Performance
The Delta Lake
Integration with Glue
Integration with Redshift
Cloud native enterprise solution
Best practices for building a cloud data lake
Databricks & AWS data lake implementation
Taught by
Databricks