In this course, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The lectures and labs reinforce these skills by exploring several common data lake architectures.
Course Objectives
In this course, you will learn how to:
- Apply established data lake methodologies to plan and design a data lake.
- Describe the components and services required for building a data lake on AWS.
- Explain how to secure a data lake on AWS using appropriate permissions.
- Compare the ways data can be ingested, stored, and transformed in a data lake on AWS.
- Analyze and visualize data stored in a data lake on AWS.
- Build and automate deployment of a data lake on AWS.
- Describe the role of a data lake within a modern data architecture.
Intended Audience
This course is intended for:
- Data platform engineers
- Solutions architects
- IT professionals
Prerequisites
We recommend that attendees of this course have:
- Completed the AWS Technical Essentials classroom course.
- One year of experience building data analytics pipelines, or completed the Data Analytics Fundamentals digital course.
Course Outline
- Course Welcome
- Module 1 – Introduction to Data Lakes
- Module 2 – Data Ingestion, Cataloging, and Preparation
- Module 3 – Building a Data Lake with AWS Lake Formation
- Module 4 – Data Processing and Analysis
- Module 5 – Additional Lake Formation Configurations
- Module 6 – Modern Data Architecture
- Course Summary and Resources