Your colleague is out on vacation, so you're in charge of your organization's data engineering practice for the day. Step into their shoes and explore various managed options for analytics and data movement on AWS. Consider architecture patterns, performance and cost optimizations, and security best practices—and impress your colleague when they get back to the office!
Level
Intermediate
Duration
2 Hours 0 Minutes
Course Objectives
In this course, you will learn how to:
- Create an AWS Glue crawler.
- Create and run a job in AWS Glue Studio.
- Explore permissions required to run AWS Glue crawlers and AWS Glue Studio jobs.
- Query the AWS Glue Data Catalog using Amazon Athena.
Intended Audience
This course is intended for:
Data engineers responsible for tasks such as deploying sophisticated analytics programs, machine learning, and statistical methods, and preparing data for predictive and prescriptive modeling.
Prerequisites
We recommend that attendees of this course have the following prerequisites:
- Access to a computer with Wi-Fi and Microsoft Windows, macOS, or Linux (Ubuntu, SuSE, or Red Hat)
- A modern internet browser such as Google Chrome or Mozilla Firefox
Course Outline
Task 1: Create and run an AWS Glue crawler
Task 2: Review the IAM policies
Task 3: View the table in the Data Catalog
Task 4: Run a job in AWS Glue Studio to transform the data
Task 5: Query the data_parquet table in Amazon Athena
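For readers who want a preview of what Tasks 1 and 5 might look like outside the console, the following is a minimal Python sketch using boto3. It is not taken from the course materials: the database name, crawler name, role ARN, and S3 paths are hypothetical placeholders, and only the data_parquet table name comes from the outline above. The two helper functions only build request parameters, so they can be inspected without an AWS account.

```python
# Hypothetical placeholders; the course description does not specify any of these.
DATABASE = "example_db"
CRAWLER = "example-crawler"
ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleGlueRole"
SOURCE_PATH = "s3://example-bucket/raw/"
RESULTS_PATH = "s3://example-bucket/athena-results/"


def crawler_request():
    """Build keyword arguments for glue.create_crawler (Task 1)."""
    return {
        "Name": CRAWLER,
        "Role": ROLE_ARN,  # the role whose IAM policies Task 2 reviews
        "DatabaseName": DATABASE,
        "Targets": {"S3Targets": [{"Path": SOURCE_PATH}]},
    }


def athena_request(table="data_parquet", limit=10):
    """Build keyword arguments for athena.start_query_execution (Task 5)."""
    return {
        "QueryString": f'SELECT * FROM "{DATABASE}"."{table}" LIMIT {limit}',
        "QueryExecutionContext": {"Database": DATABASE},
        "ResultConfiguration": {"OutputLocation": RESULTS_PATH},
    }


def start_crawl():
    """Create and start the crawler; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    glue = boto3.client("glue")
    glue.create_crawler(**crawler_request())
    glue.start_crawler(Name=CRAWLER)  # asynchronous: returns before the crawl ends


def start_query():
    """Submit the Athena query; run only after the crawler and job have finished."""
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**athena_request())
    return response["QueryExecutionId"]
```

Task 4, the AWS Glue Studio job that produces the data_parquet table, is left to the console in this sketch. Both start_crawl and start_query return before the underlying work completes, so a real script would need to poll for completion between steps.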