In this lab you will enable client-side encryption at rest, using an AWS KMS-managed key, for data stored in Amazon S3 through the EMR File System (EMRFS). In Amazon EMR you will create a security configuration that encrypts objects written to S3 with client-side encryption under the AWS KMS-managed key you specify, and decrypts objects with the same key that was used to encrypt them. This lets you use frameworks such as Apache Spark, Apache Tez, and Apache Hadoop MapReduce on Amazon EMR to run big data analytics, stream processing, machine learning, and ETL workloads on confidential data.
Level
Intermediate
Duration
1 hour
Course Objectives
In this course, you will learn how to:
- Create an Amazon S3 bucket
- Create a key using AWS KMS
- Create a security configuration in Amazon EMR to enable client-side encryption using an AWS KMS-managed key
- Launch an Amazon EMR (Elastic MapReduce) cluster using the AWS Management Console
- Read and write objects from and to S3 using the EMR File System (EMRFS)
- View EMR output data directly from Amazon S3
Intended Audience
This course is intended for:
- Developers
- Security Engineers
Prerequisites
We recommend that attendees of this course have the following prerequisites:
- Familiarity with the basics of Hadoop and the Hadoop Distributed File System (HDFS)
- Familiarity with basic Linux server administration
- Comfort using Linux command-line tools
Course Outline
- Task 1: Create an Amazon S3 bucket
- Task 2: Create an AWS KMS Key
- Task 3: Create a Security Configuration in Amazon EMR
- Task 4: Launch an Amazon EMR Cluster
- Task 5: Validate Client-Side Encryption
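As a preview of Task 3, the sketch below builds the kind of JSON document an Amazon EMR security configuration uses to turn on S3 client-side encryption with a KMS key (the `CSE-KMS` mode). It is a minimal illustration, not the lab's own procedure: the function name and the key ARN are hypothetical placeholders, and in-transit encryption is left disabled only to keep the example short.

```python
import json


def build_cse_kms_security_configuration(kms_key_arn: str) -> str:
    """Return an EMR security configuration (as JSON text) that enables
    S3 client-side encryption with the given KMS key (CSE-KMS).

    The function name is illustrative; it is not part of any AWS SDK.
    """
    config = {
        "EncryptionConfiguration": {
            # Assumption: this sketch covers at-rest encryption only.
            "EnableInTransitEncryption": False,
            "EnableAtRestEncryption": True,
            "AtRestEncryptionConfiguration": {
                "S3EncryptionConfiguration": {
                    # EMRFS encrypts objects on the cluster before upload
                    # and decrypts them after download.
                    "EncryptionMode": "CSE-KMS",
                    "AwsKmsKey": kms_key_arn,
                }
            },
        }
    }
    return json.dumps(config, indent=2)


# Placeholder ARN for illustration only; substitute the key from Task 2.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"

if __name__ == "__main__":
    print(build_cse_kms_security_configuration(EXAMPLE_KEY_ARN))
```

The resulting text could be pasted into the security configuration editor, or passed as the `SecurityConfiguration` argument of boto3's EMR `create_security_configuration` call if you prefer to script the step.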