
Amazon Web Services

Analyze Big Data with Hadoop

Amazon Web Services and Amazon via AWS Skill Builder

Overview

Languages Available: Español (Latinoamérica) | Español (España) | Français | Bahasa Indonesia | Italiano | 日本語 | 한국어 | Português (Brasil) | 中文(简体)

In this lab, you will deploy a fully functional Hadoop cluster that is ready to analyze log data within a few minutes. You will start by launching an Amazon EMR cluster and then use a HiveQL script to process sample log data stored in an Amazon S3 bucket. HiveQL is a SQL-like scripting language for data warehousing and analysis. You can then use a similar setup to analyze your own log files.


Level

Fundamental


Duration

1 hour


Course Objectives

In this course, you will learn how to:

  • Launch a fully functional Hadoop cluster using **Amazon EMR**
  • Define the schema and create a table for sample log data stored in Amazon S3
  • Analyze the data using a **HiveQL** script and write the results back to Amazon S3
  • Download and view the results on your computer
  • Connect to the Hive CLI and run a **HiveQL** query script to view the results
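
The objectives above (run a HiveQL script against data in Amazon S3 and write the results back to S3) can be sketched as an Amazon EMR step definition. This is a minimal illustration, not the lab's own material: the helper name `build_hive_step`, the bucket and script paths, and the `INPUT`/`OUTPUT` variable names are all hypothetical, and the argument layout assumes a cluster that runs Hive scripts through `command-runner.jar`.

```python
def build_hive_step(script_uri, input_uri, output_uri, name="Run Hive script"):
    """Return an EMR step definition (a plain dict) that runs a HiveQL script from S3.

    Hypothetical helper: it only builds the dictionary and makes no AWS calls.
    """
    # All three locations are expected to be S3 URIs.
    for label, uri in (("script", script_uri),
                       ("input", input_uri),
                       ("output", output_uri)):
        if not uri.startswith("s3://"):
            raise ValueError(f"{label} location must be an s3:// URI, got {uri!r}")

    return {
        "Name": name,
        # Assumption: keep the cluster running if the step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hive-script", "--run-hive-script", "--args",
                "-f", script_uri,
                # -d defines variables the script can read as ${INPUT} and ${OUTPUT}.
                "-d", f"INPUT={input_uri}",
                "-d", f"OUTPUT={output_uri}",
            ],
        },
    }


# Placeholder locations; substitute your own bucket and script.
step = build_hive_step(
    script_uri="s3://my-example-bucket/scripts/analyze_logs.q",
    input_uri="s3://my-example-bucket/input/",
    output_uri="s3://my-example-bucket/output/",
)
```

With boto3 installed and credentials configured, a dictionary like this could be submitted with `boto3.client("emr").add_job_flow_steps(JobFlowId=..., Steps=[step])`, where the job flow ID is that of your running cluster.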


Intended Audience

This course is intended for:

  • Data Engineers

Prerequisites

We recommend that attendees of this course have the following prerequisites:

  • IT Experience: Prior experience with Hadoop is recommended, but not required, to complete this lab
  • AWS Experience: Basic familiarity with Amazon S3 and Amazon EC2 key pairs is suggested, but not required, to complete this lab


Course Outline

  • Task 1: Create an Amazon S3 bucket
  • Task 2: Launch an Amazon EMR cluster
  • Task 3: Process your sample data by running a Hive script
  • Task 4: View the Results
  • Task 5: Connect to the Amazon EMR cluster CLI and run a query using HiveQL
  • Task 6: Terminate your Amazon EMR cluster
