Exploring Google Ngrams with Amazon EMR and Hive

Overview

This lab demonstrates how to launch an Amazon Elastic MapReduce (EMR) cluster for big data processing and how to analyze data with Hive using SQL-style queries. You will create a Hadoop cluster with Amazon EMR, which lets you run interactive Hive queries against data stored in Amazon S3. You will use Hive to normalize the data into a more useful form, and then run queries to analyze it.

Level

Advanced

Duration

1 Hour 15 Minutes

Course Objectives

In this course, you will learn how to:

Create an Amazon EMR cluster running Hive
Use Hive statements to create tables from Google Ngram input data stored in Amazon S3
Run Hive queries to drill down into and analyze the data

Intended Audience

This course is intended for:

Architects
Data Engineers

Prerequisites

This course has no recommended prerequisites.

Course Outline

Task 1: Launch an Amazon EMR cluster
Task 2: Connect to your cluster
Task 3: Analyze data
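
As a rough illustration of what the analysis task involves, the sketch below mimics in plain Python the kind of normalization and aggregation that the lab performs with Hive. The sample rows, the column layout (ngram, year, occurrences), and the function names are assumptions made for illustration and are not taken from the lab; the lab itself runs Hive queries on the EMR cluster against data in Amazon S3, not Python.

```python
from collections import defaultdict

# Hypothetical sample rows: (ngram, year, occurrences). Values are made up
# and only stand in for the Google Ngram input data used in the lab.
SAMPLE_ROWS = [
    ("Data", 1991, 10),
    ("data", 1995, 30),
    ("data", 2001, 50),
    ("DATA", 2004, 5),
    ("hive", 2003, 2),
]


def normalize(rows):
    """Lowercase each ngram and bucket its year into a decade."""
    for ngram, year, occurrences in rows:
        yield ngram.lower(), (year // 10) * 10, occurrences


def occurrences_by_decade(rows):
    """Sum occurrences per (ngram, decade), similar to a GROUP BY query."""
    totals = defaultdict(int)
    for ngram, decade, occurrences in normalize(rows):
        totals[(ngram, decade)] += occurrences
    return dict(totals)


if __name__ == "__main__":
    # Print one line per (ngram, decade) pair in sorted order.
    for (ngram, decade), total in sorted(occurrences_by_decade(SAMPLE_ROWS).items()):
        print(ngram, decade, total)
```

Normalizing first (lowercasing, bucketing by decade) means case variants such as "Data" and "DATA" are counted together, which is the sort of cleanup the overview describes before the analysis queries are run.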
