Learn the essentials of Apache Hadoop, a key framework for storing and processing big data at scale.
Overview
Syllabus
Introduction
- What and why Hadoop?
- What you should know
- Use cloud services
- What is Hadoop?
- Review Hadoop distributions and cloud services
- Set up GCP Dataproc Metastore and VM cluster
- Verify GCP Dataproc VM cluster
- Understand Hadoop components
- Understand Java virtual machines (JVMs)
- Explore Hadoop file systems: HDFS
- Explore Hadoop file systems: AWS S3
- Review Hadoop cluster components
- Review test jobs
- Review job output
- Verify Hadoop web interfaces in your test environment
- Verify Hadoop Spark web interfaces in your test environment
- Use the Jupyter interface for Hadoop
- What is MapReduce?
- What is MapReduce word count?
- Review MapReduce word count job
- Prepare for MapReduce Java coding
- Review MapReduce WordCount job code
- Tune by physical methods
- Tune a Mapper
- Understand data types
- Tune a Reducer
- Use MR 2.0 and 3.0
- Review MR optimization examples
- Migrate to Cloud Hadoop
- Scale VM-based clusters
- Use autoscale policies
- Scale Kubernetes Spark clusters
- Understand Hive and HBase
- Create and query tables with Hive
- Understand Pig
- Run WordCount using Pig
- Review Spark architecture
- Scale a Spark job to calculate Pi
- Learn more about using Hadoop
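The WordCount lessons above center on the classic MapReduce example. As a conceptual sketch only, here is the map → shuffle → reduce flow in plain Python standing in for Hadoop's Java API (the function names are illustrative, not part of any Hadoop library):

```python
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle/sort: group values by key, as the framework does between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reducer: sum the counts emitted for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["the quick brown fox", "the lazy dog", "the quick dog"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["the"])    # 3
print(counts["quick"])  # 2
```

In a real Hadoop job, mappers and reducers run as separate JVM tasks across the cluster and the shuffle moves data over the network; the logic per record, however, is exactly this simple.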
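The "Scale a Spark job to calculate Pi" lesson uses the standard Monte Carlo estimate: sample random points in the unit square and count how many land inside the quarter circle. A minimal single-machine sketch of that computation (no Spark involved; `estimate_pi` is an illustrative helper, not a Spark API):

```python
import random

def estimate_pi(num_samples, seed=42):
    # Monte Carlo: the fraction of points with x^2 + y^2 <= 1
    # approximates the quarter-circle area, Pi/4.
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples

pi_estimate = estimate_pi(1_000_000)
print(pi_estimate)  # close to 3.14159
```

Spark scales this by parallelizing the sampling loop across executors and summing the per-partition counts, which is why the job distributes so cleanly.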
Taught by
Lynn Langit