Hadoop Administration - Cloudera Hadoop on AWS

via YouTube

Overview

Learn to set up and administer Cloudera Hadoop on Amazon Web Services (AWS) in this comprehensive 18-hour course. Explore AWS fundamentals, including networking, security, storage, and EC2 pricing. Master the process of provisioning EC2 instances and configuring SSH forwarding and tunneling. Dive into Big Data concepts and Hadoop administration, focusing on CDH5 deployment on AWS. Gain hands-on experience setting up HDFS, including high availability configurations, and learn essential HDFS commands. Explore MapReduce v1 and v2 with YARN, understanding job lifecycles and fault tolerance. Cover important Hadoop certification topics, including cluster planning, logging configuration, and metrics monitoring. Set up and configure various Hadoop ecosystem tools such as Pig, Hive, and Sqoop using Cloudera Manager. Conclude with an overview of Hadoop schedulers, including FIFO, Fair, and Capacity schedulers.

Syllabus

Amazon Web Services - Sign up and Regions.
Amazon Web Services - Networking.
Amazon Web Services - Key Pair (authentication) and Security Groups (firewalls).
Amazon Web Services - Storage.
Amazon Web Services - EC2 Pricing (Very Important).
Amazon Web Services - Provision EC2 Instance Demo.
Amazon Web Services - Setup SSH forwarding (Bastion).
Amazon Web Services - Mac - Setup SSH Forwarding (Bastion).
Amazon Web Services - Setup SSH Tunneling and FoxyProxy.
Big Data Introduction.
Hadoop Introduction and brief comparison with Oracle.
Hadoop Administration - CDH5 on AWS - Introduction.
Setup CDH on AWS - Provision EC2 Instances.
Setup CDH on AWS - Setup parallel-ssh.
Setup CDH on AWS - Setup HTTP server on master01 or gateway node.
Setup CDH on AWS - Setup local yum repository server for Cloudera Manager and Hadoop.
Setup CDH5 on AWS - Setup pre-requisites using parallel-ssh.
Setup CDH5 on AWS - Install Cloudera Manager.
Setup CDH on AWS - Review Cloudera Management Service Components.
Setup CDH on AWS - Setup HDFS.
HDFS - Files and blocks - dfs.blocksize (Block Size).
HDFS - Replication Factor - Fault Tolerance.
HDFS - Metadata, Datanode, Namenode and Secondary Namenode.
HDFS - Heartbeat, Block report and Checksum.
HDFS - Namenode Recovery (role of editlogs, fsimage and secondary namenode).
Setup CDH on AWS - Setup HDFS High Availability Introduction.
Setup CDH on AWS - Setup HDFS High Availability using Cloudera Manager.
HDFS - High Availability - Setup using Cloudera Manager.
HDFS - High Availability - Review components and parameter files.
HDFS Commands - hadoop fs command overview, help and appendToFile.
HDFS Commands - cat, checksum, chgrp, chmod, chown.
HDFS Commands - copyFromLocal or put, copyToLocal or get and cp.
HDFS Commands - count, df, du and expunge.
HDFS Commands - find, getmerge and ls.
Hadoop Certification - HDPCA - Recover a snapshot.
Hadoop Certification - HDPCA - Create a snapshot of an HDFS directory.
Hadoop Certification - HDPCA - Configure ACLs.
HDFS Commands - mkdir, moveFromLocal, moveToLocal, mv.
HDFS Commands - stat, tail, test, text, touchz and usage.
Setup Map Reduce v1 (MRv1) or Classic using Cloudera Distribution.
Setup Map Reduce v1 (MRv1) or Classic using Cloudera Distribution - Review.
Setup Map Reduce v1 (MRv1) or Classic using Cloudera Distribution - JobLifeCycle.
Setup Map Reduce v1 (MRv1) - Heartbeat, Fault Tolerance and Speculative Execution.
Setup Map Reduce v1 (MRv1) - Challenges.
Setup CDH on AWS - Configure MRv2 + YARN.
Setup CDH on AWS - Configure MRv2 + YARN - Review.
Setup CDH on AWS - Configure MRv2 + YARN - Validate.
Setup CDH on AWS - Configure MRv2 + YARN - Map Reduce Job Life Cycle.
Setup CDH on AWS - Configure MRv2 + YARN - Fault Tolerance.
Hadoop Certification - CCAH - Principal points to consider when choosing hardware and OS for a Hadoop cluster.
Hadoop Certification - CCAH - Cluster planning.
Hadoop Certification - CCAH - Logging Configuration (log4j.properties).
Hadoop Certification - CCAH - Setup Hadoop ecosystem tools - Introduction.
Hadoop Certification - CCAH - Hadoop metrics and cluster health monitoring.
Setup CDH on AWS - Setup MySQL for Hive, Oozie, Sqoop, etc.
Setup CDH - Setup Pig using Cloudera Manager.
Setup CDH on AWS - Setup Hive using Cloudera Manager.
Setup CDH on AWS - Setup Hive using Cloudera Manager - Review.
Setup CDH on AWS - Setup Hive using Cloudera Manager - Validate.
Setup CDH on AWS - Setup Sqoop using Cloudera Manager.
Setup CDH - Schedulers Overview.
Setup CDH - FIFO Scheduler.
Setup CDH - Fair Scheduler.
Setup CDH - Capacity Scheduler.

Taught by

itversity
