Welcome to the HadoopExam CDH (Cloudera) Admin Beginner Course-1 training course.

Syllabus of CDH Admin Beginner Course-1 Training : In total, approximately 25-30 sessions will be created. Some of the sessions are in progress and will be available soon.

Module 1 :  Introduction to BigData, Hadoop (HDFS and MapReduce) : Available (Length 35 Minutes)

1. BigData Introduction

2. Hadoop Introduction

3. HDFS Introduction

4. MapReduce Introduction

Module 2 :  Deep Dive in HDFS : Available (Length 48 Minutes) 

1. HDFS Design

2. Fundamentals of HDFS (Blocks, NameNode, DataNode, Secondary NameNode)

3. Rack Awareness

4. Read/Write from HDFS

5. HDFS Federation  and High Availability (Hadoop 2.x.x)

6. Parallel Copying using DistCp

7. HDFS Command Line Interface
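
As a rough illustration of the block concept covered in this module, the sketch below estimates how many HDFS blocks a file occupies and how much raw disk it consumes once replicated. The 128 MB block size and replication factor of 3 are assumed defaults for Hadoop 2.x, not values stated in this syllabus, and the function names are hypothetical.

```python
import math

# Assumed defaults (not taken from the course material):
BLOCK_SIZE_MB = 128  # typical dfs.blocksize in Hadoop 2.x
REPLICATION = 3      # typical dfs.replication


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of HDFS blocks needed for a file of the given size."""
    return math.ceil(file_size_mb / block_size_mb)


def raw_storage_mb(file_size_mb, replication=REPLICATION):
    """Raw disk used across the cluster once every block is replicated."""
    return file_size_mb * replication


if __name__ == "__main__":
    size = 1000  # a hypothetical 1000 MB file
    print(block_count(size), "blocks,", raw_storage_mb(size), "MB raw")
```

The last block of a file only occupies as much disk as the data it holds, so the raw storage figure depends on file size and replication, not on the block count.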

Module 4 : Cloudera QuickStart VM Step By Step Installation  (Length  19  Mins) Available + Steps in Doc + Hands On Lab

1. It Includes Hadoop 2.0

2. YARN

3. Hive

4. Pig

5. Hue

6. Apache Spark

7. Workflow

Module 5 : Load data in HDFS using the HDFS commands (Length 35  Mins) Available + Steps in Doc + Hands On Lab 

Module 7 : Installing VMWare Workstation  (Length 9  Mins) Available + Steps in Doc + Hands On Lab 

Module 8 : Preparing Linux Instance for Hadoop Node(DataNode or NameNode)  (Length 19  Mins) Available + Steps in Doc + Hands On Lab 

Module 9 : Network Configuration on Linux Image  (Length 19  Mins) Available + Steps in Doc + Hands On Lab 

1. Assign IP address to Node

2. Assign HostName

Module 10 : Setup Cygwin on Windows  (Length 27  Mins) Available + Steps in Doc + Hands On Lab 

Module 11 : Create and Setup Linux Instance for MultiNode cluster  (Length 16  Mins) Available + Steps in Doc + Hands On Lab 

1. Disable SELinux

2. Disable Firewalls

Module 12 : Create 4 Nodes for the Cluster  (Length 24  Mins) Available + Steps in Doc + Hands On Lab 

Module 13 : Setup Local Repository for Cloudera  Manager  (Length 17  Mins) Available + Steps in Doc + Hands On Lab 

1. Create Apache Web Server Instance.

2. Create repo web directory on given node

3. Download Cloudera Manager-5 software

4. Setting up the .repo file
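
A minimal sketch of what step 4 might produce: a yum `.repo` file that points at the local web server. The repository id, the host name `repo-node.example.com`, the `/cm5/` path and the choice of `gpgcheck=0` are all assumptions for illustration; the course's own values may differ.

```python
import configparser


def render_repo_file(repo_id, name, baseurl):
    """Return the text of a yum .repo file with a single repository section."""
    lines = [
        f"[{repo_id}]",
        f"name={name}",
        f"baseurl={baseurl}",
        "enabled=1",
        "gpgcheck=0",  # assumption: the local mirror is used without GPG checks
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values: adjust to the node that hosts the Apache web server.
repo_text = render_repo_file(
    "cloudera-manager",
    "Cloudera Manager (local mirror)",
    "http://repo-node.example.com/cm5/",
)

# Sanity check: the generated text parses as an INI-style file.
parsed = configparser.ConfigParser()
parsed.read_string(repo_text)

if __name__ == "__main__":
    print(repo_text)
```

On RHEL/CentOS-style systems such a file is normally placed under `/etc/yum.repos.d/` on every node that should install from the local repository.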

Module 14 : Install Cloudera Manager from Local Repository  (Length 12  Mins) Available + Steps in Doc + Hands On Lab 

1. Download Latest Cloudera Manager installer

2. Install Cloudera Manager-5 from local repository

3. Verify the installation of CM-5 and provide web UI URL

Module 15 : Setup Cloudera Parcels in Local Repository  (Length 8  Mins) Available + Steps in Doc + Hands On Lab 

1. Setup Parcels for CDH5 in local repository

2. Setup Impala Parcels in local repository

3. Setup Solr Search Parcels in local repository

4. Similarly, set up Kudu, Accumulo and Kafka parcels

Module 16 : Install Cloudera Manager Agent on each node  (Length 29  Mins) Available + Steps in Doc + Hands On Lab 

1. Install Java/JDK on each node

2. Install Cloudera Agent on each node.

Module 17 : Add Cloudera Management Service  (Length 11  Mins) Available + Steps in Doc + Hands On Lab 

Module 18 : Setup 4 Node CDH cluster  (Length 21  Mins) Available + Steps in Doc + Hands On Lab 

1. The cluster should be created with only HDFS installed. No other service should be installed.

2. It should use the parcels that we set up in the local repository.

3. Don't activate HttpFS roles

4. Role assignment should be as below

Module 19 : Cloudera Manager Introduction  (Length 21  Mins) Available + Steps in Doc + Hands On Lab 

1. Features of Cloudera Manager

2. Terminology in Cloudera Manager

Module 20 : Introduction to Apache Zookeeper  (Length 21  Mins) Available + Steps in Doc + Hands On Lab

Module 21A : Install Zookeeper Service on Cluster  (Length 8  Mins) Available + Steps in Doc + Hands On Lab

Module 21B : Enable NameNode High Availability  (Length 13  Mins) Available + Steps in Doc + Hands On Lab

Module 21C : Test NameNode High Availability  (Length 4  Mins) Available + Steps in Doc + Hands On Lab

Module 22 : Setup User Space for root user  (Length 16  Mins) Available + Steps in Doc + Hands On Lab

1. Create new Linux user named "Admin"

2. Setup HDFS user space for "Admin" user
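
The two steps above could be sketched as the command sequence below. This is an assumption about how the lab proceeds, not the course's actual script: it assumes the HDFS superuser is the Linux account `hdfs`, that the home directory lives under `/user`, and that the user's group has the same name as the user. The helper only builds the commands and does not run them.

```python
def user_space_commands(user):
    """Build (but do not run) the commands that create a Linux user
    and a matching HDFS home directory."""
    home = f"/user/{user}"
    return [
        # Step 1: create the Linux account.
        ["useradd", user],
        # Step 2: create the HDFS home directory as the HDFS superuser
        # (assumed to be the 'hdfs' account).
        ["sudo", "-u", "hdfs", "hdfs", "dfs", "-mkdir", "-p", home],
        # Hand ownership of the directory to the new user.
        ["sudo", "-u", "hdfs", "hdfs", "dfs", "-chown", f"{user}:{user}", home],
    ]


if __name__ == "__main__":
    for cmd in user_space_commands("Admin"):
        print(" ".join(cmd))
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` on a cluster node once the values have been checked.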

Module 23A : Understand HDFS Snapshot Concepts  (Length 15  Mins) Available + Steps in Doc + Hands On Lab

Module 23B : Recover the deleted Directory using HDFS snapshot (Length 15.2  Mins) Available + Steps in Doc + Hands On Lab
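
For orientation, recovering a deleted directory from a snapshot usually means copying it back out of the read-only `.snapshot` path. The sketch below builds such a command sequence; the directory `/data`, the snapshot name `snap1` and the sub-directory `reports` are hypothetical, and the exact steps in the lab may differ.

```python
def snapshot_path(snapshottable_dir, snapshot_name, relative_path):
    """Path at which a snapshotted copy of relative_path is exposed."""
    base = snapshottable_dir.rstrip("/")
    return f"{base}/.snapshot/{snapshot_name}/{relative_path}"


def recovery_commands(snapshottable_dir, snapshot_name, relative_path):
    """Build (but do not run) commands to snapshot a directory and later
    restore a deleted sub-directory from that snapshot."""
    return [
        # An administrator first marks the directory as snapshottable.
        ["hdfs", "dfsadmin", "-allowSnapshot", snapshottable_dir],
        # Take a named snapshot before anything is deleted.
        ["hdfs", "dfs", "-createSnapshot", snapshottable_dir, snapshot_name],
        # After the deletion, copy the data back from the snapshot.
        ["hdfs", "dfs", "-cp",
         snapshot_path(snapshottable_dir, snapshot_name, relative_path),
         snapshottable_dir],
    ]


if __name__ == "__main__":
    for cmd in recovery_commands("/data", "snap1", "reports"):
        print(" ".join(cmd))
```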

Module 24A : Understand HDFS ACLs concept  (Length 21  Mins) Available + Steps in Doc + Hands On Lab

Module 24B : Assigning HDFS ACLs concept  (Length 5  Mins) Available + Steps in Doc + Hands On Lab

Module 24C : Directory level HDFS ACLs concept  (Length 8  Mins) Available + Steps in Doc + Hands On Lab
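
The ACL modules above can be pictured with a small helper that builds ACL entries and the matching `hdfs dfs -setfacl` command. The user `alice` and the path `/data/reports` are hypothetical, and the sketch assumes ACL support is enabled on the NameNode (`dfs.namenode.acls.enabled`). Nothing is executed; the functions only build strings and argument lists.

```python
import re


def acl_entry(kind, name, perms, default=False):
    """Format one ACL entry such as 'user:alice:r-x'.

    default=True yields a default ACL entry, which applies to directories
    and is inherited by newly created children."""
    if kind not in ("user", "group"):
        raise ValueError("kind must be 'user' or 'group'")
    if not re.fullmatch(r"[r-][w-][x-]", perms):
        raise ValueError("perms must look like 'rwx', 'r-x', 'r--', ...")
    entry = f"{kind}:{name}:{perms}"
    return f"default:{entry}" if default else entry


def setfacl_command(path, entry):
    """Build the command that adds or modifies one ACL entry on path."""
    return ["hdfs", "dfs", "-setfacl", "-m", entry, path]


def getfacl_command(path):
    """Build the command that lists the ACL entries of path."""
    return ["hdfs", "dfs", "-getfacl", path]


if __name__ == "__main__":
    entry = acl_entry("user", "alice", "r-x")
    print(" ".join(setfacl_command("/data/reports", entry)))
    print(" ".join(getfacl_command("/data/reports")))
```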

Module 25 : Basics of Cloudera Manager and its Confusing terminology  (Length 35  Mins)  Available 

Module 26 : NameNode Memory Consideration  (Length x  Mins)  InProgress + Steps in Doc + Hands On Lab

Module 27 : WebUI for HDFS  (Length x  Mins)  InProgress + Steps in Doc + Hands On Lab

Module 28 : Using the Hadoop FileShell  (Length x  Mins)  InProgress + Steps in Doc + Hands On Lab

Module 29 : The Role of Computational Framework  (Length x  Mins)  InProgress + Steps in Doc + Hands On Lab

Module 30 : Install YARN on 4 Node Cluster (Length 8  Mins) Available + Steps in Doc + Hands On Lab

Module 31 : Running YARN application  (Length x  Mins)  InProgress + Steps in Doc + Hands On Lab

Module 32 : Apache Spark : Introduction to Apache Spark  (Length  48 Mins) Available (100 Times Faster Data Processing)

1. Introduction to Apache Spark

2. Features of Apache Spark

3. Apache Spark Stack

4. Introduction to RDDs

5. RDD Transformations

6. What is Good and Bad in MapReduce

7. Why Use Apache Spark