
HADOOP

 

INTRODUCTION

  • Introduction to Hadoop

  • History of Hadoop

  • Building Blocks - Hadoop Eco-System

  • Who is Behind Hadoop

  • What Hadoop is good for and what it is not

 

 

HDFS

  • Configuring HDFS

  • Interacting With HDFS

  • HDFS Permissions and Security

  • Additional HDFS Tasks

  • HDFS Overview and Architecture

  • HDFS Installation

  • Hadoop File System Shell

  • File System Java API

 

 

MAP REDUCE

  • Map/Reduce Overview and Architecture

  • Installation

  • Developing Map/Reduce Jobs

  • Input and Output Formats

  • Job Configuration

  • Job Submission

  • Practising Map Reduce Programs (at least 10 Map Reduce Algorithms)

 

Getting Started With Eclipse IDE

  • Configuring Hadoop API on Eclipse IDE

  • Connecting Eclipse IDE to HDFS

 

Hadoop Streaming

 

Advanced MapReduce Features

  • Custom Data Types

  • Input Formats

  • Output Formats

  • Partitioning Data

  • Reporting Custom Metrics

  • Distributing Auxiliary Job Data

  • Distributing Debug Scripts

  • Using Yahoo Web Services

 

Pig

  • Pig Overview

  • Installation

  • Pig Latin

  • Pig with HDFS

 

Hive

  • Hive Overview

  • Installation

  • Hive QL

  • Analyzing Unstructured Data with Hive

  • Analyzing Semi-structured Data with Hive

 

HBase

  • HBase Overview and Architecture

  • HBase Installation

  • HBase Shell

  • CRUD operations

  • Scanning and Batching

  • Filters

  • HBase Key Design

 

ZooKeeper

  • ZooKeeper Overview

  • Installation

  • Server Maintenance

 

Sqoop

  • Sqoop Overview

  • Installation

  • Imports and Exports

 

CONFIGURATION

  • Basic Setup

  • Important Directories

  • Selecting Machines

  • Cluster Configurations

  • Small Clusters: 2-10 Nodes

  • Medium Clusters: 10-40 Nodes

  • Large Clusters: Multiple Racks

 

Integrations

Putting it all together

  • Distributed installations

  • Best Practices
