big data course - bigdata hub

Big Data: Introduction to Big Data and Hadoop

Upload: ahmed-shouman

Post on 13-Apr-2017

Agenda

Introduction to Big Data
Big Data different tools
Introduction to Hadoop
Hadoop HDFS
Hadoop MapReduce
Hadoop distribution

What is BIG DATA?

What if you could handle big data?

Structured, Unstructured?!

Big Data problem?

BIG DATA Architecture

When to say: I have Big Data?!

Internet in just 1 min !!!

Big Data Workflow

Big Data use case: Unravelling Epidemics with Big Data

Big Data use case: Smart Cities

Big Data different tools

Big Data Analysis Approaches

Batch processing is used for very large files or where a fast response time is not critical. The files to be transmitted are gathered over a period and then sent together as a batch.
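The gather-then-send idea can be sketched in a few lines of Python. This is only an illustration: the `BatchCollector` class, its `max_items` threshold and the `process_batch` callback are hypothetical names chosen for this sketch, not part of Hadoop or of the original slides.

```python
from typing import Callable, List


class BatchCollector:
    """Gathers items and hands them over together as one batch (hypothetical helper)."""

    def __init__(self, max_items: int, process_batch: Callable[[List[str]], None]) -> None:
        self.max_items = max_items          # assumed batch-size threshold
        self.process_batch = process_batch  # called once per batch
        self.pending: List[str] = []

    def add(self, item: str) -> None:
        # Items are only stored here; nothing is processed right away.
        self.pending.append(item)
        if len(self.pending) >= self.max_items:
            self.flush()

    def flush(self) -> None:
        # Send everything gathered so far as a single batch.
        if self.pending:
            self.process_batch(self.pending)
            self.pending = []


batches: List[List[str]] = []
collector = BatchCollector(max_items=3, process_batch=batches.append)
for name in ["a.log", "b.log", "c.log", "d.log"]:
    collector.add(name)
collector.flush()  # end of the collection period: send the remainder
print(batches)
```

No item is handled the moment it arrives, which is why this style suits jobs where response time is not critical.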

So, Big Data!

The basic idea behind the phrase Big Data is that everything we do is increasingly leaving a digital trace, which we can use and analyse.

Traditional vs. Big Data Analysis

Traditional Approach: analyze small subsets of data (the analyzed information is a subset of all available information).
Big Data Approach: analyze all data (all available information is analyzed).

Introducing Apache Hadoop

Hadoop is designed for batch and high throughput processing jobs.

Hadoop Distributed File System (HDFS)

HDFS Architecture Guide

Hadoop Map-Reduce

Map-Reduce Architecture Guide

Word count MapReduce example
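The slide content for this example did not survive in the transcript. As a rough stand-in, the word count pattern can be sketched in plain Python as three steps: map, shuffle and reduce. This is a single-process illustration under assumed function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`); it does not use the Hadoop API and is not necessarily the code shown on the original slide.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    # Map: emit a (word, 1) pair for every word in the input line.
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Shuffle: group all emitted values by key (the word).
    grouped: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    # Reduce: sum the counts collected for one word.
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


result = word_count(["big data big ideas", "Hadoop handles big data"])
print(result)
```

In a real Hadoop job the map and reduce steps would run on different machines and the framework would do the grouping; here everything runs in one process to keep the data flow visible.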

Hadoop Versions, What's different?!

Hadoop distribution

Who uses Hadoop?

Practical session: Setting up your environment!

Wrap up your Knowledge

1- Virtualization

What does your laptop look like?!

Prerequisite Software

Installing Java

sudo apt-get update
sudo apt-get install sun-java6-jdk
java -version

Install & Configure ssh

sudo apt-get install openssh-server
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

Hadoop Download & Install

In the Linux terminal, type:

wget http://supergsego.com/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz

and then hit ENTER.

Editing .bashrc file

#gedit ~/.bashrc

Add the following lines at the end of the file:

export HADOOP_PREFIX=/opt/hadoop/

export PATH=$PATH:$HADOOP_PREFIX/bin

Hadoop main Installation

#tar zxvf hadoop-1.2.1-bin.tar.gz

#mv /home/hadoop-1.2.1 /opt/hadoop

#gedit /opt/hadoop/conf/hadoop-env.sh

Editing conf/*-site.xml files

1- core-site.xml file:
#gedit /opt/hadoop/conf/core-site.xml

2- mapred-site.xml file:
#gedit /opt/hadoop/conf/mapred-site.xml

3- hdfs-site.xml file:
#gedit /opt/hadoop/conf/hdfs-site.xml

Formatting NameNode File System

#hadoop namenode -format

Starting the Hadoop Daemons

#start-all.sh

Testing Installation

Open your browser and type localhost:50070 in the address bar.
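If no browser is at hand, the same check can be approximated from a script. The Python sketch below only tests whether something accepts TCP connections on localhost:50070 (the NameNode web interface port in Hadoop 1.x); the function name `web_ui_reachable` and the timeout value are assumptions made for this illustration.

```python
import socket


def web_ui_reachable(host: str = "localhost", port: int = 50070, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if web_ui_reachable():
    print("Something is answering on localhost:50070")
else:
    print("Nothing is listening on localhost:50070 yet")
```

A successful connection only shows that the port is open; opening the page in a browser is still the better way to confirm that the daemons started correctly.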

Thanks

BigData Hub Team