big data course - bigdata hub
Big Data
Introduction to Big Data and Hadoop
Introduction to Big Data
Big Data different tools
Introduction to Hadoop
Hadoop HDFS
Hadoop MapReduce
Hadoop distribution
Agenda
What is BIG DATA?
What if you could handle big data?
Structured, Unstructured?!
Big Data problem?
BIG DATA Architecture
When to say: I have Big Data?!
Internet in just 1 min !!!
Big Data Workflow
Big Data use case: Unravelling Epidemics with Big Data
Big Data use case: Smart Cities
Big Data different tools
Big Data Analysis Approaches
Batch processing is used for very large files or where a fast response time is not critical. The files to be transmitted are gathered over a period and then sent together as a batch.
So, Big Data !!!
The basic idea behind the phrase Big Data is that everything we do increasingly leaves a digital trace, which we can use and analyse.
Traditional Vs. Big Data Analysis
Traditional Approach: analyze small subsets of data (only part of all available information is analyzed).
Big Data Approach: analyze all data (all available information is analyzed).
Introducing Apache Hadoop
Hadoop is designed for batch and high throughput processing jobs.
Hadoop Distributed File System HDFS
HDFS Architecture Guide
Hadoop Map-Reduce
Map-Reduce Architecture Guide
Word count MapReduce example
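The word count idea can be sketched locally with a plain shell pipeline whose three stages mirror map, shuffle, and reduce. This is only an illustration under stated assumptions: it runs on one machine without Hadoop, and the function names (`map_words`, `shuffle_pairs`, `reduce_counts`, `word_count`) and the sample sentence are hypothetical, not taken from the slides.

```shell
#!/bin/sh
# Word count as three stages that mirror map, shuffle, and reduce.
# Illustrative only: this runs as a local pipeline, not as a Hadoop job.

# map: split the input into words and emit one "word<TAB>1" pair per word
map_words() {
  tr -s '[:space:]' '\n' | awk 'NF { print $1 "\t" 1 }'
}

# shuffle: sort the pairs so that identical keys end up next to each other
shuffle_pairs() {
  LC_ALL=C sort
}

# reduce: sum the counts of each run of identical keys
reduce_counts() {
  awk -F '\t' '
    NR > 1 && $1 != prev { print prev "\t" sum; sum = 0 }
    { prev = $1; sum += $2 }
    END { if (NR > 0) print prev "\t" sum }
  '
}

# the full "job": map, then shuffle, then reduce
word_count() {
  map_words | shuffle_pairs | reduce_counts
}

# hypothetical sample input
printf 'deer bear river\ncar car river\ndeer car bear\n' | word_count
```

For the sample input this prints one line per distinct word with its total (bear 2, car 3, deer 2, river 2). In a real MapReduce job the map and reduce stages run in parallel on many machines and the framework performs the shuffle between them.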
Hadoop Versions, What's different?!
Hadoop distribution
Who uses Hadoop?
Practical session
Setting up your environment!
Wrap up your Knowledge
1- Virtualization
What does your laptop look like?!
Prerequisite Software
Installing JAVA
sudo apt-get update
sudo apt-get install sun-java6-jdk
java -version
Install & Configure ssh
sudo apt-get install openssh-server
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
Hadoop Download & Install
In the Linux terminal, write:
wget http://supergsego.com/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz
and then hit ENTER.
Editing .bashrc file
gedit ~/.bashrc
Add the following lines at the end of the file:
export HADOOP_PREFIX=/opt/hadoop/
export PATH=$PATH:$HADOOP_PREFIX/bin
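If you would rather script this step than edit the file by hand, the two lines can be appended in a way that is safe to repeat. The sketch below is an assumption, not part of the original session: `append_once` and the `RC_FILE` variable are hypothetical names, and the demo writes to a scratch file (created with `mktemp`, which is widely available but not strictly POSIX) instead of touching your real `~/.bashrc`.

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE only if that exact line is not already there
append_once() {
  file="$1"
  line="$2"
  touch "$file"
  # -x matches whole lines, -F treats the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; set RC_FILE=$HOME/.bashrc to apply it for real.
rc_file="${RC_FILE:-$(mktemp)}"
append_once "$rc_file" 'export HADOOP_PREFIX=/opt/hadoop/'
append_once "$rc_file" 'export PATH=$PATH:$HADOOP_PREFIX/bin'
append_once "$rc_file" 'export PATH=$PATH:$HADOOP_PREFIX/bin'   # repeated call adds nothing
cat "$rc_file"
```

After changing the real `~/.bashrc`, open a new terminal or run `. ~/.bashrc` so the variables take effect.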
Hadoop main Installation
tar zxvf hadoop-1.2.1-bin.tar.gz
mv /home/hadoop-1.2.1 /opt/hadoop
gedit /opt/hadoop/conf/hadoop-env.sh
Editing conf/*-site.xml files
1- core-site.xml file:
gedit /opt/hadoop/conf/core-site.xml
2- mapred-site.xml file:
gedit /opt/hadoop/conf/mapred-site.xml
3- hdfs-site.xml file:
gedit /opt/hadoop/conf/hdfs-site.xml
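The contents of the three files are not included in this transcript. As a hedged sketch, the script below generates minimal versions using values commonly chosen for a single-machine Hadoop 1.x setup: `fs.default.name` set to `hdfs://localhost:9000`, `mapred.job.tracker` set to `localhost:9001`, and `dfs.replication` set to `1`. These values, the helper names (`write_property_file`, `write_site_files`), and the `CONF_DIR` variable are assumptions rather than what the session necessarily used. The files are written to a scratch directory so nothing under /opt/hadoop is overwritten.

```shell
#!/bin/sh
# write_property_file FILE NAME VALUE: write a one-property Hadoop configuration file
write_property_file() {
  cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# write_site_files DIR: write the three *-site.xml files into DIR.
# The property values are assumed single-node defaults, not taken from the slides.
write_site_files() {
  conf_dir="$1"
  mkdir -p "$conf_dir"
  write_property_file "$conf_dir/core-site.xml"   fs.default.name    hdfs://localhost:9000
  write_property_file "$conf_dir/mapred-site.xml" mapred.job.tracker localhost:9001
  write_property_file "$conf_dir/hdfs-site.xml"   dfs.replication    1
}

# Written to a scratch directory; review the files before copying them to /opt/hadoop/conf.
scratch_dir="${CONF_DIR:-$(mktemp -d)}"
write_site_files "$scratch_dir"
cat "$scratch_dir/core-site.xml"
```

A replication factor of 1 only makes sense on a single machine, where there is no second DataNode to hold a copy of each block.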
Formatting Namenode File System
hadoop namenode -format
Firing Hadoop Daemons
start-all.sh
Testing Installation
Open your browser and type localhost:50070 in the URL bar.
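The same check can be done from the terminal. The sketch below assumes `curl` is installed; `check_namenode_ui` is a hypothetical helper name, and the default URL is simply the address given above.

```shell
#!/bin/sh
# check_namenode_ui [URL]: report whether the NameNode web page answers.
# Returns 0 if the page is reachable, 1 otherwise.
check_namenode_ui() {
  url="${1:-http://localhost:50070/}"
  # --fail turns HTTP error statuses into a non-zero exit code;
  # --max-time stops the check from hanging if nothing is listening
  if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
    echo "NameNode web UI is reachable at $url"
  else
    echo "NameNode web UI is NOT reachable at $url" >&2
    return 1
  fi
}
```

Running `check_namenode_ui` with no argument tests the default address. If it reports a failure, the daemons may still be starting up, or the NameNode may not have been formatted and started yet.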
Thanks
BigData Hub Team