Big Data & Hadoop
The future of the information economy
A Technology Blueprint
by Thanakrit
Big Data Storymap
Big Data Concept
Big Data Architecture
Big Data Ecosystem
Big Data Landscape
Big Data Life-cycle Management
Hadoop Concept
Hadoop Architecture
Name Node: Maintains mapping of file blocks to data node slaves
Job Tracker: Schedules jobs across task tracker slaves
Data Node: Stores and serves blocks of data
Task Tracker: Runs tasks (work units) within a job
Data Node and Task Tracker: Share a physical node
Hadoop Client: Contacts Name Node for data or Job Tracker to submit jobs
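As a rough illustration of the storage roles above, here is a toy in-memory model in Python. It is not Hadoop's API: every class, function, and parameter name (DataNode, NameNode, write_file, read_file, block_size) is hypothetical, and the sketch ignores everything else a real cluster does, including the Job Tracker and Task Tracker side. It only shows that the name node keeps the block-to-data-node mapping while the data nodes keep the bytes, and that a client asks the name node first.

```python
# Toy in-memory model of the storage roles listed above. This is NOT Hadoop's
# API: all class and function names are hypothetical, for illustration only.

class DataNode:
    """Stores and serves blocks of data."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def serve(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Maintains the mapping of file blocks to data nodes; holds no file data."""

    def __init__(self):
        self.block_map = {}  # filename -> list of (block_id, DataNode)

    def register(self, filename, block_id, node):
        self.block_map.setdefault(filename, []).append((block_id, node))

    def locate(self, filename):
        return list(self.block_map[filename])


def write_file(name_node, data_nodes, filename, data, block_size=4):
    """Client-side write: cut data into blocks and spread them round-robin."""
    for offset in range(0, len(data), block_size):
        index = offset // block_size
        block_id = f"{filename}#{index}"
        node = data_nodes[index % len(data_nodes)]
        node.store(block_id, data[offset:offset + block_size])
        name_node.register(filename, block_id, node)


def read_file(name_node, filename):
    """Client-side read: ask the name node for locations, then fetch blocks."""
    return b"".join(node.serve(block_id)
                    for block_id, node in name_node.locate(filename))


name_node = NameNode()
data_nodes = [DataNode("dn1"), DataNode("dn2")]
write_file(name_node, data_nodes, "a.txt", b"to be or not to be")
restored = read_file(name_node, "a.txt")
placement = [(block_id, node.name) for block_id, node in name_node.locate("a.txt")]
```

The tiny block_size is chosen only so that a short string spans several blocks; after the run, placement lists which hypothetical data node holds each block of a.txt.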
Hadoop Process: MapReduce Example for Word Count

Map(in_key, in_value) => list of (out_key, intermediate_value)
Reduce(out_key, list of intermediate_values) => out_value(s)

Input: Split 1 ... Split i ... Split N, read as (docid, text) records
Map: Map 1 ... Map i ... Map M take (docid, text) and emit (words, counts)
Shuffle: (words, counts) => (sorted words, counts)
Reduce: Reduce 1 ... Reduce i ... Reduce R take (sorted words, counts) and write Output File 1 ... Output File i ... Output File R, each holding (sorted words, sum of counts)

Example: “To Be Or Not To Be?”
Counts for the word Be arriving at a reducer: (Be, 5), (Be, 12), (Be, 7), (Be, 6)
Reducer output: (Be, 30)

The same flow as a Unix pipeline:
    cat *.txt | mapper.pl | sort | reducer.pl > out.txt
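The Map and Reduce signatures and the cat | mapper.pl | sort | reducer.pl pipeline can be sketched in a few lines of Python. The slides do not show the contents of mapper.pl or reducer.pl, so this is a stand-in written under assumptions, not the author's scripts: the function names (map_fn, shuffle, reduce_fn, word_count) and the tokenization rule (lowercase, letters only) are hypothetical.

```python
# Word-count sketch in the Map / Shuffle / Reduce shape described above.
# All names here are hypothetical; this is not the slides' mapper.pl/reducer.pl.
import re
from collections import defaultdict


def map_fn(docid, text):
    """Map(in_key, in_value) => list of (out_key, intermediate_value)."""
    # Assumed tokenization: lowercase the text and keep runs of letters.
    return [(word, 1) for word in re.findall(r"[a-z]+", text.lower())]


def shuffle(pairs):
    """Group intermediate values by key and sort the keys (the sort step)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_fn(word, counts):
    """Reduce(out_key, list of intermediate_values) => out_value."""
    return word, sum(counts)


def word_count(docs):
    """Run map over every (docid, text), shuffle, then reduce each key."""
    intermediate = []
    for docid, text in docs.items():
        intermediate.extend(map_fn(docid, text))
    return [reduce_fn(word, counts) for word, counts in shuffle(intermediate)]


result = word_count({"doc1": "To Be Or Not To Be?"})
print(result)  # [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```

Calling reduce_fn("be", [5, 12, 7, 6]) returns ("be", 30), which matches the arithmetic in the slide's example where several counts for Be are summed into one.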
Hadoop Ecosystem
Thank You