Big Data & Hadoop


Upload: thanakrit-lersmethasakul

Post on 06-May-2015


Page 1: Big Data & Hadoop

by Thanakrit Lersmethasakul

BIG DATA & HADOOP
The future of the information economy

Page 2: Big Data & Hadoop

A Technology Blueprint

Page 3: Big Data & Hadoop

Big Data Storymap

Page 4: Big Data & Hadoop

Big Data Concept

Page 5: Big Data & Hadoop

Big Data Concept

Page 6: Big Data & Hadoop

Big Data Concept

Page 7: Big Data & Hadoop

Big Data Architecture

Page 8: Big Data & Hadoop

Big Data Ecosystem

Page 9: Big Data & Hadoop

Big Data Landscape

Page 10: Big Data & Hadoop

Big Data Life-cycle Management

Page 11: Big Data & Hadoop

Hadoop Concept

Page 12: Big Data & Hadoop

Hadoop Concept

Page 13: Big Data & Hadoop

Hadoop Concept

Page 14: Big Data & Hadoop

Hadoop Architecture

Page 15: Big Data & Hadoop

Hadoop Architecture

Name Node: Maintains the mapping of file blocks to data node slaves.

Job Tracker: Schedules jobs across task tracker slaves.

Data Node: Stores and serves blocks of data.

Task Tracker: Runs tasks (work units) within a job.

Hadoop Client: Contacts the Name Node for data, or the Job Tracker to submit jobs.

A Data Node and a Task Tracker share a physical node.

Page 16: Big Data & Hadoop

Hadoop Process
MapReduce Example for Word Count

Map(in_key, in_value) => list of (out_key, intermediate_value)
Reduce(out_key, list of intermediate_values) => out_value(s)

Input: Split 1 ... Split i ... Split N

Map: Map 1 ... Map i ... Map M. Each map task reads (docid, text) and emits (words, counts).

Shuffle: (words, counts) => (sorted words, counts)

Reduce: Reduce 1 ... Reduce i ... Reduce R. Each reduce task reads (sorted words, counts) and writes Output File 1 ... Output File i ... Output File R, containing (sorted words, sum of counts).

Example: for the input “To Be Or Not To Be?”, the partial counts (Be, 5), (Be, 12), (Be, 7) and (Be, 6) are reduced to (Be, 30).

Equivalent Unix pipeline:

cat *.txt | mapper.pl | sort | reducer.pl > out.txt
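The word count flow on this slide can be sketched in a few lines of single-process Python. This is an illustration of the Map, Shuffle and Reduce signatures shown above, not Hadoop code: the function names map_words, shuffle, reduce_counts and word_count are hypothetical, and the tokenization rule (lowercase, letters and apostrophes only) is an assumption.

```python
import re
from collections import defaultdict


def map_words(docid, text):
    """Map(in_key, in_value) => list of (out_key, intermediate_value)."""
    # docid is unused here; word count only needs the text.
    return [(word, 1) for word in re.findall(r"[a-z']+", text.lower())]


def shuffle(pairs):
    """Group intermediate values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_counts(word, counts):
    """Reduce(out_key, list of intermediate_values) => out_value."""
    return (word, sum(counts))


def word_count(documents):
    """Run map, shuffle and reduce over a dict of {docid: text}."""
    intermediate = []
    for docid, text in documents.items():
        intermediate.extend(map_words(docid, text))
    return [reduce_counts(word, counts) for word, counts in shuffle(intermediate)]


if __name__ == "__main__":
    print(word_count({"doc1": "To Be Or Not To Be?"}))
    # prints [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```

The reduce step is the same summation as in the slide's example: reduce_counts("be", [5, 12, 7, 6]) returns ("be", 30). In a real cluster the map calls run on separate splits and the shuffle moves data between machines; here everything runs in one process to keep the data flow visible.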

Page 17: Big Data & Hadoop

Hadoop Ecosystem

Page 18: Big Data & Hadoop

Hadoop Ecosystem

Page 19: Big Data & Hadoop

Hadoop Ecosystem

Page 20: Big Data & Hadoop

Thank You