Download - MapReduce: Programming in the Very Large
![Page 1: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/1.jpg)
MapReduce: Programming in the Very Large
Ari Rabkin: [email protected]
for David Walker's FP class
December 2012
![Page 2: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/2.jpg)
"The datacenter is the computer"
2
![Page 3: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/3.jpg)
Google versus Hadoop vs MR
• Google published the MapReduce paper in 2004.
• Doug Cutting had been working on an Open Source MapReduce. Linked up with Yahoo! to scale it up.
• Has taken off and become very popular
• Other MapReduce implementations also exist
![Page 4: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/4.jpg)
Indexing
• 1: Some Words
• 2: Some other words
• 3: Other words
• Other: 2,3
• Some: 1, 2
• Words: 1,2,3
Note that Index is sorted by key. Helpful for quick lookup of approximate matches
![Page 5: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/5.jpg)
Generalizing
• Doc 1
• Doc 2
• Doc 3
• Term 1
• Term 2
• Term 3
Inputs Outputs
Map Shuffle/Sort Reduce
![Page 6: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/6.jpg)
Performance Numbers
• Biggest production Hadoop clusters are ~4000 nodes
• Facebook has 100 PB in Hadoop
• Best MapReduce-like system (TritonSort from UCSD) can sort 900 GB/minute on a 52-node, 800-disk cluster.
![Page 7: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/7.jpg)
Distributed Implementation In
pu
t D
ata
Ou
tpu
t D
ata
Map Shuffle/Sort Reduce
Local Storage
Local Storage
Local Storage
![Page 8: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/8.jpg)
A modern software stack
Cluster Node
Cluster Node
Cluster Node
Cluster Node
Distributed Filesystem
Distributed Execution Engine
Key-value store
High-level scripting language
Workload Manager
8
![Page 9: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/9.jpg)
The control plane
Input Data
Input Data
Input Data
![Page 10: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/10.jpg)
The flow of information
Heartbeats
Job config.
Tasks to start
OK
Completed
![Page 11: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/11.jpg)
Slots, Tasks, and Attempts
• A job is split into tasks. Each task includes many calls to map() or reduce()
• Workers are long-running processes that are assigned tasks
• Multiple workers can be assigned the same task; these are termed separate attempts.
Map Task 1 Map Task 2 Map Task 3
Job
Reduce Task 1 Reduce Task 2
![Page 12: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/12.jpg)
Size and Failures
• Suppose you have a cluster of a thousand servers. How long between failures?
• How long for one machine to fail?
– Intuition: machines fail once a year or two?
– Depending on model, perhaps 5% of high-end hard disks fail each year (Schroder, FAST ’07). A server might have ten hard disks.
• So for a thousand machines, we would expect failures more than once a day
![Page 13: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/13.jpg)
Handling Failures In
pu
t D
ata
Ou
tpu
t D
ata
Map Shuffle/Sort Reduce
X
![Page 14: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/14.jpg)
Failures aren’t absolute
• Some failures make nodes slow
• Reduces can’t start until ALL maps finish
Map Task 1
Map Task 2
Map Task 3
Bad!
Red
uces
Map Task 1
Map Task 2
Map Task 3
OK R
edu
ces
![Page 15: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/15.jpg)
Fix: speculation
• Multiple tries at same task; pick first to finish
• Subtlety in deciding which tasks to try to speculatively execute
Map Task 1 – a1
Map Task 2 – a1
Map Task 3 – a1
Map Task 2 – a2
Red
uces
![Page 16: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/16.jpg)
Types for Map + Reduce functions
• Map:
(‘K1 * ‘V1 (‘K2 * ‘V2) bag) -> (‘K1 * ‘V1) bag -> (‘K2 * ‘V2) bag
• Reduce:
(‘K2 * (‘V2 list) (‘K3 * ’V3) bag) -> ‘K2 * (‘V2 list) bag -> (‘K3 * ’V3) bag
Indexing
Map: (DocID * word bag) (word * DocID) bag
Reduce: (word * DocID list) (word * DocID) bags
![Page 17: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/17.jpg)
The Java versions
interface Mapper<K1,V1,K2,V2> {
public void map (K1 key,
V1 val,
OutputCollector<K2, V2> output);
...
}
![Page 18: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/18.jpg)
The Java versions
interface Reducer<K2,V2,K3,V3> {
public void reduce(K2 key,
Iterator<V2> values,
OutputCollector<K3, V3> output);
...
}
![Page 19: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/19.jpg)
Image to Text
• Can use MapReduce for simple parallelization.
• Imagine we have code to convert an image to text. How do we convert a million scanned images of book pages?
• Can just wrap the conversion routine in our Map() method; reduce is identity
• The embarrassingly parallel becomes trivial; real power of framework is in harder parallel problems.
![Page 20: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/20.jpg)
True story!
• New York Times has an archive going back to 1850.
• In 2007, they decided to put together PDFs of everything from 1850 to 1922.
• Total input size: 4 TB
• Took about a day for a 100-node Hadoop cluster, on hardware rented from Amazon for the day.
![Page 21: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/21.jpg)
Word count?
• Similar to indexing except we only want counts, not locations
• Map: (DocID, String list) -> ?
• Reduce: …. -> (String, int)
![Page 22: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/22.jpg)
Word count?
• Similar to indexing except we only want counts, not locations
• Map: (DocID, String list) -> (String, _ ) bag
• Reduce: (String, _ list) -> (String, int)
![Page 23: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/23.jpg)
Word count?
• Similar to indexing except we only want counts, not locations
• Map: (DocID, string list) -> (string, unit) bag emit (w, () ) for each word w in list
• Reduce: (string, unit list) -> (string, int) emit length of list
![Page 24: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/24.jpg)
Map in Java
class WordCountMap implements Map {
public void map (DocID key,
List<String> val,
OutputCollector<String, Integer> output) {
for (String s: val)
output.collect(s, 1)
}
}
![Page 25: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/25.jpg)
Reduce in Java
class WordCountReduce {
public void reduce(String key,
Iterator<Integer> vals,
OutputCollector<String, Integer> output) {
int count = 0;
for (int v: vals)
count += 1;
output.collect(key, count)
}
}
![Page 26: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/26.jpg)
Map + Reduce, and Combine functions
• Map:
(‘K1 * ’V1) (‘K2 * ’V2) bag
• Reduce:
(‘K2 * ‘V2 list) (‘K3 * ’V3) bag
• Combine (‘K2 * ’V2 list) (‘K2 * ’V2) bag
![Page 27: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/27.jpg)
Reduce / Combine in Java
class WordCountReduce {
public void reduce(String key,
Iterator<Integer> vals,
OutputCollector<String, Integer> output) {
int count = 0;
for (int v: vals)
count += v;
output.collect(key, count)
}
}
![Page 28: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/28.jpg)
Word Count with Combine
• Almost the same functional code, different configuration
conf.setOutputKeyClass(String.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(WordCountMap.class);
conf.setReducerClass(WordCountReduce.class);
conf.setCombinerClass(WordCountReduce.class);
![Page 29: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/29.jpg)
A hypothetical….
HashMap<String, Integer> counts =
new HashMap<String, Integer>();
public void map (DocID key,
List<String> val,
OutputCollector<String, Integer>
output) {
for (String s: val) {
count = 1;
if (counts.contains(s))
count += counts.get(s);
counts.put(s, count);
}
}
A: Correct program B: Compiler Error C: Program produces wrong answer
![Page 30: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/30.jpg)
PageRank: measuring how much a webpage matters
• Model: user is clicking around randomly.
• With probability k, will start over at random; else follows a [random] link off current page.
• Matrix M encodes probabilities of transition from page p to page q
• Pr[on page] = M • ep
![Page 31: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/31.jpg)
PageRank: The link matrix
A B C
A 0 1 1
B 0 0 1
C 0 0 0
![Page 32: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/32.jpg)
The Stable State
• Distribution has a stationary point where v = M • v (v is an eigenvector)
• Can solve by iteration: vk+1 = M • vk
• We can compute this as a MapReduce job
![Page 33: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/33.jpg)
Defining the types
• Class PageInfo;
• Class LinksInfo extends PageInfo {
List<DocID> links;
}
• Class Increment extends PageInfo {
double inWeight;
}
![Page 34: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/34.jpg)
The logic Reduce(DocID key, Iterator<PageInfo> vals,
OutputCollector<DocID, PageInfo> output {
double total_score = 0;
LinksInfo info;
for ( PageInfo i: vals) {
if (i instanceof LinksInfo) {
info = (LinksInfo) vals.next();
output.collect(key, info)
} else
total_score += ((Increment) i).inWeight;
}
double s = total_score / info.links.size()
for (DocID out: links.links)
output.collect(key, Increment(s))
}
![Page 35: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/35.jpg)
Iterative Jobs are common… In
pu
t D
ata
Ou
tpu
t D
ata
Inp
ut
Dat
a
Ou
tpu
t D
ata
Wo
rkin
g Se
t
![Page 36: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/36.jpg)
Joins
Name ZIP Code
John Doe 08540
Richard Roe 20037
ZIP code State
08540 NJ
14850 NY
20037 DC
Name ZIP Code
State
John Doe
08540 NJ
Richard Roe
20037 DC
![Page 37: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/37.jpg)
Joins with MapReduce
Name ZIP Code
John Doe 08540
Richard Roe
20037
ZIP code State
08540 NJ
14850 NY
20037 DC
• If one table is small, just keep it in memory at every location and join in the Map method
• Can also join on Reduce side – Can emit whole contents of both tables in Map. – Use “join column” as sort key, then join in reduce().
• Higher-level languages help. (Pig, Hive, etc)
![Page 38: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/38.jpg)
Joins with MR, continued
• Class TableCell [could be int, string, etc]
• Class RowWithSource
map(NullWritable inKey, TableRow val …){
int fileId = getInputFileNumber();
int joinCol = config.get(“join_column_” +
fileId);
TableCell c = val.get(joinCol);
RowWithSource v2 =new RowWithSource(val, fileId);
output.collect(c, v2);
}
![Page 39: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/39.jpg)
Joins with MR, continued
Initialize joinCol1 and joinCol2 [class members] somewhere
reduce(TableCell key, Iterator <RowWithSource> values …){ List<RowWithSource> src1 = new List<RowWithSource> (); List<RowWithSource> src2 = new List<RowWithSource> ();
for (RowWithSource r: values) if (r.src == “1”) src1.append(r); else src2.append(r);
for (RowWithSource r1: src1) for (RowWithSource r2: src2) { TableRow res = join(r1, r2, joinCol1, joinCol2); output.collect(null, res); } }
![Page 40: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/40.jpg)
Observations
• Code is basically doing nested-loops over all pairs of rows which match on the join key.
• This doesn’t require materializing the whole set of results, but does materialize the sets of inputs on each side.
• This code would be a lot easier with product types (e.g. Pair<A,B>)
![Page 41: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/41.jpg)
What I work on
MapReduce is the Wrong Thing if the data is spread out: need more optimization to reduce wide-area transfer costs
![Page 42: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/42.jpg)
Deeper pipes for more locality
Consolidated results
W A
N
Partial results
Partial results
Extract Data
Extract Data L
A N
Partial results
Extract Data
L A
N
Extract Data
Extract Data
L A
N
Extract Data
![Page 43: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/43.jpg)
Take-aways
• Big data needs specialized tools to process.
• Higher-order functions help manage complexity.
• Determinism and the absence of side-effects make parallelism and failure recovery simpler.
• If you have complicated functionality, consider building a language
![Page 44: MapReduce: Programming in the Very Large](https://reader033.vdocument.in/reader033/viewer/2022052419/58a2bd4c1a28abbe5a8bab8e/html5/thumbnails/44.jpg)
For more information
• Hadoop is public and open source.
• See http://hadoop.apache.org for information.
• Amazon’s EC2 will let you run stuff at large scale for low (and incremental) costs.