04/27/2011 DHT 1
ecs251 Spring 2011: Operating System
#6: Map and Reduce
Dr. S. Felix Wu
Computer Science Department
University of California, Davis
http://www.facebook.com/group.php?gid=29670204725
http://cyrus.cs.ucdavis.edu/~wu/ecs251
Programming Model

The input is a set of key/value pairs and the output is a set of key/value pairs. The MapReduce library contains two functions:
• Map: takes an input key/value pair and produces intermediate key/value pairs.
• Reduce: the MapReduce library groups all intermediate values with the same intermediate key I; Reduce takes an intermediate key I and its values and merges them into a smaller set of values.
MapReduce: Example

Counting the number of occurrences of each word in a large collection of documents.
• Map: takes (doc name, doc contents) and emits each word with its occurrences.
• Reduce: takes a word and a list of counts and emits the sum of all counts for that word.
Input and output types:
  map(k1, v1) -> list(k2, v2)
  reduce(k2, list(v2)) -> list(v2)
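The types above can be traced with a plain-Java, in-memory sketch of word count. This is illustrative only, not the Hadoop API; the grouping loop in run() plays the role of the MapReduce library.

```java
import java.util.*;

public class WordCountSketch {
    // map(k1,v1) -> list(k2,v2): emit (word, 1) for every word in the document
    static List<Map.Entry<String, Integer>> map(String docName, String contents) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : contents.split("\\s+"))
            if (!w.isEmpty()) out.add(new AbstractMap.SimpleEntry<>(w, 1));
        return out;
    }

    // reduce(k2, list(v2)) -> list(v2): sum all counts for one word
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) sum += c;
        return sum;
    }

    // the "library": group intermediate pairs by key, then reduce each group
    static Map<String, Integer> run(Map<String, String> docs) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, String> d : docs.entrySet())
            for (Map.Entry<String, Integer> kv : map(d.getKey(), d.getValue()))
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet())
            result.put(g.getKey(), reduce(g.getKey(), g.getValue()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Map.of("d1", "a b a")));
    }
}
```

In the real system the grouping step is done by the framework across many machines; here it is a single TreeMap.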
MapReduce: Execution
HDFS Architecture

Client read path:
  1. Client sends a filename to the NameNode.
  2. NameNode replies with the BlockIds and the DataNodes that hold them.
  3. Client reads the data directly from the DataNodes.
DataNodes report cluster membership to the NameNode.

• NameNode: maps a file to a file-id and a list of DataNodes
• DataNode: maps a block-id to a physical location on disk
• SecondaryNameNode: periodic merge of the transaction log
Map and Reduce

The idea of Map and Reduce is 40+ years old:
– Present in all functional programming languages
– See, e.g., APL, Lisp, and ML
An alternate name for Map is Apply-All.
Higher-order functions:
– take function definitions as arguments, or
– return a function as output
Map and Reduce are higher-order functions.
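A higher-order map can be written directly in Java with java.util.function; the class and method names here are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class HigherOrderMap {
    // map takes the function F itself as an argument: W[i] = F(V[i])
    static <A, B> List<B> map(Function<A, B> f, List<A> v) {
        List<B> w = new ArrayList<>();
        for (A x : v) w.add(f.apply(x));
        return w;
    }

    public static void main(String[] args) {
        // same as the Haskell "map (+1) [1,2,3,4,5]"
        System.out.println(map(x -> x + 1, List.of(1, 2, 3, 4, 5)));
    }
}
```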
GFS: Google File System

• "Failures" are the norm
• Multi-GB files are common
• Append rather than overwrite; random writes are rare
• Can we relax the consistency?
# an input reader
# a Map function
# a partition function
# a compare function
# a Reduce function
# an output writer
Map: A Higher-Order Function

F(x: int) returns r: int. Let V be an array of integers. W = map(F, V)
– W[i] = F(V[i]) for all i
– i.e., apply F to every element of V
Map Examples in Haskell

map (+1) [1,2,3,4,5]         == [2,3,4,5,6]
map (toLower) "abcDEFG12!@#" == "abcdefg12!@#"
map (`mod` 3) [1..10]        == [1,2,0,1,2,0,1,2,0,1]
Word Count Example

Read text files and count how often words occur.
– The input is text files
– The output is a text file; each line: word, tab, count
Map: produce pairs of (word, count).
Reduce: for each word, sum up the counts.
Input: "I am a tiger, you are also a tiger"

map:     I,1  am,1  a,1
map:     tiger,1  you,1  are,1
map:     also,1  a,1  tiger,1

sorted:  a,1  a,1  also,1  am,1  are,1  I,1  tiger,1  tiger,1  you,1

reduce:  a,2  also,1  am,1  are,1
reduce:  I,1  tiger,2  you,1

Output:  a,2  also,1  am,1  are,1  I,1  tiger,2  you,1
Grep Example

Search input files for a given pattern.
Map: emits a line if the pattern is matched.
Reduce: copies results to output.
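In-memory, the two phases look like this (a sketch in plain Java, not Hadoop; the pattern match here is a simple substring test):

```java
import java.util.ArrayList;
import java.util.List;

public class GrepSketch {
    // Map: emit a line only if it matches the pattern
    static List<String> map(List<String> lines, String pattern) {
        List<String> out = new ArrayList<>();
        for (String line : lines)
            if (line.contains(pattern)) out.add(line);
        return out;
    }

    // Reduce: identity -- just copy the matched lines to the output
    static List<String> reduce(List<String> matched) {
        return matched;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("map and reduce", "hdfs", "reduce tasks");
        System.out.println(reduce(map(lines, "reduce")));
    }
}
```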
Inverted Index Example
Generate an inverted index of words from a given set of files
Map: parses a document and emits <word, docId> pairs
Reduce: takes all pairs for a given word, sorts the docId values, and emits a <word, list(docId)> pair
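The same pattern in miniature (plain Java, not the Hadoop API; the sort in reduce uses a TreeSet, which also removes duplicate docIds):

```java
import java.util.*;

public class InvertedIndexSketch {
    // Map: parse a document and emit <word, docId> pairs
    static List<Map.Entry<String, String>> map(String docId, String text) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (String w : text.toLowerCase().split("\\W+"))
            if (!w.isEmpty()) out.add(new AbstractMap.SimpleEntry<>(w, docId));
        return out;
    }

    // Reduce: for one word, sort (and dedup) its docIds -> <word, list(docId)>
    static List<String> reduce(String word, List<String> docIds) {
        return new ArrayList<>(new TreeSet<>(docIds));
    }

    public static void main(String[] args) {
        // group the intermediate pairs by word, then reduce each group
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> kv : map("doc2", "Football players"))
            groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        for (Map.Entry<String, String> kv : map("doc1", "Penn State Football"))
            groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        Map<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> g : groups.entrySet())
            index.put(g.getKey(), reduce(g.getKey(), g.getValue()));
        System.out.println(index);
    }
}
```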
Execution on Clusters
1. Input files split (M splits)
2. Assign Master & Workers
3. Map tasks
4. Writing intermediate data to disk (R regions)
5. Intermediate data read & sort
6. Reduce tasks
7. Return
<Key, Value> Pair

Map:    Input: raw data  ->  Output: key/value pairs (key1 val, key2 val, key1 val, …)
Reduce: Input: a key and its values  ->  select key  ->  Output: key1 val val … val
Data flow: the input in HDFS is divided into splits (split 0 … split 4); each split goes to a map task; map output is sort/copied and merged; each reduce task writes one partition (part0, part1) of the output back to HDFS.
class MR {
  class Mapper … { }          // Map function
  class Reducer … { }         // Reduce function
  main() {                    // Config and other parts of the program
    JobConf conf = new JobConf(MR.class);
    conf.setMapperClass(Mapper.class);
    conf.setReducerClass(Reducer.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}
class MyMap extends MapReduceBase
    implements Mapper<INPUT_KEY, INPUT_VALUE, OUTPUT_KEY, OUTPUT_VALUE> {
  // global variables
  public void map(INPUT_KEY key, INPUT_VALUE value,
                  OutputCollector<OUTPUT_KEY, OUTPUT_VALUE> output,
                  Reporter reporter) throws IOException {
    // local variables and program
    output.collect(NewKey, NewValue);
  }
}

(The slide's blanks are the four type parameters: input key, input value, output key, output value.)
class MyRed extends MapReduceBase
    implements Reducer<INPUT_KEY, INPUT_VALUE, OUTPUT_KEY, OUTPUT_VALUE> {
  // global variables
  public void reduce(INPUT_KEY key, Iterator<INPUT_VALUE> values,
                     OutputCollector<OUTPUT_KEY, OUTPUT_VALUE> output,
                     Reporter reporter) throws IOException {
    // local variables and program
    output.collect(NewKey, NewValue);
  }
}

(As with the Mapper, the blanks are the four type parameters: input key, input value, output key, output value.)
• Complete web search engine
 – Nutch = Crawler + Indexer/Searcher (Lucene) + GUI
   » + Plugins
   » + MapReduce & Distributed FS (Hadoop)
• Java based, open source, many customizable scripts available at http://lucene.apache.org/nutch/
• Features:
 – Customizable
 – Extensible (e.g. extend to Solr for enhanced portability)
Data Structures used by Nutch

• Web Database or WebDB
 – Mirrors the properties/structure of the web graph being crawled
• Segment
 – Intermediate index
 – Contains pages fetched in a single run
• Index
 – Final inverted index obtained by "merging" segments (Lucene)
WebDB

• Customized graph database
• Used by Crawler only
• Persistent storage for "pages" & "links"
 – Page DB: indexed by URL and hash; contains content, outlinks, fetch information & score
 – Link DB: contains "source to target" links, anchor text
Crawling

• Cyclic process:
 – the crawler generates a set of fetchlists from the WebDB
 – fetchers download the content from the Web
 – the crawler updates the WebDB with the new links that were found
 – then the crawler generates a new set of fetchlists
 – repeat until you reach the "depth"
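The cycle can be sketched with an in-memory link graph standing in for both the Web and the WebDB. Everything here, fetchlist generation included, is a drastic simplification of what Nutch actually does:

```java
import java.util.*;

public class CrawlSketch {
    // web: url -> outlinks; webDB: URLs we know about; fetched: already downloaded
    static Set<String> crawl(Map<String, List<String>> web, String seed, int depth) {
        Set<String> webDB = new TreeSet<>(List.of(seed));
        Set<String> fetched = new TreeSet<>();
        for (int d = 0; d < depth; d++) {
            // generate a fetchlist: known URLs not yet fetched
            List<String> fetchlist = new ArrayList<>(webDB);
            fetchlist.removeAll(fetched);
            if (fetchlist.isEmpty()) break;
            // "fetch" each URL and update the WebDB with the new links found
            for (String url : fetchlist) {
                fetched.add(url);
                webDB.addAll(web.getOrDefault(url, List.of()));
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        Map<String, List<String>> web = Map.of(
            "a", List.of("b", "c"),
            "b", List.of("d"));
        // depth 2: fetch "a" first, then the links "b" and "c" found in it
        System.out.println(crawl(web, "a", 2));
    }
}
```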
Indexing

• Iterate through all k page sets in parallel, constructing the inverted index
• Creates a "searchable document" from:
 – URL text
 – Content text
 – Incoming anchor text
• Other content types might have different document fields
 – E.g., email has sender/receiver
 – Any searchable field the end-user will want
• Uses the Lucene text indexer
Lucene

• Open source search project
 – http://lucene.apache.org
• Index & search local files
 – Download lucene-2.2.0.tar.gz from http://www.apache.org/dyn/closer.cgi/lucene/java/
 – Extract files
 – Build an index for a directory:
   java org.apache.lucene.demo.IndexFiles dir_path
 – Try search at the command line:
   java org.apache.lucene.demo.SearchFiles
Lucene's Open Architecture
[Architecture figure: Crawling -> Parsing -> Indexing -> Searching. Sources (File System, WWW, IMAP Server) are fetched by crawlers (FS Crawler, Larm); documents (PDF, HTML, DOC, TXT, …) go through parsers (TXT parser, PDF parser, HTML parser) to become Lucene Documents; analyzers (StopAnalyzer, CN/DE Analyzer, StandardAnalyzer) feed the indexers, which build the Index queried by the searchers.]
[Figure: an Index contains Documents; each Document contains Fields; a Field is a Name/Value pair.]
• Create an Analyzer
 – WhitespaceAnalyzer: divides text at whitespace
 – SimpleAnalyzer: divides text at non-letters, converts to lower case
 – StopAnalyzer: SimpleAnalyzer plus removal of stop words
 – StandardAnalyzer: good for most European languages; removes stop words, converts to lower case
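What SimpleAnalyzer does to a string can be imitated without Lucene. This is only an illustration of the tokenization rule (divide at non-letters, lower-case), not Lucene's API:

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleAnalyzerSketch {
    // divide text at non-letters and convert each token to lower case
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^a-zA-Z]+"))
            if (!t.isEmpty()) tokens.add(t.toLowerCase());
        return tokens;
    }

    public static void main(String[] args) {
        // digits and punctuation act as separators and are dropped
        System.out.println(tokenize("Penn State Football, 2011!"));
    }
}
```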
Inverted Index (Inverted File)

Doc 1: Penn State Football … football
Doc 2: Football players … State

Posting Table:
  id | word     | doc   | offset
   1 | football | Doc 1 | 3
     |          | Doc 1 | 67
     |          | Doc 2 | 1
   2 | penn     | Doc 1 | 1
   3 | players  | Doc 2 | 2
   4 | state    | Doc 1 | 2
     |          | Doc 2 | 13
Query

[Figure: query lookup path, each step constant time. The Query consults the Term Info Index (in memory) and the Field Info (in memory); the Term Info Index points into the Term Dictionary (random file access), which in turn references the Frequency File (random file access) and the Position File (random file access).]
Map/Reduce Cluster Implementation

Input files (split 0 … split 4) -> M map tasks -> intermediate files -> R reduce tasks -> output files (Output 0, Output 1)

• Several map or reduce tasks can run on a single computer
• Each intermediate file is divided into R partitions, by the partitioning function
• Each reduce task corresponds to one partition
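The partitioning function is typically a hash of the key modulo R, so that every occurrence of a key lands in the same partition and hence at the same reduce task. This mirrors Hadoop's default HashPartitioner but is sketched here without the Hadoop API:

```java
public class PartitionSketch {
    // assign an intermediate key to one of R partitions / reduce tasks
    static int partition(String key, int numReduceTasks) {
        // mask off the sign bit so the result is non-negative
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int r = 2;
        // the same key always maps to the same partition
        System.out.println(partition("tiger", r) == partition("tiger", r));
    }
}
```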
Execution
Hadoop Usage at Facebook

• Data warehouse running Hive
• 600 machines, 4800 cores, 2.4 PB disk
• 3200 jobs per day
• 50+ engineers have used Hadoop
Facebook Data Pipeline

[Figure: Web Servers -> Scribe Servers -> Network Storage -> Hadoop Cluster -> Oracle RAC / MySQL; Analysts issue Hive Queries and read Summaries.]
Facebook Job Types

• Production jobs: load data, compute statistics, detect spam, etc.
• Long experiments: machine learning, etc.
• Small ad-hoc queries: Hive jobs, sampling

GOAL: Provide fast response times for small jobs and guaranteed service levels for production jobs
Cloud Computing Scheduling

• FIFO, Fair-Sharing
• Job scheduling with "constraints"
 – Dependency
 – Priority-oriented
 – Soft deadline
Hive

• Developed at Facebook
• Used for the majority of Facebook jobs
• "Relational database" built on Hadoop
 – Maintains a list of table schemas
 – SQL-like query language (HQL)
 – Can call Hadoop Streaming scripts from HQL
 – Supports table partitioning, clustering, complex data types, some optimizations
Creating a Hive Table

CREATE TABLE page_views(
    viewTime INT, userid BIGINT,
    page_url STRING, referrer_url STRING,
    ip STRING COMMENT 'User IP address')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
STORED AS SEQUENCEFILE;

• Partitioning breaks the table into separate files for each (dt, country) pair
  Ex: /hive/page_view/dt=2008-06-08,country=US
      /hive/page_view/dt=2008-06-08,country=CA
Simple Query

• Find all page views coming from xyz.com during March 2008:

SELECT page_views.*
FROM page_views
WHERE page_views.date >= '2008-03-01'
  AND page_views.date <= '2008-03-31'
  AND page_views.referrer_url like '%xyz.com';

• Hive only reads the 2008-03-*,* partitions instead of scanning the entire table
Aggregation and Joins

• Count users who visited each page, by gender:

SELECT pv.page_url, u.gender, COUNT(DISTINCT u.id)
FROM page_views pv JOIN user u ON (pv.userid = u.id)
WHERE pv.date = '2008-03-03'
GROUP BY pv.page_url, u.gender;

• Sample output: