Data Parallel and Graph Parallel Systems for Large-scale Data Processing

Presenter: Kun Li


Page 1

Data Parallel and Graph Parallel Systems for Large-scale Data Processing

Presenter: Kun Li

Page 2

Threads, Locks, and Messages

• ML experts repeatedly solve the same parallel design challenges:
– Implement and debug a complex parallel system
– Tune for a specific parallel platform
– Two months later the conference paper contains: "We implemented ______ in parallel."
• The resulting code:
– is difficult to maintain
– is difficult to extend
– couples the learning model to the parallel implementation

Page 3

... a better answer:

Map-Reduce / Hadoop

Build learning algorithms on top of high-level parallel abstractions.

Page 4

Motivation

• Large-Scale Data Processing
– Want to use 1000s of CPUs, but don't want the hassle of managing things
• MapReduce provides:
– Automatic parallelization & distribution
– Fault tolerance
– I/O scheduling
– Monitoring & status updates

Page 5

Map/Reduce

• map(key, val) is run on each item in the input set
– emits new-key / new-val pairs
• reduce(key, vals) is run for each unique key emitted by map()
– emits the final output

Page 6

Count words in docs

map(key=url, val=contents):
For each word w in contents, emit (w, "1")

reduce(key=word, values=uniq_counts):
Sum all "1"s in values list
Emit result "(word, sum)"

Input:
see bob throw
see spot run

Map output:
see 1, bob 1, run 1, see 1, spot 1, throw 1

Reduce output:
bob 1, run 1, see 2, spot 1, throw 1
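To make the dataflow concrete, here is a minimal single-process Python sketch of the word-count job above. It imitates the map / shuffle / reduce contract; run_mapreduce is a toy driver invented for illustration, not the Hadoop API.

from collections import defaultdict

def map_fn(url, contents):
    # map(key=url, val=contents): emit (word, 1) for each word
    for word in contents.split():
        yield (word, 1)

def reduce_fn(word, counts):
    # reduce(key=word, values=counts): emit (word, sum)
    yield (word, sum(counts))

def run_mapreduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)              # shuffle: group map output by key
    for key, val in records:
        for k, v in map_fn(key, val):
            groups[k].append(v)
    return [out for k in sorted(groups)     # reduce: one call per unique key
                for out in reduce_fn(k, groups[k])]

docs = [("doc1", "see bob throw"), ("doc2", "see spot run")]
print(run_mapreduce(docs, map_fn, reduce_fn))
# [('bob', 1), ('run', 1), ('see', 2), ('spot', 1), ('throw', 1)]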

Page 7

Grep

– Input consists of (url+offset, single line)
– map(key=url+offset, val=line):
• If contents matches regexp, emit (line, "1")
– reduce(key=line, values=uniq_counts):
• Don't do anything; just emit the line
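The same shape works for grep; a minimal sketch, assuming the pattern "spot" stands in for the user's regexp. Map filters, and reduce is the identity.

import re

PATTERN = re.compile(r"spot")               # stand-in for the user's regexp

def map_fn(url_offset, line):
    # Emit the matching line itself as the key, with a dummy count.
    if PATTERN.search(line):
        yield (line, "1")

def reduce_fn(line, counts):
    # Don't do anything; just emit the line.
    yield line

records = [("doc1:0", "see bob throw"), ("doc1:14", "see spot run")]
matched = {}
for key, line in records:
    for k, v in map_fn(key, line):
        matched.setdefault(k, []).append(v)
for line, counts in matched.items():
    print(next(reduce_fn(line, counts)))    # -> see spot run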

Page 8

Reverse Web-Link Graph

• Map
– For each URL linking to a target, output <target, source> pairs
• Reduce
– Concatenate the list of all source URLs
– Output: <target, list(source)> pairs
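A minimal sketch of this job; extract_links is a hypothetical helper invented for the example, not part of any MapReduce API.

import re
from collections import defaultdict

def extract_links(html):
    # Toy link extractor, good enough for the sketch.
    return re.findall(r'href="([^"]+)"', html)

def map_fn(source, html):
    for target in extract_links(html):
        yield (target, source)              # <target, source> pairs

def reduce_fn(target, sources):
    yield (target, list(sources))           # <target, list(source)> pairs

pages = [("a.com", '<a href="c.com">x</a>'),
         ("b.com", '<a href="c.com">y</a>')]
inverted = defaultdict(list)                # shuffle: group sources by target
for url, html in pages:
    for target, source in map_fn(url, html):
        inverted[target].append(source)
print([next(reduce_fn(t, s)) for t, s in inverted.items()])
# [('c.com', ['a.com', 'b.com'])]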

Page 9

Job Processing

[Figure: a JobTracker coordinating TaskTracker 0 through TaskTracker 5 for a "grep" job]

1. Client submits the "grep" job, indicating code and input files.
2. JobTracker breaks the input file into k chunks (in this case 6) and assigns work to TaskTrackers.
3. After map(), TaskTrackers exchange map output to build the reduce() keyspace.
4. JobTracker breaks the reduce() keyspace into m chunks (in this case 6) and assigns work.
5. reduce() output may go to NDFS.

Page 10

Execution

Page 11

Parallel Execution

Pages 12-17: (figure-only slides; no transcribed text)

Page 18

Refinement: Locality Optimization

• Master scheduling policy:
– Asks GFS for the locations of replicas of input file blocks
– Map tasks are scheduled so that a GFS replica of their input block is on the same machine or the same rack
• Effect:
– Thousands of machines read input at local-disk speed
– Without this, rack switches limit the read rate
• Combiner:
– Useful for saving network bandwidth (see the sketch below)
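A sketch of what the combiner buys, using the word-count job: pre-aggregating on the map node before the shuffle means the network carries one (word, partial_sum) pair instead of one pair per occurrence. The combine function is illustrative, not Hadoop's Combiner interface.

from collections import Counter

def map_fn(url, contents):
    for word in contents.split():
        yield (word, 1)

def combine(map_outputs):
    # Local pre-aggregation on the map node, before the shuffle.
    partial = Counter()
    for word, count in map_outputs:
        partial[word] += count
    return list(partial.items())

raw = list(map_fn("doc1", "see bob see see"))
print(len(raw), "pairs shuffled without a combiner:", raw)
print(len(combine(raw)), "pairs shuffled with a combiner:", combine(raw))
# 4 pairs without a combiner, 2 pairs with one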

Page 19

Map-Reduce for Data-Parallel ML

• Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics

Graph-Parallel: Belief Propagation, Label Propagation, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso

Is there more to Machine Learning?

Page 20

Properties of Graph-Parallel Algorithms

• Dependency Graph
• Factored Computation: what I like depends on what my friends like
• Iterative Computation

Page 21

Map-Reduce for Data-Parallel ML

• Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics

Graph-Parallel (Map Reduce?): Belief Propagation, Label Propagation, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso

Page 22

Why not use Map-Reduce for Graph-Parallel Algorithms?

Page 23

Data Dependencies

• Map-Reduce does not efficiently express dependent data:
– The user must code substantial data transformations
– Costly data replication

[Figure: independent data rows, the case Map-Reduce handles well]

Page 24

Iterative Algorithms

• Map-Reduce does not efficiently express iterative algorithms:

[Figure: data partitions processed by CPU 1-3 across iterations, with a barrier after each iteration; a "Slow Processor" callout shows one straggler holding up every barrier]

Page 25

MapAbuse: Iterative MapReduce

• Only a subset of data needs computation:

[Figure: the same iteration/barrier diagram; every partition is reprocessed each iteration even though only some of the data changes]

Page 26

MapAbuse: Iterative MapReduce

• System is not optimized for iteration:

[Figure: the same iteration diagram with a startup penalty and a disk penalty added at every iteration]

Page 27

Map-Reduce for Data-Parallel ML

• Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics

Graph-Parallel (Map Reduce? GraphLab): Belief Propagation, SVM, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso

Page 28

The GraphLab Framework

• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model

Page 29

Data Graph

A graph with arbitrary data (C++ objects) associated with each vertex and edge.

Graph: social network
Vertex data: user profile text, current interest estimates
Edge data: similarity weights

Page 30

Implementing the Data Graph: Multicore Setting

• In memory, relatively straightforward:
– vertex_data(vid) → data
– edge_data(vid, vid) → data
– neighbors(vid) → vid_list
• Challenge:
– Fast lookup, low overhead
• Solution (sketched below):
– Dense data structures
– Fixed Vdata & Edata types
– Immutable graph structure
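A minimal Python sketch of such a data graph, assuming the vertex_data / edge_data / neighbors API named above; the DataGraph class and its fields are invented for illustration.

class DataGraph:
    # Dense arrays plus an immutable adjacency structure, per the
    # "Solution" bullets above.
    def __init__(self, num_vertices, edges, vdata, edata):
        self._vdata = list(vdata)               # fixed Vdata type per vertex
        self._edata = dict(zip(edges, edata))   # fixed Edata type per edge
        adj = [[] for _ in range(num_vertices)]
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        self._adj = [tuple(n) for n in adj]     # frozen after construction

    def vertex_data(self, vid):
        return self._vdata[vid]

    def edge_data(self, u, v):
        return self._edata[(u, v)] if (u, v) in self._edata else self._edata[(v, u)]

    def neighbors(self, vid):
        return self._adj[vid]

# The social-network example from the previous slide: interest estimates
# on vertices, similarity weights on edges.
g = DataGraph(3, edges=[(0, 1), (1, 2)],
              vdata=[{"interests": 0.9}, {"interests": 0.1}, {"interests": 0.5}],
              edata=[0.8, 0.3])
print(g.neighbors(1), g.edge_data(2, 1))        # (0, 2) 0.3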

Page 31

The GraphLab Framework

• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model

Page 32

Update Functions

An update function is a user-defined program which, when applied to a vertex, transforms the data in the scope of that vertex.

label_prop(i, scope) {
  // Get neighborhood data
  (Likes[i], Wij, Likes[j]) ← scope;

  // Update the vertex data

  // Reschedule neighbors if needed
  if Likes[i] changes then reschedule_neighbors_of(i);
}
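A runnable sketch of label_prop: the update rule here (a weighted average of neighbor labels) is an assumption, since the slide's exact formula did not survive the transcript, and plain dictionaries stand in for the graph and scheduler.

likes  = {0: 1.0, 1: 0.0, 2: 0.0}          # Likes[i]: vertex data
weight = {(0, 1): 0.8, (1, 2): 0.4}        # Wij: edge data
nbrs   = {0: [1], 1: [0, 2], 2: [1]}       # graph structure

def w(i, j):
    return weight.get((i, j)) or weight.get((j, i))

def label_prop(i, scheduler):
    old = likes[i]
    # Update the vertex data (assumed rule: weighted average of neighbors).
    total = sum(w(i, j) for j in nbrs[i])
    likes[i] = sum(w(i, j) * likes[j] for j in nbrs[i]) / total
    # Reschedule neighbors only if this vertex's value actually changed.
    if abs(likes[i] - old) > 1e-3:
        scheduler.extend(nbrs[i])

pending = [1]                              # initial schedule
while pending:                             # repeat until the scheduler is empty
    label_prop(pending.pop(), pending)
print(likes)                               # labels have propagated from vertex 0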

Page 33

The GraphLab Framework

• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model

Page 34

The Scheduler

The scheduler determines the order in which vertices are updated.

[Figure: a queue of scheduled vertices (a-k) feeding update tasks to CPU 1 and CPU 2]

The process repeats until the scheduler is empty.
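A minimal sketch of that loop: a shared work queue feeding two worker threads (the figure's CPU 1 and CPU 2) until it drains. Races between overlapping updates are deliberately ignored here; the consistency slides that follow exist to eliminate exactly those.

import threading
from queue import Empty, Queue

work = Queue()
for v in "abcdef":                         # vertices waiting to be updated
    work.put(v)

def update_function(v):
    print(threading.current_thread().name, "updates", v)
    # A real update could call work.put(neighbor) to reschedule neighbors.

def worker():
    while True:
        try:
            v = work.get(timeout=0.1)      # stop once the scheduler is empty
        except Empty:
            return
        update_function(v)

cpus = [threading.Thread(target=worker, name=f"CPU {i}") for i in (1, 2)]
for t in cpus:
    t.start()
for t in cpus:
    t.join()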

Page 35

The GraphLab Framework

• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model

Page 36

Ensuring Race-Free Code

• How much can computation overlap?

Page 37

GraphLab Ensures Sequential Consistency

For each parallel execution, there exists a sequential execution of update functions which produces the same result.

[Figure: a parallel schedule on CPU 1 and CPU 2 shown equivalent to a single-CPU sequential schedule over time]

Page 38

Consistency Rules

Guaranteed sequential consistency for all update functions.

Page 39

Full Consistency


Page 40

Obtaining More Parallelism


Page 41

Edge Consistency

[Figure: under edge consistency, CPU 1 and CPU 2 update non-adjacent vertices; overlapping reads of shared neighbors are safe]

Page 42

Consistency Through R/W Locks

• Read/Write locks:
– Full consistency: write locks on the vertex and all of its neighbors (Write, Write, Write)
– Edge consistency: a write lock on the vertex, read locks on its neighbors (Read, Write, Read)
– Locks are acquired in a canonical lock ordering (see the sketch below)
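A sketch of edge consistency enforced with per-vertex locks acquired in canonical (ascending vertex-id) order, which is what prevents deadlock when two CPUs grab overlapping scopes. Python's standard library has no reader/writer lock, so plain mutexes stand in for both lock types here.

import threading

nbrs  = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
locks = {v: threading.Lock() for v in nbrs}     # one lock per vertex

def update_with_edge_consistency(vid, update_fn):
    # The scope of vid is vid plus its neighbors; always lock it in
    # ascending vertex-id order (the canonical ordering).
    scope = sorted([vid] + nbrs[vid])
    for v in scope:
        locks[v].acquire()
    try:
        update_fn(vid)              # write vid, read neighbors, race-free
    finally:
        for v in reversed(scope):
            locks[v].release()

update_with_edge_consistency(2, lambda v: print("updated", v))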

Page 43

Consistency Through Scheduling

• Edge Consistency Model:
– Two vertices can be updated simultaneously if they do not share an edge.
• Graph Coloring:
– Two vertices can be assigned the same color if they do not share an edge.
• Execution proceeds color by color: Phase 1, barrier, Phase 2, barrier, Phase 3 (see the sketch below).
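A sketch of that schedule: greedily color the graph, then update one color class per phase with a barrier between phases. Since adjacent vertices never share a color, no two simultaneous updates touch neighboring vertices, giving edge consistency without locks.

nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def greedy_color(nbrs):
    color = {}
    for v in nbrs:                      # smallest color unused by neighbors
        used = {color[n] for n in nbrs[v] if n in color}
        color[v] = next(c for c in range(len(nbrs)) if c not in used)
    return color

color  = greedy_color(nbrs)
phases = {}
for v, c in color.items():
    phases.setdefault(c, []).append(v)

for c in sorted(phases):
    # Every vertex in one phase could be updated in parallel, race-free.
    print("Phase", c + 1, ": update vertices", phases[c])
    # (barrier here: wait for all CPUs before starting the next phase)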