high performance big data analytics › group › ahpcrc › hewittvisit › ...for big data ! goal:...

26
HIGH PERFORMANCE BIG DATA ANALYTICS Kunle Olukotun Electrical Engineering and Computer Science Stanford University June 2, 2014

Upload: others

Post on 06-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

HIGH PERFORMANCE BIG DATA ANALYTICS

Kunle  Olukotun    

Electrical  Engineering  and  Computer  Science  Stanford  University  

 

 

June  2,  2014  

Page 2: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Explosion of Data Sources

n  “DoD is swimming in sensors and drowning in data” n  Challenge: enable discovery

n  Deliver the capability to mine, search and analyze this data in near real time

Sensors

Page 3: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

PROCESSING SYSTEMS FOR BIG DATA

n  Goal: Make high-performance data analytics easy to use

n  Manipulate big data sets in real time n  Make better decisions, the right conclusions n  Streaming data n  Low latency computation n  Interactive data exploration

n  Requires the full power of modern computing platforms n  Heterogeneous parallelism

Page 4: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Processing PERFORMANCE TODAY

Exe

cuti

on T

ime

Rel

ativ

e to

C++

Benchmarks: fib, parse_int, quicksort, mandel, pi_sum, rand_mat_stat, and rand_mat_mul

Page 5: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

PERFORMANCE Vs. EASE OF USE

Performance

Eas

e of

use

Matlab, R

MapReduce ︎

Spark ︎

GraphLab ︎

Pig, Hive ︎

Goal

Page 6: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Heterogeneous parallelism

Cluster

Multicore CPU

Graphics Processing Unit (GPU)

Programmable Logic

Page 7: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

MPI MapReduce

Verilog VHDL

CUDA OpenCL

Threads OpenMP

Expert PARALLEL programming

Cluster

Multicore CPU

Graphics Processing Unit (GPU)

Programmable Logic

Page 8: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

8

Applications

Heterogeneous Hardware

New Arch.

THe programmability GAP

Data Wrangling

Graph Analysis

Prediction Recommendation

Data Transformation

Page 9: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Applications

Heterogeneous Hardware

New Arch.

Scala Java

Python

Ruby C++

Clojure

Not enough semantic knowledge to compile

automatically

No restrictions

general-purpose languages

Data Wrangling

Graph Analysis

Prediction Recommendation

Data Transformation

Page 10: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Domain Specific Languages n  Domain Specific Languages (DSLs)

n  Programming language with restricted expressiveness for a particular domain

n  High-level, usually declarative, and deterministic

Page 11: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

High Performance DSLs for Data Analytics

Data Wrangling

Graph Analysis

Prediction Recommendation

Data Transformation

Data Query OptiQL

Graph Alg. OptiGraph

Machine Learning OptiML

Convex Opt. OptiCVX

Data Transform

OptiWrangle

Applications

Domain Specific

Languages

Heterogeneous Hardware

DSL

Compiler

New Arch.

DSL

Compiler

DSL

Compiler

DSL

Compiler

DSL

Compiler

Page 12: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Scaling the DSL Approach n  Many potential DSLs

n  How do we quickly create high-performance implementations for DSLs we care about?

n  Enable expert parallel programmers to easily create new DSLs n  Make optimization knowledge reusable n  Simplify the compiler generation process

n  A few DSL developers enable many more DSL users n  Leave expert programming to experts!

Page 13: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Delite Common DSL Infrastructure

Delite: dSL infrastructure

Data Wrangling

Graph Analysis

Prediction Recommendation

Data Transformation

Data Query OptiQL

Graph Alg. OptiGraph

Machine Learning OptiML

Convex Opt. OptiCVX

Data Transform

OptiWrangle

Applications

Domain Specific

Languages

Heterogeneous Hardware

DSL

Compiler

New Arch.

DSL

Compiler

DSL

Compiler

DSL

Compiler

DSL

Compiler

Page 14: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Delite DSL

Framework

Delite: dSL infrastructure

Graph Analysis

Prediction Recommendation

Data Transformation

Data Query OptiQL

Graph Alg. OptiGraph

Machine Learning OptiML

Data Transform

OptiWrangle

Applications

Domain Specific

Languages

Heterogeneous Hardware

DSL

Compiler

DSL

Compiler

DSL

Compiler

DSL

Compiler

Multicore GPU FPGA Cluster

Page 15: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Delite Overview Key elements n  Scala libraries on

steroids

n  Intermediate representation

n  Domain specific optimization

n  General parallelism and locality optimizations

n  Mapping to HW targets

Op?{CVX,  Graph,  ML,  QL,  Wrangle}  

Generic  analyses    &  

transforma?ons  

Domain  specific  analyses  &  

transforma?ons  

Code  generators  

domain ops domain data

parallel data Parallel patterns

Threads   OpenMP   CUDA   OpenCL   MPI   Verilog  

DSL  User  

DSL  Developer  Delite  Fram

ework  

Page 16: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

optiMl: a Dsl for machine learning n  Designed for Iterative Statistical Inference

n  e.g. SVMs, logistic regression, neural networks, etc. n  Dense/sparse vectors and matrices, message-passing

graphs, training/test sets

n  Mostly Functional n  Data manipulation with classic functional operators (map,

filter) and ML-specific ones (sum, vector constructor, untilconverged)

n  Math with MATLAB-like syntax (a*b, chol(..), exp(..))

n  Runs Anywhere n  Single source to multicore CPUs, GPUs, and clusters

Page 17: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

K-MEANs CLUSTERING PERFORMANCE

Op,ML   Parallelized  MATLAB   C++  1.0  

1.9  

3.6  

5.2  

10.8  

0.3  

0.3  

0.3  

0.3  

0.3  

1.6  

0.0  

2.0  

4.0  

6.0  

8.0  

10.0  

12.0  

1  CPU   2  CPU   4  CPU   8  CPU   CPU  +  GPU  

Speedu

p  

K-­‐means  

untilconverged(kMeans,  tol){kMeans  =>      val  clusters  =  samples.groupRowsBy  {  sample  =>            val  allDistances  =  kMeans.mapRows  {  mean  =>                      dist(sample,  mean)                          }            allDistances.minIndex      }      val  newKmeans  =  clusters.map(e  =>  e.sum  /  e.length)      newKmeans  }  

Page 18: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

MSM Builder Using OptiML with Vijay Pande

!

Markov State Models (MSMs) MSMs are a powerful means of modeling the structure and dynamics of molecular systems, like proteins

x86 ASM

high prod, low perf

low prod, high perf

high prod, high perf

Page 19: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

OptiGraph n  A DSL for large-scale graph analysis based on Green-Marl

n  A DSL for Real-world Graph Analysis n  Green-Marl: A DSL for Easy and Efficient Graph Analysis (Hong et. al.), ASPLOS ’12

n  Functional DSL n  No mutation, no explicit iteration

n  Data structures n  Graph (directed, undirected), node, edge, n  Set of nodes, edges, neighbors, …

n  Graph traversals n  Summation, Breadth-first traversal, …

n  Parallel reductions but no deferred assignment

n  Example applications written in framework n  Betweeness Centrality n  Directed/Undirected Triangle Counting n  PageRank

Page 20: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

Example: Betweenness Centrality

n  Betweenness Centrality n  A measure that tells how ‘central’ a

node is in the graph n  Used in social network analysis n  Definition

n  How many shortest paths are there between any two nodes going through this node.

Ayush K. Kehdekar

Kevin Bacon

High BC Low BC

[Image source; Wikipedia]

Page 21: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

OptiGraph Betweenness Centrality

 1.  val  bc  =  sum(g.nodes){  s  =>  2.      val  sigma  =  g.inBfOrder(s)  {  (v,prev_sigma)  =>  if(v==s)  1    

3.          else  sum(v.upNbrs){  w  =>  prev_sigma(w)}  4.      }  5.      val  delta  =  g.inRevBfOrder(s){  (v,prev_delta)  =>  6.          sum(v.downNbrs){  w  =>  

7.              sigma(v)/sigma(w)*(1+prev_delta(w))  8.          }  9.      }  10.    delta  

11.  }

Page 22: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

OptiGraph: Page Rank  1.  val  pr  =  untilConverged(0,threshold){  oldPR  =>  2.      g.mapNodes{  n  =>    

3.          ((1.0-­‐damp)/g.numNodes)  +  damp*sum(n.inNbrs){  w  =>  4.          oldPr(w)/w.outDegree  5.          }  6.      }  

7.  }{(curPR,oldPR)  =>  sum(abs(curPr-­‐oldPr))}  

Page 23: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

OptiGraph vs. GPS (Pregel) Higher Productivity

Algorithm OptiGraph (lines of code)

Native GPS (lines of code)

Average Teenage Follower (AvgTeen) 13 130

PageRank 11 110

Conductance (Conduct) 12 149

Single Source Shortest Paths (SSSP) 29 105

Random Bipartite Matching (Bipartite) 47 225

Approximate Betweeness Centrality 25 Not Available

DSL compiler automatically converts OptiGraph to GPS

Page 24: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

OptiGraph vs. Native GPS (Pregel) Similar Performance

25 x 4 = 100 cores

OptiGraph execution time relative to native GPS

Page 25: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

TRIANGLE COUNTING n  Microsoft Research, IBM, Logic Blox, MIT, and Oracle are trying

to make this application run fast n  Simple yet practical example of multi-way join–a fundamental

operation to any data analytics engine n  Serves as the building block to many graph mining applications

such as identifying cliques, graph transitivity, and clustering coefficients  

 1.  val  triangleCount  =  g.sumOverNodes{  n  =>  2.    sum(n.nbrs){nbr  =>  n.commonNbrs(nbr).size  }  3.  }  

Page 26: HIGH PERFORMANCE BIG DATA ANALYTICS › group › ahpcrc › HewittVisit › ...FOR BIG DATA ! Goal: Make high-performance data analytics easy to use ! Manipulate big data sets in

KeyS to TRIANGLE COUNTING PERFORMANCE

n  Data layout n  Hash n  Compressed sparse row n  Bit set n  Compressed bit set

n  Dynamically switch based on graph characteristics n  E.g. mesh vs. scale free (power law) n  Mixed layouts are best in some cases

n  Dynamic scheduling for load balance n  Performance range: 2x–160x better than Graphlab