big data ecosystem & the stratosphere project

Stratosphere: Above the Clouds

Stratosphere

Massively Parallel Analytics

Alexander Alexandrov, Stephan Ewen, Joseph Harjung, Fabian Hüske, Moritz Kaufmann, Aljoscha Krettek, Volker Markl, Kostas Tzoumas, Sebastian Schelter

Stratosphere – Parallel Analytics Beyond MapReduce

The Big Data Context

Large Quantities of Data

Diverse Data Structures

Complex Analysis Tasks

SQL NoSQL

NoMapReduce

Question 1: Is it faster to add a HiveQL parser and an HDFS adapter to your favorite parallel database, or to develop a parallel engine from scratch?

Question 2: Have we closed the circle (“we want SQL!”) or is there more in analytics?

scripting

XQuery +/-

scalable parallel sort

not a sorting problem!

column store --

a query plan

Question 3: How do we architect systems for the next wave of rich data analysis?

Commandments for Big Data Analytics

case class Vertex(id: Int, component: Int)
case class Edge(from: Int, to: Int)

val vertices = hdfsFile(…)
val edges = hdfsFile(…)

val result = step iterate (vertices distinctBy {_.id}, vertices)

def step = (s: Data[Vertex], ws: Data[Vertex]) => {
  // offer the component of every workset vertex to its neighbors
  val allNeighbors = ws join edges on {_.id} isEqualTo {_.from} using { (v, e) => Vertex(e.to, v.component) }
  // keep the smallest offered component per vertex
  val minNeighbors = allNeighbors reduceBy {_.id} (minBy _.component)
  // keep only vertices whose component actually improved
  val s1 = minNeighbors join s on {_.id} isEqualTo {_.id} using { (c, o) => if (c.component < o.component) Some(c) else None }
  (s1, s1)
}

(I) Thou shalt…

… use declarative languages!

Executive Summary

Connected components of a graph.

- Joins and aggregations on custom data types

- Incremental / Delta Iterations

- Mixture of operators and UDFs

(II) Thou shalt…

… accept external (dynamic) sources!

“In situ” data: no load

(III) Thou shalt…

… use rich primitives! (beyond MapReduce)

Reduce

CoGroup

(IV) Thou shalt…

… define queries and UDFs in the same language!

Query definition

(V) Thou shalt…

… use an algebraic but rich data model!

Custom object-oriented and functional data types

Use functions as references to fields/attributes

(VI) Thou shalt…

… optimize! Auto-parallelization and optimization à la relational databases.

(VII) Thou shalt…

… not treat UDFs as black boxes!

Static code analysis of UDFs to determine field accesses and modifications

Vastly increases optimization potential

(VIII) Thou shalt…

… iterate/recurse!

Step function

Needed for most interesting analysis cases

(IX) Thou shalt…

… exploit dynamic computation!

[Chart: Naïve (Bulk) vs. Incremental, per superstep; y-axis from 0 to 1,400,000]

Pregel as a Stratosphere plan with comparable performance.

(X) Thou shalt…

… use a scalable and efficient execution engine!

Pipeline and data parallelism, flexible checkpointing, optimized network data transfers

Write like a programming language

Execute like a database

Add a bit of "languages and compilers" sauce to the database stack…

Stratosphere Programming Stack

Nephele Dataflow Engine

Runtime Operators

Sopremo Compiler

Meteor Script

Scala-Compiler Plugin

Stratosphere Optimizer

Nephele Parallel Dataflow

PACT Program

Layered approach – several entry points to the system

Pact program

Scala program

Scala compiler plug-in

Runtime: hash- and sort-based out-of-core operator implementations, memory management

Stratosphere optimizer: picks data shipping and local strategies, operator order

Execution plan

Nephele Execution Engine: task scheduling, network data transfers, resource allocation, checkpointing

Job graph

Execution graph

PARALLEL PROGRAMMING MODEL

Part 1

Background: PACTs

D. Battré, S. Ewen, F. Hueske, O. Kao, V. Markl, D. Warneke: Nephele/PACTs: a programming model and execution framework for web-scale analytical processing

[Diagram: Data → second-order function, wrapping a first-order function (UDF) → Data]

Map, Reduce, Cross, Match, CoGroup

■ Data flow operators (UDFs) are first-order functions

■ Application of UDFs to the data through second-order functions that define parallel semantics

■ Declarative, as execution strategies are not fixed
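
The split between first-order UDFs and second-order functions can be illustrated on a single machine. The sketch below is not the Stratosphere API: the object name PactSketch, the method matchOn and all signatures are made up for illustration, and ordinary Scala collections stand in for parallel data sets. Each second-order function only fixes which records reach a UDF call together, and says nothing about how the grouping or joining is executed.

```scala
// Single-machine sketch of the five second-order functions.
// All names and signatures are illustrative assumptions, not Stratosphere's API.
object PactSketch {
  // Map: call the UDF once per record.
  def map[A, B](in: Seq[A])(udf: A => Seq[B]): Seq[B] =
    in.flatMap(udf)

  // Reduce: group records by key and call the UDF once per group.
  def reduce[A, K, B](in: Seq[A])(key: A => K)(udf: (K, Seq[A]) => Seq[B]): Seq[B] =
    in.groupBy(key).toSeq.flatMap { case (k, group) => udf(k, group) }

  // Cross: call the UDF on every pair of the Cartesian product.
  def cross[A, B, C](l: Seq[A], r: Seq[B])(udf: (A, B) => Seq[C]): Seq[C] =
    for (a <- l; b <- r; c <- udf(a, b)) yield c

  // Match: call the UDF on every pair of records with equal keys (equi-join).
  def matchOn[A, B, K, C](l: Seq[A], r: Seq[B])(kl: A => K, kr: B => K)(udf: (A, B) => Seq[C]): Seq[C] = {
    val index = r.groupBy(kr)
    for (a <- l; b <- index.getOrElse(kl(a), Seq.empty); c <- udf(a, b)) yield c
  }

  // CoGroup: call the UDF once per key with both groups; either group may be empty.
  def coGroup[A, B, K, C](l: Seq[A], r: Seq[B])(kl: A => K, kr: B => K)(udf: (K, Seq[A], Seq[B]) => Seq[C]): Seq[C] = {
    val lg = l.groupBy(kl)
    val rg = r.groupBy(kr)
    (lg.keySet ++ rg.keySet).toSeq.flatMap { k =>
      udf(k, lg.getOrElse(k, Seq.empty), rg.getOrElse(k, Seq.empty))
    }
  }

  def main(args: Array[String]): Unit = {
    // Word count: Map emits (word, 1), Reduce counts the records per word.
    val words = Seq("a", "b", "a")
    val counts = reduce(map(words)(w => Seq((w, 1))))(_._1)((w, g) => Seq((w, g.size)))
    println(counts.sortBy(_._1))
  }
}
```

Here matchOn happens to build a hash index, but a sort-merge implementation would satisfy the same contract, which is the sense in which the model leaves execution strategies open.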

Background: PACTs

Reduce (on A): sum(B), avg(C)

Match (A = D): if (A>3) emit

Map: C := max(A,B)

Map: if (D>4) emit

Sink 1

Source 1: Extract (A,B)

Source 2: Extract (D,E)

Iterative Programs

S. Ewen, K. Tzoumas, M. Kaufmann, V. Markl: Spinning Fast Iterative Data Flows. PVLDB 5(11), 2012

Bulk Iteration (PageRank)

Inputs: (pid, r) and (pid, tid, p)

Join P and A: Match (on pid), emits (tid, k = r*p)

Sum up partial ranks: Reduce (on tid), emits (pid = tid, r = ∑ k)

Incremental Iteration (Connected Components)

Match: inputs (v1, v2) and (vid, cid), emits (v2, cid)

CoGroup: inputs [(vid, cid)] and (vid, cid), emits (vid, cid)

Results: Wi+1, Di+1
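
To make the bulk iteration concrete, here is a minimal single-machine sketch of the PageRank step named on the slide: a Match on pid that emits partial ranks k = r*p, followed by a Reduce on tid that sums them. The names BulkPageRank, Rank, Link, step and iterate are assumptions chosen for this example, plain Scala collections replace the parallel data flow, and no damping factor is applied because the slide shows none.

```scala
// Sketch of a bulk iteration: the full rank vector is recomputed in every superstep.
// Names are illustrative assumptions; this is not Stratosphere code.
object BulkPageRank {
  case class Rank(pid: Int, r: Double)
  case class Link(pid: Int, tid: Int, p: Double) // transition pid -> tid with probability p

  // One superstep: "Join P and A" followed by "Sum up partial ranks".
  def step(ranks: Seq[Rank], links: Seq[Link]): Seq[Rank] = {
    val byPid = ranks.map(x => x.pid -> x.r).toMap
    // Match (on pid): every link of a ranked page emits (tid, k = r * p).
    val partial = for (l <- links; r <- byPid.get(l.pid).toSeq) yield (l.tid, r * l.p)
    // Reduce (on tid): sum the partial ranks per target page.
    partial.groupBy(_._1).toSeq
      .map { case (tid, ks) => Rank(tid, ks.map(_._2).sum) }
      .sortBy(_.pid)
  }

  // Pages without incoming links drop out of the result in this simplified version.
  def iterate(ranks: Seq[Rank], links: Seq[Link], supersteps: Int): Seq[Rank] =
    (1 to supersteps).foldLeft(ranks)((cur, _) => step(cur, links))

  def main(args: Array[String]): Unit = {
    val links = Seq(Link(1, 2, 1.0), Link(2, 1, 0.5), Link(2, 2, 0.5))
    val start = Seq(Rank(1, 0.5), Rank(2, 0.5))
    println(iterate(start, links, 2))
  }
}
```

Every superstep touches all links and all ranks, whether or not a rank still changes; that is the property the incremental variant avoids.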

How does it look in code

val result = step iterate (vertices distinctBy {_.id}, messages)

def step = (s: Data[Vertex], ws: Data[Message]) => {
  val sNext = ws join s on {…} isEqualTo {…} using {…}
  val wNext = sNext join edges on …
  (sNext, wNext)
}
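
The solution-set/workset pattern can be tried out without a cluster. The following sketch runs Connected Components as an incremental iteration on plain Scala collections; DeltaConnectedComponents and run are hypothetical names, the edge list is assumed to contain both directions of every undirected edge, and every edge endpoint is assumed to appear in the vertex list. It also records the workset size per superstep, which is the quantity that shrinks in an incremental iteration.

```scala
// Single-machine sketch of an incremental (delta) iteration for Connected Components.
// Assumptions: edges are listed in both directions; all endpoints occur in `vertices`.
object DeltaConnectedComponents {
  case class Vertex(id: Int, component: Int)
  case class Edge(from: Int, to: Int)

  // Returns the final solution set (vertex id -> component) and the workset size per superstep.
  def run(vertices: Seq[Vertex], edges: Seq[Edge]): (Map[Int, Int], List[Int]) = {
    val out = edges.groupBy(_.from) // loop-invariant adjacency index
    var solution = vertices.map(v => v.id -> v.component).toMap
    var workset = vertices
    var sizes = List.empty[Int]
    while (workset.nonEmpty) {
      sizes = sizes :+ workset.size
      // Join workset with edges: offer each changed component to the neighbors.
      val candidates =
        for (v <- workset; e <- out.getOrElse(v.id, Seq.empty)) yield Vertex(e.to, v.component)
      // Reduce per vertex: keep the smallest offered component.
      val best = candidates.groupBy(_.id).values.map(_.minBy(_.component))
      // Join with the solution set: keep only real improvements.
      val delta = best.filter(c => c.component < solution(c.id)).toSeq
      solution = solution ++ delta.map(c => c.id -> c.component)
      workset = delta // only changed vertices are processed in the next superstep
    }
    (solution, sizes)
  }

  def main(args: Array[String]): Unit = {
    val vs = (1 to 5).map(i => Vertex(i, i))
    val es = Seq((1, 2), (2, 3), (4, 5)).flatMap { case (a, b) => Seq(Edge(a, b), Edge(b, a)) }
    val (components, worksetSizes) = run(vs, es)
    println(components.toSeq.sortBy(_._1))
    println(worksetSizes)
  }
}
```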

Incremental Iterations matter…

[Chart: Naïve (Bulk) vs. Incremental; x-axis: superstep (0 to 33); y-axis: 0 to 1,400,000]

[Chart: Naïve (Bulk) vs. Incremental on Twitter and Webbase (20)]

Changes to the iteration's result for Connected Components in each superstep…

… and runtime.

Pregel as a Pact program

THE PROGRAM COMPILER AND OPTIMIZER

Part 2

Why an Optimizer for such Programs?

Do you want to hand-optimize that?

■ Cost-based optimizer produces physical execution plan given PACT program
  □ Annotates data channels with distribution patterns, e.g., broadcast, partition
  □ Chooses physical execution strategies (e.g., hash/sort)
  □ Reorders PACT functions
  Deeply embeds MapReduce-style UDFs in the optimization

■ Optimization of iterative programs
  □ Passing data between supersteps
  □ Loop-invariant data
  □ Efficient state maintenance in partitioned indexes

■ Challenge: Semantics of user-defined functions unknown

Pact Optimizer Overview

Current architecture

1) Analyze

2) Reorder

3) Parallelize

1) Opening the Black Boxes …

Analyze user code to discover:

■ Read set Rf: Attributes of the input record(s) that might influence output

■ Write set Wf: Attributes of the output record(s) that might have different values from respective input attributes

■ Emit cardinality Ef: Bounds on records emitted per call (1, >1, …)

(Rf,Wf,Ef)

void match(Record left, Record right, Collector col) {
  Record out = copy(left);
  if (left.get(0) > 3) {
    double a = right.get(2);
    out.set(2, 1.0 / a);
  }
  out.set(1, 42);
  out.set(3, right.get(0));
  out.set(4, right.get(1));
  out.set(5, right.get(2));
  col.emit(out);
}

… via Static Code Analysis

Feasible:
1. No control flow between operators
2. Record data model, fixed API

Correct:
■ Difficulty comes from different code paths
■ Correctness guaranteed through conservatism
■ Add to R, W when in doubt

Conditions for reordering UDFs

Enabled optimizations:
■ Selection push-down
■ (Bushy) join reordering
■ Aggregation push-down
  □ Equivalent to invariant grouping transformation [Chaudhuri & Shim 1994]
■ Reordering of non-relational Reduce functions

Theorem 1: Two Map operators can be reordered if their UDFs have only read-read conflicts

Theorem 2: For a Map and a Reduce, we need in addition the Reduce key groups to be preserved
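
The two theorems can be read as simple set checks over the read and write sets. The sketch below is one possible reading, not the optimizer's implementation: ReorderCheck, UdfProps and both function names are invented for this example, fields are identified by integer position, and the Map/Reduce check uses a deliberately conservative sufficient condition (the Map leaves the key untouched and emits exactly one record per input), which is an assumption of this sketch.

```scala
// Sketch of the reordering conditions as set checks. All names are illustrative assumptions.
object ReorderCheck {
  // Conservative summary of a UDF: fields it may read, fields it may modify,
  // and whether it is known to emit exactly one record per call.
  case class UdfProps(read: Set[Int], write: Set[Int], emitsExactlyOne: Boolean = true)

  // "Only read-read conflicts": neither UDF writes a field the other reads or writes.
  def onlyReadReadConflicts(f: UdfProps, g: UdfProps): Boolean =
    (f.write & g.read).isEmpty &&
    (g.write & f.read).isEmpty &&
    (f.write & g.write).isEmpty

  // Map/Reduce: additionally require that key groups are preserved.
  // Assumed sufficient condition: the Map does not write key fields and emits one record per input.
  def canSwapMapReduce(map: UdfProps, reduce: UdfProps, reduceKey: Set[Int]): Boolean =
    onlyReadReadConflicts(map, reduce) &&
    (map.write & reduceKey).isEmpty &&
    map.emitsExactlyOne

  def main(args: Array[String]): Unit = {
    val f = UdfProps(read = Set(0), write = Set(2))
    val g = UdfProps(read = Set(1), write = Set(3))
    val h = UdfProps(read = Set(2), write = Set(4))
    println(onlyReadReadConflicts(f, g)) // true: no field written by one is touched by the other
    println(onlyReadReadConflicts(f, h)) // false: f writes field 2, which h reads
  }
}
```

Because the sets come from conservative code analysis, a false answer only means the reordering could not be proven safe.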

■ Simple enumeration algorithm that checks pairwise reordering for all neighboring operators

■ Current problem: Walking all points in the search space

■ Next: Deduce join-graph-like information from reordering degrees-of-freedom

Optimizer Architecture (I)

■ Operators are defined in terms of possible global data properties (partitioning/replication/...) and local data properties (order/grouping/uniqueness/...)

■ Nodes propagate requested properties top-down
  □ Filtered by UDF's field modification
  □ Filtered by incompatibility
  □ Every data flow edge has a set of possible requested properties

■ Requested properties are instantiated at each point
  □ Global properties by exchange strategies
  □ Local properties by local operators

■ Requested properties used for pruning candidates (as with interesting properties)

Optimizer Architecture (II)

■ Determine static and dynamic data flow paths for iterations
  □ Static path contains data that is loop-invariant

■ Use heuristics to place caches such that loop-invariant computations are not repeated
  □ Cache loop-invariant data also in ordered form, or as hash tables

■ Weigh costs for static and dynamic path differently
  □ Optimizer favors plans that "push" work into static path
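
The effect of caching loop-invariant data as a hash table can be seen in a small stand-alone example. This is only an illustration of the idea, not of the optimizer: LoopInvariantCache and reachable are hypothetical names, the iteration is a simple reachability computation chosen for brevity, and the second return value counts how often the hash table over the (loop-invariant) edge list is built.

```scala
// Sketch: the edge list is on the static path, the frontier on the dynamic path.
// Names and the reachability example are assumptions made for illustration.
object LoopInvariantCache {
  // Returns the vertices reachable from `start` and the number of hash-table builds.
  def reachable(start: Int, edges: Seq[(Int, Int)], cacheIndex: Boolean): (Set[Int], Int) = {
    var builds = 0
    def buildIndex(): Map[Int, Seq[Int]] = {
      builds += 1
      edges.groupBy(_._1).map { case (from, es) => from -> es.map(_._2) }
    }
    // With caching, the hash table is built once, before the loop.
    val cached = if (cacheIndex) Some(buildIndex()) else None
    var seen = Set(start)
    var frontier = Set(start)
    while (frontier.nonEmpty) {
      // Without caching, the same hash table is rebuilt in every superstep.
      val index = cached.getOrElse(buildIndex())
      frontier = frontier.flatMap(v => index.getOrElse(v, Seq.empty)) -- seen
      seen ++= frontier
    }
    (seen, builds)
  }

  def main(args: Array[String]): Unit = {
    val edges = Seq((1, 2), (2, 3), (3, 4))
    println(reachable(1, edges, cacheIndex = true))
    println(reachable(1, edges, cacheIndex = false))
  }
}
```

Both variants return the same result; only the amount of repeated work on the static path differs, which is why a cost model that weighs the two paths differently prefers the cached plan.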

Optimizer Architecture (III)

PageRank: Two Optimizer Plans

[Figure: two optimizer plans for one PageRank iteration over the inputs (pid, r) and (pid, tid, p). Both plans join P and A and then sum up partial ranks after part./sort (tid).]

Plan 1 labels: broadcast, buildHashTable (pid), probeHashTable (pid)

Plan 2 labels: partition (pid) (twice), buildHashTable (pid), CACHE, probeHashTable (pid)

THE FUNCTIONAL LANGUAGE COMPILATION

Part 3

The Compiler Mismatch

The Database Approach (Query Compiler): Parser/Checker → Optimizer → Code Generation → Runtime

Code generation AFTER context of operation is fixed.

UDF Systems: MapReduce & Stratosphere (original) (Language Compiler): Parser/Checker → Code Generation → Optimizer → Runtime

Code generation BEFORE context of operation is fixed.

The Program Compilation Pipeline

Program Code

Parser/Checker

ByteCode Generator

Analyzer and Code Generator

Global Schema Generator

Pact Optimizer

Program Instantiation

Schema and Code Finalization

Parallel Data Flow Generator

Parallel Data Flow

Language Compiler

1) Analyzing Data Types

■ Supported Types
  □ Primitive (Integers, Floating-Point, Strings, …), Lists, Tuples, Product Types (classes), Summation Types (class hierarchies), Recursive Types

■ Data types are logically flattened
  □ Some fields are transparent members of the flat model, some are black box members

■ Transparent members may be referenced in selector functions

■ Selector Functions are likewise analyzed and translated into logical positions
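
Logical flattening and the translation of selectors into positions can be sketched with a toy type model. Nothing here is taken from the actual compiler plug-in: FlattenSketch, Prim, BlackBox, Record, flatten and position are hypothetical names, selectors are written as dotted paths rather than analyzed from function bodies, and only primitive leaves are treated as transparent.

```scala
// Toy model of logical flattening. All names are illustrative assumptions.
object FlattenSketch {
  sealed trait Type
  case object Prim extends Type                               // transparent leaf (e.g. Int, String)
  case object BlackBox extends Type                           // opaque member (e.g. a recursive type)
  case class Record(fields: List[(String, Type)]) extends Type // product type (class)

  // Depth-first flattening: every leaf receives the next logical position.
  def flatten(t: Type, prefix: String = ""): List[(String, Type)] = t match {
    case Record(fields) =>
      fields.flatMap { case (name, fieldType) =>
        flatten(fieldType, if (prefix.isEmpty) name else prefix + "." + name)
      }
    case leaf => List(prefix -> leaf)
  }

  // Translate a selector path such as "to.id" into its logical position.
  // Black-box members and whole records cannot be selected.
  def position(t: Type, selector: String): Option[Int] = {
    val i = flatten(t).indexWhere { case (path, leaf) => path == selector && leaf == Prim }
    if (i >= 0) Some(i) else None
  }

  def main(args: Array[String]): Unit = {
    val vertex = Record(List("id" -> Prim, "component" -> Prim))
    val edge = Record(List("from" -> vertex, "to" -> vertex, "payload" -> BlackBox))
    println(flatten(edge).map(_._1))
    println(position(edge, "to.id"))
  }
}
```

Storing positions rather than names is what allows serializer and comparator code to be generated as templates first and finalized once the physical layout is known.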

2) Generating Glue Code

■ User Code is pure Scala, no Stratosphere specific types, interfaces

■ Wrapper code necessary to run it as a UDF in Stratosphere

■ Serializer/Comparator Code is generated as a template (omitting exact field positions, storing logical positions)

■ Code is inserted by modifying the program's Abstract-Syntax-Tree

3) Schema Generation

■ Schema generated from logical flattened model

■ Each field in every operator’s result gets a unique name
  □ Unless exact copy of an input field (info from code analysis)

■ Run Stratosphere optimizer
  □ Potentially reorders functions

■ Prune unused fields early
  □ Information whether fields are accessed by UDF from code analysis

■ Create physical data layout

■ Finalize serializer / comparator code

Some preliminary results...

Related Work

■ MapReduce
■ Pig, JAQL, Hive
■ AQL
■ Scope
■ Datalog for Machine Learning
■ BOOM
■ Twister / HaLoop
■ Spark
■ Naiad
■ Flume Java / Plume Java
■ Scalops
■ Jet
■ LINQ
