dr. kostas tzoumas: big data looks tiny from stratosphere at big data beers (nov. 20, 2013)

Big Data looks tiny from Stratosphere

Kostas Tzoumas

Data is an important asset

video & audio streams, sensor data, RFID, GPS, user online behavior, scientific simulations, web archives, ...

Volume: Handle petabytes of data
Velocity: Handle high data arrival rates
Variety: Handle many heterogeneous data sources
Veracity: Handle inherent uncertainty of data

Data

Analysis

Four “I”s for Big Analysis

text mining, interactive and ad hoc analysis, machine learning, graph analysis, statistical algorithms

Iterative: Model the data, do not just describe it
Incremental: Maintain the model under high arrival rates
Interactive: Step-by-step data exploration on very large data
Integrative: Fluent unified interfaces for different data models

Hadoop

Hadoop’s selling point is its low effective storage cost.

Hadoop clusters are becoming a data vortex, attracting cross-departmental data and changing the data usage culture in companies.

Hadoop MapReduce was the wrong abstraction and implementation to begin with and will be superseded by better systems.

Advanced Analytics

Analytics that model the data to reveal hidden relationships, not just describe the data.

E.g., machine learning, predictive stats, graph analysis

Increasingly important from a market perspective.

Very different from SQL analytics: different languages and access patterns (iterative vs. one-pass programs).

The Hadoop toolchain is poor; R, Matlab, etc. are not parallel.

MapReduce

NoMapReduce

SQL

BigSQL

BigAnalytics

[Figure annotations: scripting; SQL--; column store++; scalable parallel sort; a query plan; XQuery? wrong platform]

Data Scientist: The Sexiest Job of the 21st Century

Meet the people who can coax treasure out of messy, unstructured data. by Thomas H. Davenport and D.J. Patil

When Jonathan Goldman arrived for work in June 2006 at LinkedIn, the business networking site, the place still felt like a start-up. The company had just under 8 million accounts, and the number was growing quickly as existing members invited their friends and colleagues to join. But users weren’t

seeking out connections with the people who were already on the site at the rate executives had expected. Something was apparently miss-ing in the social experience. As one LinkedIn manager put it, “It was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink—and you probably leave early.”

Harvard Business Review, October 2012, p. 70 (Spotlight on Big Data)

FROM (
  FROM pv_users
  MAP pv_users.userid, pv_users.date
  USING 'map_script'
  AS dt, uid
  CLUSTER BY dt) map_output
INSERT OVERWRITE TABLE pv_users_reduced
  REDUCE map_output.dt, map_output.uid
  USING 'reduce_script'
  AS date, count;

A = load 'WordcountInput.txt';
B = MAPREDUCE wordcount.jar store A into 'inputDir' load 'outputDir' as (word:chararray, count: int) 'org.myorg.WordCount inputDir outputDir';
C = sort B by count;

Taken from http://www.oracle.com/technetwork/java/jvmls2013vitek-2013524.pdf

Hadoop is...*

1. A programming model called MapReduce
2. An implementation of said programming model, called Hadoop MapReduce
3. A file system, called HDFS
4. A resource manager, called Yarn
5. Interfaces to Hadoop MapReduce (Pig, Hive, Cascading, ...)
6. An ML library called Mahout.
7. Recently, a collection of runtime systems (Tez, Impala, Spark, Stratosphere, ...)

Current State of Big Data
● Big Data commonly defined by the V’s
● Widespread adoption of Apache Hadoop as the foundation
  ○ Many high-level programming models such as Pig, Hive, Cascading
● New systems on the horizon: Impala, Tez, Spark, Stratosphere

* Inspired by Jens Dittrich

1. A programming model called MapReduce

val input = TextFile(textInput)
val words = input.flatMap { line => line.split(" ") }
val counts = words.groupBy { word => word }.count()
val output = counts.write(wordsOutput, CsvOutputFormat())

“Romeo, Romeo, wherefore art thou Romeo?”

map(“Romeo, Romeo, wherefore art thou Romeo?”) = [ (Romeo,1), (Romeo,1), (wherefore,1), (art,1), (thou,1), (Romeo,1) ]

reduce( (Romeo,(1,1,1)), (wherefore,1), (art,1), (thou,1) ) = [ (Romeo,3), (wherefore,1), (art,1), (thou,1) ]
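The map and reduce steps of the Romeo example can be spelled out in plain Java. This is only a single-machine sketch of the programming model, not Hadoop or Stratosphere code: the class name WordCountSketch, the explicit shuffle method, and the tokenizing rule (split on anything that is not a letter) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine sketch of the MapReduce programming model.
final class WordCountSketch {

    // map: one input line -> a list of (word, 1) pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Assumption: a word is a maximal run of letters; punctuation separates words.
        for (String token : line.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // shuffle: collect all values that share a key (a real framework does this step)
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            groups.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return groups;
    }

    // reduce: one key with all of its values -> a single count
    static int reduce(String word, List<Integer> ones) {
        int sum = 0;
        for (int one : ones) {
            sum += one;
        }
        return sum;
    }

    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        shuffle(pairs).forEach((word, ones) -> counts.put(word, reduce(word, ones)));
        return counts;
    }

    public static void main(String[] args) {
        // prints {Romeo=3, art=1, thou=1, wherefore=1}
        System.out.println(wordCount(List.of("Romeo, Romeo, wherefore art thou Romeo?")));
    }
}
```

The user writes only map and reduce; grouping by key is the framework's job, which is why the model fits grouping and counting so naturally.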

2. An implementation of said programming model, called Hadoop MapReduce

[Figure: word count executed as a Hadoop MapReduce job with two Map tasks and two Reduce tasks]

Map 1: “Romeo, Romeo, wherefore art thou Romeo?” -> (Romeo, 1) (Romeo, 1) (wherefore, 1) (art, 1) (thou, 1) (Romeo, 1)
Map 2: “What, art thou hurt?” -> (What, 1) (art, 1) (thou, 1) (hurt, 1)

Reduce 1: (Romeo, (1,1,1)) (art, (1,1)) (thou, (1,1)) -> (Romeo, 3) (art, 2) (thou, 2)
Reduce 2: (wherefore, 1) (What, 1) (hurt, 1) -> (wherefore, 1) (What, 1) (hurt, 1)

Annotations: Data written to disk. Data shuffled over network.

public class ReduceSideBookAndAuthorJoin extends HadoopJob {

  private static final Pattern SEPARATOR = Pattern.compile("\t");

  @Override
  public int run(String[] args) throws Exception {

    Map<String,String> parsedArgs = parseArgs(args);

    Path authors = new Path(parsedArgs.get("--authors"));
    Path books = new Path(parsedArgs.get("--books"));
    Path outputPath = new Path(parsedArgs.get("--output"));

    Job join = new Job(new Configuration(getConf()));
    Configuration jobConf = join.getConfiguration();

    MultipleInputs.addInputPath(join, authors, TextInputFormat.class, ConvertAuthorsMapper.class);
    MultipleInputs.addInputPath(join, books, TextInputFormat.class, ConvertBooksMapper.class);

    join.setMapOutputKeyClass(SecondarySortedAuthorID.class);
    join.setMapOutputValueClass(AuthorOrTitleAndYearOfPublication.class);
    jobConf.setBoolean("mapred.compress.map.output", true);

    join.setReducerClass(JoinReducer.class);
    join.setOutputKeyClass(Text.class);
    join.setOutputValueClass(NullWritable.class);
    join.setJarByClass(JoinReducer.class);
    join.setJobName("reduceSideBookAuthorJoin");

    join.setOutputFormatClass(TextOutputFormat.class);
    jobConf.set("mapred.output.dir", outputPath.toString());

    join.setGroupingComparatorClass(SecondarySortedAuthorID.GroupingComparator.class);
    join.waitForCompletion(true);

    return 0;
  }

  static class ConvertAuthorsMapper
      extends Mapper<Object,Text,SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication> {
    @Override
    protected void map(Object key, Text value, Context ctx) throws IOException, InterruptedException {
      String line = value.toString();
      if (line.length() > 0) {
        String[] tokens = SEPARATOR.split(line.toString());
        long authorID = Long.parseLong(tokens[0]);
        String author = tokens[1];
        ctx.write(new SecondarySortedAuthorID(authorID, true), new AuthorOrTitleAndYearOfPublication(author));
      }
    }
  }

  static class ConvertBooksMapper
      extends Mapper<Object,Text,SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication> {
    @Override
    protected void map(Object key, Text line, Context ctx) throws IOException, InterruptedException {
      String[] tokens = SEPARATOR.split(line.toString());
      long authorID = Long.parseLong(tokens[0]);
      short yearOfPublication = Short.parseShort(tokens[1]);
      String title = tokens[2];
      ctx.write(new SecondarySortedAuthorID(authorID, false), new AuthorOrTitleAndYearOfPublication(title,
          yearOfPublication));
    }
  }

  static class JoinReducer
      extends Reducer<SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication,Text,NullWritable> {
    @Override
    protected void reduce(SecondarySortedAuthorID key, Iterable<AuthorOrTitleAndYearOfPublication> values, Context ctx)
        throws IOException, InterruptedException {
      String author = null;
      for (AuthorOrTitleAndYearOfPublication value : values) {
        if (author == null && !value.containsAuthor()) {
          throw new IllegalStateException("No author found for book: " + value.getTitle());
        } else if (author == null && value.containsAuthor()) {
          author = value.getAuthor();
        } else {
          ctx.write(new Text(author + '\t' + value.getTitle() + '\t' + value.getYearOfPublication()),
              NullWritable.get());
        }
      }
    }
  }

  static class SecondarySortedAuthorID implements WritableComparable<SecondarySortedAuthorID> {

    private boolean containsAuthor;
    private long id;

    static {
      WritableComparator.define(SecondarySortedAuthorID.class, new SecondarySortComparator());
    }

    SecondarySortedAuthorID() {}

    SecondarySortedAuthorID(long id, boolean containsAuthor) {
      this.id = id;
      this.containsAuthor = containsAuthor;
    }

    @Override
    public int compareTo(SecondarySortedAuthorID other) {
      return ComparisonChain.start()
          .compare(id, other.id)
          .result();
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeBoolean(containsAuthor);
      out.writeLong(id);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      containsAuthor = in.readBoolean();
      id = in.readLong();
    }

    @Override
    public boolean equals(Object o) {
      if (o instanceof SecondarySortedAuthorID) {
        return id == ((SecondarySortedAuthorID) o).id;
      }
      return false;
    }

    @Override
    public int hashCode() {
      return Longs.hashCode(id);
    }

    static class SecondarySortComparator extends WritableComparator implements Serializable {

      protected SecondarySortComparator() {
        super(SecondarySortedAuthorID.class, true);
      }

      @Override
      public int compare(WritableComparable a, WritableComparable b) {
        SecondarySortedAuthorID keyA = (SecondarySortedAuthorID) a;
        SecondarySortedAuthorID keyB = (SecondarySortedAuthorID) b;

        return ComparisonChain.start()
            .compare(keyA.id, keyB.id)
            .compare(!keyA.containsAuthor, !keyB.containsAuthor)
            .result();
      }
    }

    static class GroupingComparator extends WritableComparator implements Serializable {

      protected GroupingComparator() {
        super(SecondarySortedAuthorID.class, true);
      }
    }

  }

  static class AuthorOrTitleAndYearOfPublication implements Writable {

    private boolean containsAuthor;
    private String author;
    private String title;
    private Short yearOfPublication;

    AuthorOrTitleAndYearOfPublication() {}

    AuthorOrTitleAndYearOfPublication(String author) {
      this.containsAuthor = true;
      this.author = Preconditions.checkNotNull(author);
    }

    AuthorOrTitleAndYearOfPublication(String title, short yearOfPublication) {
      this.containsAuthor = false;
      this.title = Preconditions.checkNotNull(title);
      this.yearOfPublication = yearOfPublication;
    }

    public boolean containsAuthor() {
      return containsAuthor;
    }

    public String getAuthor() {
      return author;
    }

    public String getTitle() {
      return title;
    }

    public Short getYearOfPublication() {
      return yearOfPublication;
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeBoolean(containsAuthor);
      if (containsAuthor) {
        out.writeUTF(author);
      } else {
        out.writeUTF(title);
        out.writeShort(yearOfPublication);
      }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      author = null;
      title = null;
      yearOfPublication = null;
      containsAuthor = in.readBoolean();
      if (containsAuthor) {
        author = in.readUTF();
      } else {
        title = in.readUTF();
        yearOfPublication = in.readShort();
      }
    }
  }

}

Hand-coded join in Hadoop MapReduce
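Stripped of the Hadoop plumbing, the hand-coded join does three things: tag each record with its origin, sort so that the author record precedes its books, and walk each group once. The following is a minimal in-memory sketch of that logic only. The class ReduceSideJoinSketch, its Tagged helper, and the sample authors and books are invented for illustration and are not part of the slide's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory sketch of a reduce-side join with secondary sort.
final class ReduceSideJoinSketch {

    // One tagged record: the join key plus either an author name or a book title and year.
    static final class Tagged {
        final long authorId;
        final boolean isAuthor;
        final String text;
        final int year;

        Tagged(long authorId, boolean isAuthor, String text, int year) {
            this.authorId = authorId;
            this.isAuthor = isAuthor;
            this.text = text;
            this.year = year;
        }
    }

    static Tagged author(long id, String name) {
        return new Tagged(id, true, name, 0);
    }

    static Tagged book(long id, String title, int year) {
        return new Tagged(id, false, title, year);
    }

    static List<String> join(List<Tagged> mapOutput) {
        List<Tagged> sorted = new ArrayList<>(mapOutput);
        // Secondary sort: by author id first; within one id the author record sorts before its books.
        sorted.sort((a, b) -> {
            int byId = Long.compare(a.authorId, b.authorId);
            return byId != 0 ? byId : Boolean.compare(!a.isAuthor, !b.isAuthor);
        });

        List<String> joined = new ArrayList<>();
        String author = null;
        long currentId = 0;
        boolean first = true;
        for (Tagged t : sorted) {
            if (first || t.authorId != currentId) {
                // A new group begins (one reduce call in the Hadoop version).
                author = null;
                currentId = t.authorId;
                first = false;
            }
            if (t.isAuthor) {
                author = t.text;
            } else if (author == null) {
                throw new IllegalStateException("No author found for book: " + t.text);
            } else {
                joined.add(author + "\t" + t.text + "\t" + t.year);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Made-up sample data, given in arbitrary order as map output would be.
        List<Tagged> mapOutput = List.of(
                book(2, "Faust", 1808),
                author(1, "Shakespeare"),
                book(1, "Hamlet", 1603),
                author(2, "Goethe"));
        join(mapOutput).forEach(System.out::println);
    }
}
```

The comparator mirrors the slide's SecondarySortComparator: negating the author flag makes the author record the first value of its group, so the reducer needs to remember only one name at a time.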

5. Interfaces to Hadoop MapReduce (Pig, Hive, Cascading, ...)

[Figure: three Map and Reduce stages chained one after another]

• Lacking in declarativity
• Operators exchange data via HDFS
• Sort the only grouping operator
• Need many MapReduce rounds

6. An ML library called Mahout.

Iterative programs in Hadoop

[Figure: a Client drives three Map and Reduce jobs, one per iteration: Iteration 1, Iteration 2, Iteration 3]

Stratosphere – Parallel Analytics Beyond MapReduce

Incremental Iterations matter

■ Changes to the iteration's result for Connected Components in each superstep

[Figure: # Vertices (thousands), 0 to 1400, per Superstep, 0 to 34; series: Naïve (Bulk), Incremental]

Iterations in MapReduce too slow. Design a new runtime system and use the Hadoop scheduler to exploit sparse computational dependencies.
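The gap between bulk and incremental iterations can be reproduced on a toy graph. The sketch below runs connected components by minimum-label propagation in two ways: the bulk variant recomputes every vertex in every superstep, while the incremental variant keeps a workset of vertices whose label changed and lets only those push their label to neighbors. It is an illustration under stated assumptions, not Stratosphere's runtime: the class ConnectedComponentsSketch, the adjacency-list input, and the per-superstep "processed" counter are all invented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: bulk vs. incremental (workset) connected components.
final class ConnectedComponentsSketch {

    // Bulk: every superstep recomputes the label of every vertex.
    // 'processed' receives the number of vertices handled in each superstep.
    static int[] bulk(List<List<Integer>> adj, List<Integer> processed) {
        int n = adj.size();
        int[] label = new int[n];
        for (int v = 0; v < n; v++) {
            label[v] = v;
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            int[] next = label.clone();
            for (int v = 0; v < n; v++) {
                for (int u : adj.get(v)) {
                    next[v] = Math.min(next[v], label[u]);
                }
                if (next[v] != label[v]) {
                    changed = true;
                }
            }
            processed.add(n);
            label = next;
        }
        return label;
    }

    // Incremental: only vertices whose label changed in the previous superstep do any work.
    static int[] incremental(List<List<Integer>> adj, List<Integer> processed) {
        int n = adj.size();
        int[] label = new int[n];
        TreeSet<Integer> workset = new TreeSet<>();
        for (int v = 0; v < n; v++) {
            label[v] = v;
            workset.add(v);
        }
        while (!workset.isEmpty()) {
            processed.add(workset.size());
            TreeSet<Integer> nextWorkset = new TreeSet<>();
            int[] next = label.clone();
            for (int v : workset) {
                for (int u : adj.get(v)) {
                    if (label[v] < next[u]) {   // v offers its neighbor a smaller label
                        next[u] = label[v];
                        nextWorkset.add(u);
                    }
                }
            }
            label = next;
            workset = nextWorkset;
        }
        return label;
    }

    public static void main(String[] args) {
        // Undirected toy graph: a path 0-1-2-3 and a separate edge 4-5.
        List<List<Integer>> adj = List.of(
                List.of(1), List.of(0, 2), List.of(1, 3), List.of(2), List.of(5), List.of(4));
        List<Integer> bulkWork = new ArrayList<>();
        List<Integer> incrementalWork = new ArrayList<>();
        bulk(adj, bulkWork);
        incremental(adj, incrementalWork);
        System.out.println("bulk:        " + bulkWork);
        System.out.println("incremental: " + incrementalWork);
    }
}
```

On this graph both variants agree on the components, but the bulk version touches all six vertices in each of its four supersteps, whereas the workset shrinks from six to four to two to one.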


Observations

1. MapReduce programming model good for grouping & counting.
2. MapReduce programming model not good for much else.
3. Hadoop implementation of MapReduce trades performance for fault-tolerance (disk-based data shuffling).
4. MapReduce programming model not suited for SQL. Need to hack around it with multiple MapReduce rounds.
5. Hadoop’s implementation of MapReduce not suited for SQL.
6. MapReduce programming model and its Hadoop implementation not suited for iterations. Need to hack around it by implementing iterations in the client or embedding a new runtime in a Map function.

Stratosphere

Big Data

Stratosphere History
● Award-winning, DFG (Deutsche Forschungsgemeinschaft) funded research project by universities from the Berlin area
● More than [number illegible] papers published at world-class conferences
● The only Big Data Analytics platform developed in Europe
● Today: completely open source, community-driven development, focus on stability and usability

Stratosphere: a brief history

2009: DFG-funded research group from TUB, HUB, HPI starts research on “Information Management in the Cloud.”

2010-2012: Stratosphere released as open source (v0.1, v0.2) and becomes known in academic community. Companies and Universities in Europe become part of Stratosphere.

2013 and beyond: Transition from a research project to a stable and usable open source system, developer community, and real-world use cases.

Stratosphere status

Next stable release (v0.4) coming up around end of November. Snapshot available to download; maturity equivalent to Apache incubations.

Community picking up: external developers from Universities (KTH, SICS, Inria, and others), hackathons in Berlin, Paris, Budapest, companies are starting to use Stratosphere (Deutsche Telekom, Internet Memory, Mediaplus).

Desiderata for next-gen big data platforms: Usability

10 million Excel users
3 million R users
70,000 Hadoop users

“the market faces certain challenges such as unavailability of qualified and experienced work professionals, who can effectively handle the Hadoop architecture.”

Desiderata for next-gen big data platforms: Performance

[Bar chart comparing Hadoop and Stratosphere; axis ticks from 0 to 700]

Performance difference from days to minutes enables real time decision making and widespread use of data within the organization.

[Figure: plan diagrams, (a) Complex Plan Diagram and (b) Reduced Plan Diagram (Query 8, OptA), shown over an excerpt of the paper they come from; both axes are labeled "Data characteristics change"]

Each color is a differently written program that produces the same result but has very different performance depending on small changes in the data set and the analysis requirements.

Query optimizers: the enabling technology for SQL data warehousing and BI

Successful industrial application of artificial intelligence

Currently, only Stratosphere can optimize non-relational data analysis programs.

Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs

Fabian Hueske, Mathias Peters, Aljoscha Krettek, Matthias Ringwald, Kostas Tzoumas, Volker Markl, Johann-Christoph Freytag

StratoSphere: Above the Clouds

Stratosphere is a joint project by TU Berlin, HU Berlin, and HPI Potsdam and funded as DFG research unit FOR1306 with additional support from HP and IBM.

Stratosphere, http://www.stratosphere.eu, 08.04.2013

Motivation: Operator Reordering

• Data flow programming is a popular abstraction for complex analytics
• Diversity of data and tasks requires user-defined functions
• Operator order has significant impact on execution performance
• Reordering UDF operators requires knowledge of UDF properties

TPC-H Query 15 as PACT Program

CREATE VIEW revenue (supplier_no, total_revenue) AS
  SELECT l_suppkey, SUM(l_extendedprice * (1 - l_discount))
  FROM lineitem
  WHERE l_shipdate >= 'DATE' AND l_shipdate < DATE 'DATE' + INTERVAL '3' MONTH
  GROUP BY l_suppkey;

SELECT s_suppkey, s_name, s_address, s_phone, total_revenue
FROM supplier, revenue
WHERE s_suppkey = supplier_no;

[Figure: PACT data flow with sources lineitem and supplier, operators MAP filter, MAP project, REDUCE aggregate, and MATCH join, and sink output; each operator is annotated with a record layout of field indexes 0 to 8]

Record fields: 0 s_suppkey, 1 s_name, 2 s_address, 3 s_phone, 5 l_suppkey, 6 l_extendedprice, 7 l_discount, 8 l_shipdate

Context: Pact Programming Model

[Figure: a Pact Operator combines a 2nd-order function (MAP, REDUCE, CROSS, MATCH, COGROUP) with a UDF (1st-order function); Input Data is split into Independent Data Subsets by Key Part and Value Part and turned into Output Data]

UDF Code Analysis

Prerequisites:
• Static Code Analysis Framework provides Control-Flow, Def-Use, Use-Def lists
• Fixed API to access records

Extracted Information:
• Field sets track read and write accesses on records
• Upper and lower output cardinality bounds

Safety:
• All record access instructions are detected
• Supersets of actual Read/Write sets are returned
• Supersets allow fewer but always safe transformations

[Table residue: Read Set, Write Set, and Out-Card Bounds per UDF; the only surviving value is the bound [0,1]]

filter:

r0 = this
r1 = @parameter0 // Record
r2 = @parameter1 // Collector
$r5 = r1.getField(8)
$r6 = r0.date_lb
$i0 = $r5.compareTo($r6)
if $i0 < 0 goto 1
$r9 = r1.getField(8)
$r10 = r0.date_ub
$i1 = $r9.compareTo($r10)
if $i1 >= 0 goto 1
r2.collect(r1)
1: return

project:

r0 = this
r1 = @parameter0 // Record
r2 = @parameter1 // Collector
$d0 = r1.getField(6)
r0.extendedprice = $d0
$d1 = r1.getField(7)
r0.discount = $d1
$r7 = r0.revenue // PactRecord
$d2 = r0.extendedprice
$d3 = r0.discount
$d4 = 0 - $d3
$d5 = $d2 * $d4
$r7.setValue($d5)
r1.setNull(6)
r1.setNull(7)
r1.setNull(8)
$r8 = r0.revenue
r1.setField(4, $r8)
r2.collect(r1)

aggregate:

r0 = this
r1 = @parameter0 // Iterator
r2 = @parameter1 // Collector
r3 = r1.next()
d0 = r3.getField(4)
goto 2
1: r3 = r1.next()
$d1 = r3.getField(4)
d0 = d0 + $d1
2: $z0 = r1.hasNext()
if $z0 != 0 goto 1
r3.setField(4, d0)
r2.collect(r3)

Data Flow Transformations

Reorder Conditions:
1. No Write-Read / Write-Write conflicts on record fields
   • Similar to conflict detection in optimistic concurrency control
2. Preservation of groups for grouping operators
   • Groups must remain unchanged or be completely removed

Enumeration Algorithm:
• Descends data flow recursively top-down
• Checks reorder conditions and switches successive operators

Supported Transformations:
• Filter push-down
• Join reordering
• Invariant group transformations
• Non-relational operators are integrated

[Figure: alternative orderings of the Query 15 data flow (filter, project, aggregate, join)]

Physical Optimization

Execution Plan Selection:
• Chooses execution strategies for 2nd-order functions
• Chooses shipping strategies to distribute data
• Strategies known from parallel databases

Interesting Properties:
• Sorting, Grouping, Partitioning
• Property preservation reasoning with write sets

Cost-based Plan Selection:
• Exploits UDF annotations for size estimates
• Cost model combines network, disk I/O and CPU costs

[Figure: candidate execution plans over lineitem, supplier, and output with strategies REDUCE Sort, MAP Pipeline, MAP Pipeline, MATCH Hybrid-Hash, and COMBINE Part-Sort]

Parallel Execution

Execution Engine:
• Massively parallel execution of DAG-structured data flows
• Sequential processing tasks
• Synchronous communication (In-memory and network)

Runtime Operators:
• Implemented as sequential processing tasks
• Call UDFs

[Figure: the selected plan running on Node 1, Node 2, and Node 3, with shipping strategies Local Forward and Partition between operators]

Details in [HPS+12] and [HKT12]; [BEH+10] and [WK09].

[WK09] Warneke, Kao, "Nephele: Efficient Parallel Data Processing in the Cloud", MTAGS '09

[BEH+10] Battré, Ewen, Hueske, Kao, Markl, Warneke, "Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing", SOCC '10

[HPS+12] Hueske, Peters, Sax, Rheinländer, Bergmann, Krettek, Tzoumas, "Opening the Black Boxes in Data Flow Optimization", PVLDB 5(11) '12

[HKT12] Hueske, Krettek, Tzoumas, "Enabling Operator Reordering in Data Flow Programs Through Static Code Analysis", XLDI Workshop '12







Use a combination of compiler and database technology to lift optimization beyond relational algebra. Derive properties of user-defined functions via code analysis and use these to mimic a relational database optimizer.
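As a rough illustration of how read and write sets derived by code analysis feed a reordering decision, the sketch below checks the first reorder condition from the poster (no write-read or write-write conflicts on record fields) for two neighboring operators. It is not the Stratosphere optimizer: the class ReorderCheckSketch, the Udf holder, and the field sets typed in by hand are assumptions; the real system extracts these sets from UDF bytecode and also checks group preservation, which is omitted here.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch: may two successive UDF operators be swapped?
final class ReorderCheckSketch {

    // What the code analysis is assumed to report about one UDF.
    static final class Udf {
        final String name;
        final Set<Integer> readSet;   // record fields the UDF reads
        final Set<Integer> writeSet;  // record fields the UDF writes or nulls

        Udf(String name, Set<Integer> readSet, Set<Integer> writeSet) {
            this.name = name;
            this.readSet = readSet;
            this.writeSet = writeSet;
        }
    }

    // Condition 1 only: no write-read and no write-write conflict on any field.
    static boolean canSwap(Udf first, Udf second) {
        boolean writeRead = !Collections.disjoint(first.writeSet, second.readSet)
                || !Collections.disjoint(second.writeSet, first.readSet);
        boolean writeWrite = !Collections.disjoint(first.writeSet, second.writeSet);
        return !writeRead && !writeWrite;
    }

    public static void main(String[] args) {
        // Field sets read off the poster's filter and project UDFs:
        // filter reads field 8; project reads 6 and 7, writes 4, and nulls 6, 7, 8.
        Udf filter = new Udf("filter", Set.of(8), Set.of());
        Udf project = new Udf("project", Set.of(6, 7), Set.of(4, 6, 7, 8));
        // A made-up second filter on field 5, for contrast.
        Udf suppkeyFilter = new Udf("suppkeyFilter", Set.of(5), Set.of());

        System.out.println("filter <-> project:        " + canSwap(filter, project));
        System.out.println("project <-> suppkeyFilter: " + canSwap(project, suppkeyFilter));
    }
}
```

The filter and project operators cannot be swapped because project nulls field 8, which filter reads; the invented filter on field 5 touches nothing that project writes, so those two can be reordered. Because the analysis returns supersets of the true sets, a check like this can only err toward refusing a safe swap, never toward allowing an unsafe one.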

                     MapReduce                      Impala, ...          Stratosphere
Text                 ✔                              ✔                    ✔
Aggregation          ✔                              ✔                    ✔
ETL                  ✔                              ✔                    ✔
SQL                  Hive is too slow               ✔                    ✔
Advanced analytics   Mahout is slow and low level   Madlib is too slow   ✔
                     (mapreduce)                    (one-pass dataflow)  (many-pass dataflow)

A fast, massively parallel database-inspired backend.

Truly scales to disk-resident large data sets using database technology (e.g., hybrid hashing and external sort-merge for implementing key matching).

Built-in support for iterative programs via “iterate” operator: predictive and advanced analytics (machine learning, graph processing, stats) are all iterative.

Giraph is a Stratosphere program

Incremental Iterations: Doing Pregel

[Figure: data flow of one superstep. Inputs: Wi, Si, and N (Graph Topology). Operators: CoGroup (left outer), Match, and a union-with-dot operator. Outputs: Wi+1 and Di+1.]

Annotations:
• Working Set has messages sent by the vertices
• Delta set has state of changed vertices
• Aggregate messages and derive new state
• Create Messages from new state

To recap: Stratosphere is an open-source system that
• runs on top of Hadoop Yarn and HDFS, but replaces Hadoop MapReduce with a new runtime engine designed for iterative and DAG-shaped programs,
• offers a program optimizer that frees the programmer from low-level decisions,
• is scalable to large clusters and disk-resident data sets, and
• is programmable in Java and Scala (and more to come).

A next-generation Big Data platform is being developed in Berlin.

Help us shape the future of Stratosphere!

http://www.flickr.com/photos/andiearbeit/4354455624/lightbox/


