Dr. Kostas Tzoumas: Big Data Looks Tiny from Stratosphere at Big Data Beers (Nov. 20, 2013)
DESCRIPTION
Presentation by Dr. Kostas Tzoumas at the Big Data Beers Meetup [1] (Nov. 20, 2013) introducing the Stratosphere Platform for Big Data Analytics. Check out http://stratosphere.eu for more information.
[1] http://www.meetup.com/Big-Data-Beers/events/147397982/
TRANSCRIPT
Big Data looks tiny from Stratosphere
Kostas Tzoumas
Data is an important asset
video & audio streams, sensor data, RFID, GPS, user online behavior, scientific simulations, web archives, ...
Volume: Handle petabytes of data
Velocity: Handle high data arrival rates
Variety: Handle many heterogeneous data sources
Veracity: Handle inherent uncertainty of data
Data
Analysis
Four “I”s for Big Analysis
text mining, interactive and ad hoc analysis, machine learning, graph analysis, statistical algorithms
Iterative: Model the data, do not just describe it
Incremental: Maintain the model under high arrival rates
Interactive: Step-by-step data exploration on very large data
Integrative: Fluent unified interfaces for different data models
Hadoop
Hadoop’s selling point is its low effective storage cost.
Hadoop clusters are becoming a data vortex, attracting cross-departmental data and changing the data usage culture in companies.
Hadoop MapReduce was the wrong abstraction and implementation to begin with and will be superseded by better systems.
Advanced Analytics
Analytics that model the data to reveal hidden relationships, not just describe the data.
E.g., machine learning, predictive stats, graph analysis
Increasingly important from a market perspective.
Very different from SQL analytics: different languages and access patterns (iterative vs. one-pass programs).
Hadoop toolchain is poor; R, Matlab, etc. are not parallel.
MapReduce
NoMapReduce
SQL
BigSQL
BigAnalytics
scripting
SQL--
column store++
scalable parallel sort
a query plan
XQuery? wrong platform
Data Scientist: The Sexiest Job of the 21st Century
Meet the people who can coax treasure out of messy, unstructured data. by Thomas H. Davenport and D.J. Patil
ARTWORK Tamar Cohen, Andrew J Buboltz 2011, silk screen on a page from a high school yearbook, 8.5" x 12"
Spotlight
When Jonathan Goldman arrived for work in June 2006 at LinkedIn, the business networking site, the place still felt like a start-up. The company had just under 8 million accounts, and the number was growing quickly as existing members invited their friends and colleagues to join. But users weren’t seeking out connections with the people who were already on the site at the rate executives had expected. Something was apparently missing in the social experience. As one LinkedIn manager put it, “It was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink—and you probably leave early.”
70 Harvard Business Review October 2012
SPOTLIGHT ON BIG DATA
≠
FROM (
  FROM pv_users
  MAP pv_users.userid, pv_users.date
  USING 'map_script'
  AS dt, uid
  CLUSTER BY dt) map_output
INSERT OVERWRITE TABLE pv_users_reduced
  REDUCE map_output.dt, map_output.uid
  USING 'reduce_script'
  AS date, count;
A = load 'WordcountInput.txt';
B = MAPREDUCE wordcount.jar store A into 'inputDir' load 'outputDir' as (word:chararray, count: int) 'org.myorg.WordCount inputDir outputDir';
C = sort B by count;
Taken from http://www.oracle.com/technetwork/java/jvmls2013vitek-2013524.pdf
Hadoop is...
1. A programming model called MapReduce
2. An implementation of said programming model, called Hadoop MapReduce
3. A file system, called HDFS
4. A resource manager, called YARN
5. Interfaces to Hadoop MapReduce (Pig, Hive, Cascading, ...)
6. An ML library called Mahout.
7. Recently, a collection of runtime systems (Tez, Impala, Spark, Stratosphere, ...)
Current State of Big Data
● Big Data commonly defined by the … V's
● Widespread adoption of Apache Hadoop as the foundation
○ Many high-level programming models such as Pig, Hive, Cascading
● New systems on the horizon: Impala, Tez, Spark, Stratosphere
* Inspired by Jens Dittrich
1. A programming model called MapReduce
val input = TextFile(textInput)
val words = input.flatMap { line => line.split(" ") }
val counts = words.groupBy { word => word }.count()
val output = counts.write(wordsOutput, CsvOutputFormat())
“Romeo, Romeo, wherefore art thou Romeo?”
map( ) = [ (Romeo,1), (Romeo,1), (wherefore,1), (art,1), (thou,1), (Romeo,1) ]
reduce( (Romeo,(1,1,1)), (wherefore,1), (art,1), (thou,1) ) = [ (Romeo,3), (wherefore,1), (art,1), (thou,1) ]
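The map and reduce steps on this slide can be traced in a few lines of plain Python. This is only a single-process sketch of the programming model, not Hadoop or Stratosphere code; the function names map_words, shuffle, reduce_counts and word_count are made up for illustration, and the grouping step that a real framework performs between the two phases is simulated with a dictionary.

```python
import re
from collections import defaultdict


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in re.findall(r"[A-Za-z]+", line)]


def shuffle(pairs):
    """Group all emitted values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_counts(word, ones):
    """Reduce phase: sum the values collected for one key."""
    return (word, sum(ones))


def word_count(lines):
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(word, ones) for word, ones in shuffle(pairs).items())


counts = word_count(["Romeo, Romeo, wherefore art thou Romeo?"])
print(counts)  # {'Romeo': 3, 'wherefore': 1, 'art': 1, 'thou': 1}
```

The user writes only the two small functions; everything in between is the framework's job, which is why the model works well for grouping and counting.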
2. An implementation of said programming model, called Hadoop MapReduce
Map: “Romeo, Romeo, wherefore art thou Romeo?” → (Romeo, 1) (Romeo, 1) (wherefore, 1) (art, 1) (thou, 1) (Romeo, 1)
Map: “What, art thou hurt?” → (What, 1) (art, 1) (thou, 1) (hurt, 1)
Reduce: (Romeo, (1,1,1)) (art, (1,1)) (thou, (1,1)) → (Romeo, 3) (art, 2) (thou, 2)
Reduce: (wherefore, 1) (What, 1) (hurt, 1) → (wherefore, 1) (What, 1) (hurt, 1)
Data shuffled over network
Data written to disk
public class ReduceSideBookAndAuthorJoin extends HadoopJob {

  private static final Pattern SEPARATOR = Pattern.compile("\t");

  @Override
  public int run(String[] args) throws Exception {
    Map<String,String> parsedArgs = parseArgs(args);

    Path authors = new Path(parsedArgs.get("--authors"));
    Path books = new Path(parsedArgs.get("--books"));
    Path outputPath = new Path(parsedArgs.get("--output"));

    Job join = new Job(new Configuration(getConf()));
    Configuration jobConf = join.getConfiguration();

    MultipleInputs.addInputPath(join, authors, TextInputFormat.class, ConvertAuthorsMapper.class);
    MultipleInputs.addInputPath(join, books, TextInputFormat.class, ConvertBooksMapper.class);

    join.setMapOutputKeyClass(SecondarySortedAuthorID.class);
    join.setMapOutputValueClass(AuthorOrTitleAndYearOfPublication.class);
    jobConf.setBoolean("mapred.compress.map.output", true);

    join.setReducerClass(JoinReducer.class);
    join.setOutputKeyClass(Text.class);
    join.setOutputValueClass(NullWritable.class);
    join.setJarByClass(JoinReducer.class);
    join.setJobName("reduceSideBookAuthorJoin");

    join.setOutputFormatClass(TextOutputFormat.class);
    jobConf.set("mapred.output.dir", outputPath.toString());

    join.setGroupingComparatorClass(SecondarySortedAuthorID.GroupingComparator.class);
    join.waitForCompletion(true);

    return 0;
  }

  static class ConvertAuthorsMapper
      extends Mapper<Object,Text,SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication> {

    @Override
    protected void map(Object key, Text value, Context ctx) throws IOException, InterruptedException {
      String line = value.toString();
      if (line.length() > 0) {
        String[] tokens = SEPARATOR.split(line.toString());
        long authorID = Long.parseLong(tokens[0]);
        String author = tokens[1];
        ctx.write(new SecondarySortedAuthorID(authorID, true), new AuthorOrTitleAndYearOfPublication(author));
      }
    }
  }

  static class ConvertBooksMapper
      extends Mapper<Object,Text,SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication> {

    @Override
    protected void map(Object key, Text line, Context ctx) throws IOException, InterruptedException {
      String[] tokens = SEPARATOR.split(line.toString());
      long authorID = Long.parseLong(tokens[0]);
      short yearOfPublication = Short.parseShort(tokens[1]);
      String title = tokens[2];
      ctx.write(new SecondarySortedAuthorID(authorID, false), new AuthorOrTitleAndYearOfPublication(title,
          yearOfPublication));
    }
  }

  static class JoinReducer
      extends Reducer<SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication,Text,NullWritable> {

    @Override
    protected void reduce(SecondarySortedAuthorID key, Iterable<AuthorOrTitleAndYearOfPublication> values, Context ctx)
        throws IOException, InterruptedException {
      String author = null;
      for (AuthorOrTitleAndYearOfPublication value : values) {
        if (author == null && !value.containsAuthor()) {
          throw new IllegalStateException("No author found for book: " + value.getTitle());
        } else if (author == null && value.containsAuthor()) {
          author = value.getAuthor();
        } else {
          ctx.write(new Text(author + '\t' + value.getTitle() + '\t' + value.getYearOfPublication()),
              NullWritable.get());
        }
      }
    }
  }

  static class SecondarySortedAuthorID implements WritableComparable<SecondarySortedAuthorID> {

    private boolean containsAuthor;
    private long id;

    static {
      WritableComparator.define(SecondarySortedAuthorID.class, new SecondarySortComparator());
    }

    SecondarySortedAuthorID() {}

    SecondarySortedAuthorID(long id, boolean containsAuthor) {
      this.id = id;
      this.containsAuthor = containsAuthor;
    }

    @Override
    public int compareTo(SecondarySortedAuthorID other) {
      return ComparisonChain.start()
          .compare(id, other.id)
          .result();
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeBoolean(containsAuthor);
      out.writeLong(id);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      containsAuthor = in.readBoolean();
      id = in.readLong();
    }

    @Override
    public boolean equals(Object o) {
      if (o instanceof SecondarySortedAuthorID) {
        return id == ((SecondarySortedAuthorID) o).id;
      }
      return false;
    }

    @Override
    public int hashCode() {
      return Longs.hashCode(id);
    }

    static class SecondarySortComparator extends WritableComparator implements Serializable {

      protected SecondarySortComparator() {
        super(SecondarySortedAuthorID.class, true);
      }

      @Override
      public int compare(WritableComparable a, WritableComparable b) {
        SecondarySortedAuthorID keyA = (SecondarySortedAuthorID) a;
        SecondarySortedAuthorID keyB = (SecondarySortedAuthorID) b;

        return ComparisonChain.start()
            .compare(keyA.id, keyB.id)
            .compare(!keyA.containsAuthor, !keyB.containsAuthor)
            .result();
      }
    }

    static class GroupingComparator extends WritableComparator implements Serializable {

      protected GroupingComparator() {
        super(SecondarySortedAuthorID.class, true);
      }
    }
  }

  static class AuthorOrTitleAndYearOfPublication implements Writable {

    private boolean containsAuthor;
    private String author;
    private String title;
    private Short yearOfPublication;

    AuthorOrTitleAndYearOfPublication() {}

    AuthorOrTitleAndYearOfPublication(String author) {
      this.containsAuthor = true;
      this.author = Preconditions.checkNotNull(author);
    }

    AuthorOrTitleAndYearOfPublication(String title, short yearOfPublication) {
      this.containsAuthor = false;
      this.title = Preconditions.checkNotNull(title);
      this.yearOfPublication = yearOfPublication;
    }

    public boolean containsAuthor() {
      return containsAuthor;
    }

    public String getAuthor() {
      return author;
    }

    public String getTitle() {
      return title;
    }

    public Short getYearOfPublication() {
      return yearOfPublication;
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeBoolean(containsAuthor);
      if (containsAuthor) {
        out.writeUTF(author);
      } else {
        out.writeUTF(title);
        out.writeShort(yearOfPublication);
      }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      author = null;
      title = null;
      yearOfPublication = null;
      containsAuthor = in.readBoolean();
      if (containsAuthor) {
        author = in.readUTF();
      } else {
        title = in.readUTF();
        yearOfPublication = in.readShort();
      }
    }
  }
}
Hand-coded join in Hadoop MapReduce
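Stripped of the Hadoop plumbing, the listing above is a reduce-side join with a secondary sort: both inputs are tagged with the author ID, the records are sorted so that the author record arrives before its books, and the reducer groups by author ID alone. The following is a rough single-process Python sketch of that idea, not a translation of the Java classes. The function names and sample rows are invented for illustration; the tab-separated layout (author ID and name for authors; author ID, year and title for books) follows the mappers in the listing.

```python
from itertools import groupby


def map_author(line):
    author_id, name = line.split("\t")
    # Flag 0 sorts the author record ahead of the books with the same ID.
    return ((int(author_id), 0), name)


def map_book(line):
    author_id, year, title = line.split("\t")
    return ((int(author_id), 1), (title, int(year)))


def reduce_side_join(author_lines, book_lines):
    tagged = [map_author(l) for l in author_lines if l]
    tagged += [map_book(l) for l in book_lines if l]
    # "Shuffle": sort on the composite key (secondary sort) ...
    tagged.sort(key=lambda kv: kv[0])
    joined = []
    # ... but group on the author ID only, like the grouping comparator.
    for _, group in groupby(tagged, key=lambda kv: kv[0][0]):
        author = None
        for (_, flag), value in group:
            if flag == 0:
                author = value
            elif author is None:
                raise ValueError(f"No author found for book: {value[0]}")
            else:
                title, year = value
                joined.append(f"{author}\t{title}\t{year}")
    return joined


authors = ["1\tWilliam Shakespeare", "2\tJane Austen"]
books = ["2\t1813\tPride and Prejudice", "1\t1597\tRomeo and Juliet", "1\t1603\tHamlet"]
for row in reduce_side_join(authors, books):
    print(row)
```

The contrast with the Java version is the point of the slide: the join logic itself is a dozen lines, and most of the hand-written Hadoop code is serialization, comparators and job configuration.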
5. Interfaces to Hadoop MapReduce (Pig, Hive, Cascading, ...)
Map → Reduce → Map → Reduce → Map → Reduce
• Lacking in declarativity
• Operators exchange data via HDFS
• Sort the only grouping operator
• Need many MapReduce rounds
6. An ML library called Mahout.
Iterative programs in Hadoop
Client
Iteration 1: Map → Reduce
Iteration 2: Map → Reduce
Iteration 3: Map → Reduce
Stratosphere – Parallel Analytics Beyond MapReduce
Incremental Iterations matter
■ Changes to the iteration's result for Connected Components in each superstep
[Chart: # Vertices (thousands), 0 to 1400, against Superstep, 0 to 34; series: Naïve (Bulk), Incremental]
Iterations in MapReduce too slow. Design a new runtime system and use the Hadoop scheduler to exploit sparse computational dependencies.
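To make the difference between bulk and incremental iterations concrete, here is a small Python sketch of Connected Components by minimum-label propagation. It is an illustration under simple assumptions (an in-memory undirected graph, invented function names), not Stratosphere's runtime: the bulk variant recomputes every vertex in every superstep, while the incremental variant keeps a workset and only revisits vertices whose neighbours changed in the previous superstep.

```python
from collections import defaultdict


def build_adjacency(edges):
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def bulk_cc(edges):
    """Naive (bulk) iteration: every superstep touches every vertex."""
    adj = build_adjacency(edges)
    comp = {v: v for v in adj}
    work = []  # number of vertices processed in each superstep
    changed = True
    while changed:
        work.append(len(comp))
        new = {v: min([comp[v]] + [comp[n] for n in adj[v]]) for v in comp}
        changed = new != comp
        comp = new
    return comp, work


def incremental_cc(edges):
    """Incremental iteration: only neighbours of changed vertices are revisited."""
    adj = build_adjacency(edges)
    comp = {v: v for v in adj}
    workset = set(comp)
    work = []
    while workset:
        work.append(len(workset))
        updates = {}
        for v in workset:
            best = min([comp[v]] + [comp[n] for n in adj[v]])
            if best < comp[v]:
                updates[v] = best
        comp.update(updates)
        workset = {n for v in updates for n in adj[v]}
    return comp, work


edges = [(1, 2), (2, 3), (3, 4), (4, 5), (6, 7)]
bulk_result, bulk_work = bulk_cc(edges)
inc_result, inc_work = incremental_cc(edges)
print(bulk_work, inc_work)
```

Both variants return the same component labels, but the per-superstep work of the incremental variant shrinks as the result stabilizes, which is the effect the chart on this slide shows.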
Observations
1. MapReduce programming model good for grouping & counting.
2. MapReduce programming model not good for much else.
3. Hadoop implementation of MapReduce trades performance for fault-tolerance (disk-based data shuffling).
4. MapReduce programming model not suited for SQL. Need to hack around it with multiple MapReduce rounds.
5. Hadoop’s implementation of MapReduce not suited for SQL.
6. MapReduce programming model and its Hadoop implementation not suited for iterations. Need to hack around it by implementing iterations in the client or embedding a new runtime in a Map function.
Stratosphere
Big Data
Stratosphere History
● Award-winning, DFG (Deutsche Forschungsgemeinschaft) funded research project by universities from the Berlin area
● More than … papers published at world-class conferences
● The only Big Data Analytics platform developed in Europe
● Today: Completely open source, community driven development, focus on stability and usability
Stratosphere: a brief history
2009: DFG-funded research group from TUB, HUB, HPI starts research on “Information Management in the Cloud.”
2010-2012: Stratosphere released as open source (v0.1, v0.2) and becomes known in academic community. Companies and Universities in Europe become part of Stratosphere.
2013 and beyond: Transition from a research project to a stable and usable open source system, developer community, and real-world use cases.
Stratosphere status
Next stable release (v0.4) coming up around end of November. Snapshot available to download; maturity equivalent to Apache incubations.
Community picking up: external developers from Universities (KTH, SICS, Inria, and others), hackathons in Berlin, Paris, Budapest, companies are starting to use Stratosphere (Deutsche Telekom, Internet Memory, Mediaplus).
Desiderata for next-gen big data platforms: Usability
10 million Excel users
3 million R users
70,000 Hadoop users
“the market faces certain challenges such as unavailability of qualified and experienced work professionals, who can effectively handle the Hadoop architecture.”
Desiderata for next-gen big data platforms: Performance
[Bar chart: Hadoop vs. Stratosphere, axis from 0 to 700]
Performance difference from days to minutes enables real time decision making and widespread use of data within the organization.
(a) Complex Plan Diagram (b) Reduced Plan Diagram
Figure 2: Complex Plan and Reduced Plan Diagram (Query 8, OptA)
they are “doing too good a job”, not merited by the coarseness of the underlying cost space. Moreover, if it were possible to simplify the optimizer to produce only reduced plan diagrams, it is plausible that the considerable processing overheads typically associated with query optimization could be significantly lowered.
Complex Patterns: The plan diagrams exhibit a variety of intricate tessellated patterns, including speckles, stripes, blinds, mosaics and bands, among others. For example, witness the rapidly alternating choices between plans P12 (dark green) and P16 (light gray) in the bottom left quadrant of Figure 2(a). Further, the boundaries of the plan optimality regions can be highly irregular – a case in point is plan P8 (dark pink) in the top right quadrant of Figure 2(a). These complex patterns appear to indicate the presence of strongly non-linear and discretized cost models, again perhaps an over-kill in light of Figure 2(b).
Non-Monotonic Cost Behavior: We have found quite a few instances where, although the base relation selectivities and the result cardinalities are monotonically increasing, the cost diagram does not show a corresponding monotonic behavior.5 Sometimes, the non-monotonic behavior arises due to a change in plan, perhaps understandable given the restricted search space evaluated by the optimizer. But, more surprisingly, we have also encountered situations where a plan shows such behavior even internal to its optimality region.
5 Our query setup is such that in addition to the result cardinality monotonically increasing as we travel outwards along the selectivity axes, the result tuples are also supersets of the previous results.
Validity of PQO: A rich body of literature exists on parametric query optimization (PQO) [1, 2, 7, 8, 3, 4, 10, 11, 12]. The goal here is to a priori identify the optimal set of plans for the entire relational selectivity space at compile time, and subsequently to use at run time the actual selectivity parameter settings to identify the best plan – the expectation is that this would be much faster than optimizing the query from scratch. Much of this work is based on a set of assumptions, that we do not find to hold true, even approximately, in the plan diagrams produced by the commercial optimizers.
For example, one of the assumptions is that a plan is optimal within the entire region enclosed by its plan boundaries. But, in Figure 2(a), this is violated by the small (brown) rectangle of plan P14, close to coordinates (60,30), in the (light-pink) optimality region of plan P3, and there are several other such instances.
On the positive side, however, we show that some of the important PQO assumptions do hold approximately for reduced plan diagrams.
1.1 Organization
The above effects are described in more detail in the remainder of this paper, which is organized as follows: In Section 2, we present the Picasso tool and the testbed environment. Then, in Section 3, the skew in the plan space distribution, as well as techniques for reducing the plan set cardinalities, are discussed. The relationship to PQO is explored in Section 4. Interesting plan diagram motifs are presented in Section 5. An overview of related work is provided in Section 6. Finally, in Section 7, we summarize
[Plan diagram: both axes labeled "Data characteristics change"]
Each color is a differently written program that produces the same result but has very different performance depending on small changes in the data set and the analysis requirements
Query optimizers: the enabling technology for SQL data warehousing and BI
Successful industrial application of artificial intelligence
Currently, only Stratosphere can optimize non-relational data analysis programs.
Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs
StratoSphere
Above the Clouds
Fabian Hueske, Mathias Peters, Aljoscha Krettek, Matthias Ringwald, Kostas Tzoumas, Volker Markl, Johann-Christoph Freytag
Stratosphere is a joint project by TU Berlin, HU Berlin, and HPI Potsdam and funded as DFG research unit FOR1306 with additional support from HP and IBM.
Stratosphere
http://www.stratosphere.eu
08.04.2013
TPC-H Query 15 as PACT Program
CREATE VIEW revenue (supplier_no, total_revenue) AS
  SELECT l_suppkey, SUM(l_extendedprice * (1 - l_discount))
  FROM lineitem
  WHERE l_shipdate >= 'DATE' AND l_shipdate < DATE 'DATE' + INTERVAL '3' MONTH
  GROUP BY l_suppkey;
SELECT s_suppkey, s_name, s_address, s_phone, total_revenue
FROM supplier, revenue
WHERE s_suppkey = supplier_no;
Data flow: lineitem → MAP (filter) → MAP (project) → REDUCE (aggregate) → MATCH (join, with supplier) → output
Motivation: Operator Reordering
• Data flow programming is a popular abstraction for complex analytics
• Diversity of data and tasks requires user-defined functions
• Operator order has significant impact on execution performance
• Reordering UDF operators requires knowledge of UDF properties
UDF Code Analysis
aggregate, project
Read Set, Write Set, Out-Card Bounds
Prerequisites:
• Static Code Analysis Framework provides Control-Flow, Def-Use, Use-Def lists
• Fixed API to access records
Extracted Information:
• Field sets track read and write accesses on records
• Upper and lower output cardinality bounds
Safety:
• All record access instructions are detected
• Supersets of actual Read/Write sets are returned
• Supersets allow fewer but always safe transformations
Physical Optimization
Execution Plan Selection:
• Chooses execution strategies for 2nd-order functions
• Chooses shipping strategies to distribute data
• Strategies known from parallel databases
Interesting Properties:
• Sorting, Grouping, Partitioning
• Property preservation reasoning with write sets
Cost-based Plan Selection:
• Exploits UDF annotations for size estimates
• Cost model combines network, disk I/O and CPU costs
Data Flow Transformations
Reorder Conditions:
1. No Write-Read / Write-Write conflicts on record fields
• Similar to conflict detection in optimistic concurrency control
2. Preservation of groups for grouping operators
• Groups must remain unchanged or be completely removed
Enumeration Algorithm:
• Descends data flow recursively top-down
• Checks reorder conditions and switches successive operators
Supported Transformations:
• Filter push-down
• Join reordering
• Invariant group transformations
• Non-relational operators are integrated
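The first reorder condition can be sketched as a check on read and write sets. The Python below is a simplified illustration, not the optimizer's actual code: the UdfInfo class and can_reorder function are hypothetical names, and the second condition (group preservation) is not modeled. The field numbers for the date filter and the projection follow the poster's UDF code listing, where the filter reads field 8 and the projection reads fields 6 and 7, writes field 4, and nulls fields 6, 7 and 8; the key filter is an invented extra operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UdfInfo:
    """Read and write sets of one UDF, as field indexes (supersets are safe)."""
    name: str
    reads: frozenset
    writes: frozenset


def can_reorder(a, b):
    """Condition 1: no write-read and no write-write conflict on record fields."""
    conflict = (a.writes & b.reads) or (b.writes & a.reads) or (a.writes & b.writes)
    return not conflict


date_filter = UdfInfo("filter", frozenset({8}), frozenset())
project_udf = UdfInfo("project", frozenset({6, 7}), frozenset({4, 6, 7, 8}))
# Hypothetical extra operator that only looks at the supplier key in field 5.
key_filter = UdfInfo("key_filter", frozenset({5}), frozenset())

print(can_reorder(date_filter, project_udf))  # False: project overwrites field 8
print(can_reorder(key_filter, project_udf))   # True: disjoint field accesses
```

Because the analysis may return supersets of the true read and write sets, a check like this can only reject a reordering that would have been legal; it never accepts an unsafe one.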
[Figures: alternative data flows for TPC-H Query 15 obtained by reordering the operators MAP filter, MAP project, REDUCE aggregate and MATCH join over the inputs lineitem and supplier; numbered boxes 0 to 8 mark the record fields]
[Figures: physical execution plans with the strategies REDUCE Sort, MAP Pipeline, MATCH Hybrid-Hash and COMBINE Part-Sort, and the shipping strategies Local Forward and Partition]
Details in [HPS+12] and [HKT12]
Details in [HPS+12]
Details in [BEH+10]
Details in [BEH+10] and [WK09]
Parallel Execution
Execution Engine:
• Massively parallel execution of DAG-structured data flows
• Sequential processing tasks
• Synchronous communication (In-memory and network)
Runtime Operators:
• Implemented as sequential processing tasks
• Call UDFs
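[Figure: parallel execution across Node 1, Node 2, Node 3]
Context: Pact Programming Model
[Figure: Pact Operator. A 2nd-order function (MAP, REDUCE, CROSS, MATCH, COGROUP) splits the Input Data, records with a Key Part and a Value Part, into Independent Data Subsets and calls the UDF (1st-order function) on each subset to produce the Output Data]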
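The split between second-order functions and first-order UDFs can be imitated with ordinary higher-order functions. The sketch below is plain single-process Python, not the Stratosphere API: the pact_* names, the convention that every UDF returns a list of output records, and the sample data are all assumptions made for illustration. The usage part strings four operators together in the shape of the TPC-H Query 15 flow (filter, project, aggregate, join).

```python
from collections import defaultdict
from itertools import product


def pact_map(udf, records):
    """MAP: every record is its own independent subset."""
    return [out for rec in records for out in udf(rec)]


def pact_reduce(udf, records, key):
    """REDUCE: all records sharing a key form one subset."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)
    return [out for k, group in groups.items() for out in udf(k, group)]


def pact_cross(udf, left, right):
    """CROSS: every pair from the Cartesian product is one subset."""
    return [out for l, r in product(left, right) for out in udf(l, r)]


def pact_match(udf, left, right, left_key, right_key):
    """MATCH: every pair of records with equal keys is one subset (equi-join)."""
    index = defaultdict(list)
    for r in right:
        index[right_key(r)].append(r)
    return [out for l in left for r in index.get(left_key(l), []) for out in udf(l, r)]


def pact_cogroup(udf, left, right, left_key, right_key):
    """COGROUP: both sides' records for one key are handed to the UDF together."""
    lg, rg = defaultdict(list), defaultdict(list)
    for l in left:
        lg[left_key(l)].append(l)
    for r in right:
        rg[right_key(r)].append(r)
    keys = list(dict.fromkeys(list(lg) + list(rg)))
    return [out for k in keys for out in udf(k, lg.get(k, []), rg.get(k, []))]


lineitem = [
    {"suppkey": 1, "price": 100.0, "discount": 0.5, "shipdate": "1996-01-15"},
    {"suppkey": 1, "price": 200.0, "discount": 0.25, "shipdate": "1996-02-01"},
    {"suppkey": 2, "price": 80.0, "discount": 0.0, "shipdate": "1996-03-10"},
    {"suppkey": 2, "price": 999.0, "discount": 0.0, "shipdate": "1997-01-01"},
]
supplier = [{"suppkey": 1, "name": "Supplier A"}, {"suppkey": 2, "name": "Supplier B"}]

filtered = pact_map(
    lambda r: [r] if "1996-01-01" <= r["shipdate"] < "1996-04-01" else [], lineitem)
projected = pact_map(
    lambda r: [{"suppkey": r["suppkey"], "revenue": r["price"] * (1 - r["discount"])}],
    filtered)
aggregated = pact_reduce(
    lambda k, g: [{"suppkey": k, "total_revenue": sum(x["revenue"] for x in g)}],
    projected, key=lambda r: r["suppkey"])
joined = pact_match(
    lambda s, rev: [(s["suppkey"], s["name"], rev["total_revenue"])],
    supplier, aggregated, lambda s: s["suppkey"], lambda r: r["suppkey"])
print(joined)  # [(1, 'Supplier A', 200.0), (2, 'Supplier B', 80.0)]
```

Because each second-order function states how its input is split into independent subsets, a runtime is free to process those subsets in parallel, and an optimizer can reason about the operators without understanding the UDF bodies in full.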
Record fields: 0 = s_suppkey, 1 = s_name, 2 = s_address, 3 = s_phone (supplier); 5 = l_suppkey, 6 = l_extendedprice, 7 = l_discount, 8 = l_shipdate (lineitem)
[WK09] Warneke, Kao, "Nephele: Efficient Parallel Data Processing in the Cloud", MTAGS '09
[BEH+10] Battré, Ewen, Hueske, Kao, Markl, Warneke, "Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing", SOCC '10
[HPS+12] Hueske, Peters, Sax, Rheinländer, Bergmann, Krettek, Tzoumas, "Opening the Black Boxes in Data Flow Optimization", PVLDB 5(11) '12
[HKT12] Hueske, Krettek, Tzoumas, "Enabling Operator Reordering in Data Flow Programs Through Static Code Analysis", XLDI Workshop '12
filter:
r0 = this
r1 = @parameter0 // Record
r2 = @parameter1 // Collector
$r5 = r1.getField(8)
$r6 = r0.date_lb
$i0 = $r5.compareTo($r6)
if $i0 < 0 goto 1
$r9 = r1.getField(8)
$r10 = r0.date_ub
$i1 = $r9.compareTo($r10)
if $i1 >= 0 goto 1
r2.collect(r1)
1: return
project:
r0 = this
r1 = @parameter0 // Record
r2 = @parameter1 // Collector
$d0 = r1.getField(6)
r0.extendedprice = $d0
$d1 = r1.getField(7)
r0.discount = $d1
$r7 = r0.revenue // PactRecord
$d2 = r0.extendedprice
$d3 = r0.discount
$d4 = 0 - $d3
$d5 = $d2 * $d4
$r7.setValue($d5)
r1.setNull(6)
r1.setNull(7)
r1.setNull(8)
$r8 = r0.revenue
r1.setField(4, $r8)
r2.collect(r1)
aggregate:
r0 = this
r1 = @parameter0 // Iterator
r2 = @parameter1 // Collector
r3 = r1.next()
d0 = r3.getField(4)
goto 2
1: r3 = r1.next()
$d1 = r3.getField(4)
d0 = d0 + $d1
2: $z0 = r1.hasNext()
if $z0 != 0 goto 1
r3.setField(4, d0)
r2.collect(r3)
Out-Card Bounds: [0,1] [0,1] [0,1]
Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs
StratoSphereAbove the Clouds
Fabian Hueske, Mathias Peters, Aljoscha Krettek, Matthias Ringwald,Kostas Tzoumas, Volker Markl, Johann-Christoph Freytag
Stratosphere is a joint project by TU Berlin, HU Berlin, and HPI Potsdam and funded as DFG research unit FOR1306 with additional support from HP and IBM.
Stratospherehttp://www.stratosphere.eu08.04.2013
0 1 2 3 4 5 6 7 8
CREATE VIEW revenue (supplier_no, total_revenue) AS VIEW
SELECT l_suppkey, SUM(l_extendedprice * (1 - l_discount)) FROM lineitem FROM
WHERE
l_shipdate >= 'DATE' AND l_shipdate < DATE 'DATE' + INTERVAL '3' MONTH GROUP BY l_suppkey; SELECT s_suppkey, s_name, s_address, s_phone, total_revenue FROM supplier, revenue FROM
WHERE s_suppkey = supplier_no;
REDUCEaggregate
lineitem
supplier
output
MAPfilter
MAPproject
MATCHjoin
0 1 2 3 4 5 6 7 8
TPC-H Query 15 as PACT ProgramMotivation: Operator Reordering• Data !ow programming is a popular abstraction for complex analytics
• Diversity of data and tasks requires user-de"ned functions
• Operator order has signi"cant impact on execution performance
• Reordering UDF operators requires knowlegde of UDF properties
UDF Code Analysisaggregateproject
Read Set
Write Set Out-Card Bounds
Prerequisites:
• Static Code Analysis Framework provides
Control-Flow, Def-Use, Use-Def lists
• Fixed API to access records
Extracted Information:
• Field sets track read and write accesses on records
• Upper and lower output cardinality bounds
Safety:
• All record access instructions are detected
• Supersets of actual Read/Write sets are returned
• Supersets allow fewer but always safe transformations
Data Flow Transformations
Physical Optimization Parallel Execution
Execution Plan Selection:
• Chooses execution strategies for 2nd-order functions
• Chooses shipping strategies to distribute data
• Strategies known from parallel databases
Interesting Properties:
• Sorting, Grouping, Partitioning
• Property preservation reasoning with write sets
Cost-based Plan Selection:
• Exploits UDF annotations for size estimates
• Cost model combines network, disk I/O and CPU costs
Reorder Conditions:
1. No Write-Read / Write-Write con!icts on record "elds
• Similar to con!ict detection in optimistic concurrency control
2. Preservation of groups for grouping operators
• Groups must remain unchanged or be completely removed
Enumeration Algorithm:
• Descents data !ow recursively top-down
• Checks reorder conditions and switches successive operators
Supported Transformations:
• Filter push-down
• Join reordering
• Invariant group transformations
• Non-relational operators are integrated
REDUCESort
MAPPipeline
MAPPipeline
MATCHHybrid-Hash
COMBINEPart-Sort
lineitem
supplier
output
REDUCESort
MAPPipeline
MAPPipeline
MATCHHybrid-Hash
COMBINEPart-Sort
lineitem
supplier
output
REDUCESort
MAPPipeline
MAPPipeline
MATCHHybrid-Hash
COMBINEPart-Sort
lineitem
supplier
output
REDUCEaggregate
lineitem
supplier
output
MAPfilter
MAPproject
MATCHjoin
0 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 80 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 80 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 80 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 80 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 80 1 2 3 4 5 6 7 8
REDUCEaggregate
lineitem
supplier
output
MAPfilter
MAPproject
MATCHjoin
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 80 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 80 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
filter
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8Details in [HPS+12] and [HKT12]
Details in [HPS+12]
REDUCESort
MAPPipeline
MAPPipeline
MATCHHybrid-Hash
COMBINEPart-Sort
lineitem
supplier
output
Local Forward
Local Forward
Local Forward
Partition
Local Forward
Partition
Local Forward
Details in [BEH+10] Details in [BEH+10] and [WK09]
Execution Engine:
• Massively parallel execution of
DAG-structured data !ows
• Sequential processing tasks
• Synchronous communication
(In-memory and network)
Runtime Operators:
• Implemented as sequential
processing tasks
• Call UDFs
Node 1 Node 2 Node 3
0 1 2 3 44 5 6 7 8
2nd-order function
...
Key Part Value Part
MAP REDUCE CROSS MATCH COGROUP
UDF1st-order function
...
Input Data Output DataIndependentData Subsets
Context: Pact Programming Model
Pact Operator
5678
l_suppkey
l_shipdatel_discountl_extendedprice
0123
s_suppkeys_names_addresss_phone
05
5
REDUCEaggregate
lineitem
supplier
output
MAPfilter
MAPproject
MATCHjoin
0 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 80 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 80 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
0 1 2 3 44 5 6 7 8
0 1 2 3 44 5 6 7 8
05
5
5
5 0
5
5 0
[WK09] Warneke, Kao, "Nephele: E#cient Parallel Data Processing in the Cloud", MTAGS '09
[BEH+10] Battré, Ewen, Hueske, Kao, Markl, Warneke, "Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing", SOCC '10
[HPS+12] Hueske, Peters, Sax, Rheinländer, Bergmann, Krettek, Tzoumas, "Opening the Black Boxes in Data Flow Optimization", PVLDB 5(11) '12
[HKT12] Hueske, Krettek, Tzoumas, "Enabling Operator Reordering in Data Flow Programs Through Static Code Analysis", XLDI Workshop '12
Filter UDF (MAP), three-address code:
r0 = this
r1 = @parameter0 // Record
r2 = @parameter1 // Collector
$r5 = r1.getField(8)
$r6 = r0.date_lb
$i0 = $r5.compareTo($r6)
if $i0 < 0 goto 1
$r9 = r1.getField(8)
$r10 = r0.date_ub
$i1 = $r9.compareTo($r10)
if $i1 >= 0 goto 1
r2.collect(r1)
1: return

Project UDF (MAP), three-address code:
r0 = this
r1 = @parameter0 // Record
r2 = @parameter1 // Collector
$d0 = r1.getField(6)
r0.extendedprice = $d0
$d1 = r1.getField(7)
r0.discount = $d1
$r7 = r0.revenue // PactRecord
$d2 = r0.extendedprice
$d3 = r0.discount
$d4 = 0 - $d3
$d5 = $d2 * $d4
$r7.setValue($d5)
r1.setNull(6)
r1.setNull(7)
r1.setNull(8)
$r8 = r0.revenue
r1.setField(4, $r8)
r2.collect(r1)

Aggregate UDF (REDUCE), three-address code:
r0 = this
r1 = @parameter0 // Iterator
r2 = @parameter1 // Collector
r3 = r1.next()
d0 = r3.getField(4)
goto 2
1: r3 = r1.next()
$d1 = r3.getField(4)
d0 = d0 + $d1
2: $z0 = r1.hasNext()
if $z0 != 0 goto 1
r3.setField(4, d0)
r2.collect(r3)
[Figure annotation: Out-Card Bounds [0,1]]
Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs
Stratosphere: Above the Clouds
Fabian Hueske, Mathias Peters, Aljoscha Krettek, Matthias Ringwald, Kostas Tzoumas, Volker Markl, Johann-Christoph Freytag
Stratosphere is a joint project by TU Berlin, HU Berlin, and HPI Potsdam and funded as DFG research unit FOR1306 with additional support from HP and IBM.
Stratosphere, http://www.stratosphere.eu, 08.04.2013
CREATE VIEW revenue (supplier_no, total_revenue) AS
SELECT l_suppkey, SUM(l_extendedprice * (1 - l_discount))
FROM lineitem
WHERE l_shipdate >= DATE 'DATE'
AND l_shipdate < DATE 'DATE' + INTERVAL '3' MONTH
GROUP BY l_suppkey;

SELECT s_suppkey, s_name, s_address, s_phone, total_revenue
FROM supplier, revenue
WHERE s_suppkey = supplier_no;
TPC-H Query 15 as PACT Program
[Figure: lineitem -> MAP (filter) -> MAP (project) -> REDUCE (aggregate) -> MATCH (join) with supplier -> output.]
Motivation: Operator Reordering
• Data flow programming is a popular abstraction for complex analytics
• Diversity of data and tasks requires user-defined functions
• Operator order has significant impact on execution performance
• Reordering UDF operators requires knowledge of UDF properties
UDF Code Analysis
[Figure: the example UDFs (e.g., project, aggregate) annotated with their Read Set, Write Set, and Out-Card Bounds.]
Prerequisites:
• Static Code Analysis Framework provides Control-Flow, Def-Use, Use-Def lists
• Fixed API to access records
Extracted Information:
• Field sets track read and write accesses on records
• Upper and lower output cardinality bounds
Safety:
• All record access instructions are detected
• Supersets of actual Read/Write sets are returned
• Supersets allow fewer but always safe transformations
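The extraction step above can be illustrated with a minimal sketch. It assumes the UDF is already available as a list of three-address statements and that records are only accessed through getField, setField, and setNull, as in the poster's listings; the regular expressions and the access_sets helper are hypothetical and are not Stratosphere's analysis code. Because every statement is scanned regardless of control flow, the result is a superset of the fields actually touched on any one execution path, which matches the safety argument above.

```python
import re

# Assumed fixed record API: getField(i) reads field i,
# setField(i, v) and setNull(i) write field i.
READ_CALL = re.compile(r"\.getField\((\d+)\)")
WRITE_CALL = re.compile(r"\.(?:setField|setNull)\((\d+)")


def access_sets(statements):
    """Return (read_set, write_set) of record field indexes used by a UDF.

    All statements are scanned, ignoring branches, so the sets are
    conservative supersets of the accesses on any single path.
    """
    reads, writes = set(), set()
    for stmt in statements:
        reads.update(int(i) for i in READ_CALL.findall(stmt))
        writes.update(int(i) for i in WRITE_CALL.findall(stmt))
    return reads, writes


# Record accesses modelled on the project UDF of the running example.
PROJECT_UDF = [
    "$d0 = r1.getField(6)",
    "$d1 = r1.getField(7)",
    "r1.setNull(6)",
    "r1.setNull(7)",
    "r1.setNull(8)",
    "r1.setField(4, $r8)",
    "r2.collect(r1)",
]

project_reads, project_writes = access_sets(PROJECT_UDF)
```

A real analysis would work on the control-flow graph and def-use chains rather than on text, for example to follow a field index stored in a variable.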
[Poster panels: Data Flow Transformations, Physical Optimization, Parallel Execution]
Execution Plan Selection:
• Chooses execution strategies for 2nd-order functions
• Chooses shipping strategies to distribute data
• Strategies known from parallel databases
Interesting Properties:
• Sorting, Grouping, Partitioning
• Property preservation reasoning with write sets
Cost-based Plan Selection:
• Exploits UDF annotations for size estimates
• Cost model combines network, disk I/O and CPU costs
Reorder Conditions:
1. No Write-Read / Write-Write conflicts on record fields
• Similar to conflict detection in optimistic concurrency control
2. Preservation of groups for grouping operators
• Groups must remain unchanged or be completely removed
Enumeration Algorithm:
• Descends the data flow recursively top-down
• Checks reorder conditions and switches successive operators
Supported Transformations:
• Filter push-down
• Join reordering
• Invariant group transformations
• Non-relational operators are integrated
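The first reorder condition can be sketched as a set-intersection test on the read and write sets of two successive operators. This is a simplified illustration, not the optimizer's implementation: the UdfProperties type and the function names are hypothetical, the "tag" operator is invented for the example, and the second condition (group preservation for grouping operators) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UdfProperties:
    """Field sets of one operator's UDF (hypothetical container)."""
    name: str
    reads: frozenset
    writes: frozenset


def has_conflict(a, b):
    """True if a and b have a write-read or write-write conflict."""
    write_read = (a.writes & b.reads) or (b.writes & a.reads)
    write_write = a.writes & b.writes
    return bool(write_read or write_write)


def can_reorder(a, b):
    """Condition 1 only: two successive operators may be switched."""
    return not has_conflict(a, b)


# Field sets of the running example: the filter reads field 8, the
# projection reads fields 6 and 7 and writes fields 4, 6, 7 and 8.
filter_udf = UdfProperties("filter", frozenset({8}), frozenset())
project_udf = UdfProperties("project", frozenset({6, 7}),
                            frozenset({4, 6, 7, 8}))
# Invented operator that reads field 0 and writes a new field 9.
tag_udf = UdfProperties("tag", frozenset({0}), frozenset({9}))

filter_vs_project = can_reorder(filter_udf, project_udf)
filter_vs_tag = can_reorder(filter_udf, tag_udf)
```

Here the filter cannot be moved past the projection, because the projection nulls field 8, which the filter reads. It can be moved past the invented tag operator, whose field sets are disjoint from the filter's.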
[Figure: reordered variants of the example data flow (lineitem, supplier, MAP filter, MAP project, REDUCE aggregate, MATCH join, output), each annotated with the record fields 0-8 it carries.]
Details in [HPS+12] and [HKT12]
Details in [HPS+12]
[Figure: physical execution plans for the example data flow. Local strategies: REDUCE Sort, MAP Pipeline, MATCH Hybrid-Hash, COMBINE Part-Sort. Shipping strategies: Local Forward, Partition.]
Details in [BEH+10]
Details in [BEH+10] and [WK09]
Execution Engine:
• Massively parallel execution of DAG-structured data flows
• Sequential processing tasks
• Synchronous communication (in-memory and network)
Runtime Operators:
• Implemented as sequential processing tasks
• Call UDFs
[Figure: parallel execution of the plan across Node 1, Node 2, and Node 3.]
Use a combination of compiler and database technology to lift optimization beyond relational algebra. Derive properties of user-defined functions via code analysis and use these to mimic a relational database optimizer.
                     MapReduce                      Impala, ...          Stratosphere
Text                 ✔                              ✔                    ✔
Aggregation          ✔                              ✔                    ✔
ETL                  ✔                              ✔                    ✔
SQL                  Hive is too slow               ✔                    ✔
Advanced analytics   Mahout is slow and low level   Madlib is too slow   ✔
[Figure: mapreduce, one-pass dataflow, many-pass dataflow]
A fast, massively parallel, database-inspired backend.
Truly scales to disk-resident large data sets using database technology (e.g., hybrid hashing and external sort-merge for implementing key matching).
Built-in support for iterative programs via “iterate” operator: predictive and advanced analytics (machine learning, graph processing, stats) are all iterative.
Giraph is a Stratosphere program
Stratosphere – Parallel Analytics Beyond MapReduce
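Incremental Iterations: Doing Pregel
[Figure: one superstep as a data flow. A CoGroup (left outer) of the working set W_i with the solution set S_i aggregates messages and derives new state, producing the delta set D_i+1. A Match of the new state with the graph topology N creates messages from the new state, producing the next working set W_i+1. The delta set is merged into the solution set.]
• Working set has messages sent by the vertices
• Delta set has state of changed vertices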
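As a rough single-machine illustration of the working set and delta set, the sketch below computes connected components in the Pregel style. It is an assumption-laden toy, not Stratosphere's iterate operator: the function name, the choice of connected components as the example, and the in-memory dictionaries standing in for the solution set and graph topology are all hypothetical.

```python
def incremental_components(edges):
    """Label each vertex with the smallest vertex id in its component."""
    # Graph topology N: undirected adjacency sets.
    topology = {}
    for u, v in edges:
        topology.setdefault(u, set()).add(v)
        topology.setdefault(v, set()).add(u)

    # Initial solution set: every vertex is its own component.
    solution = {v: v for v in topology}
    # Initial working set: each vertex sends its id to its neighbours.
    working = [(n, cid) for v, cid in solution.items() for n in topology[v]]

    while working:
        # CoGroup-like step: aggregate the messages per vertex ...
        best = {}
        for vertex, cid in working:
            best[vertex] = min(cid, best.get(vertex, cid))
        # ... and derive new state; only changed vertices form the delta set.
        delta = {v: cid for v, cid in best.items() if cid < solution[v]}
        # Merge the delta set into the solution set.
        solution.update(delta)
        # Match-like step: changed vertices create messages via the topology.
        working = [(n, cid) for v, cid in delta.items() for n in topology[v]]
    return solution


components = incremental_components([(1, 2), (2, 3), (4, 5)])
```

Each superstep touches only the vertices that received messages, and the loop ends as soon as no vertex changes state and the working set is empty.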
To recap: Stratosphere is an open-source system that runs on top of Hadoop YARN and HDFS but replaces Hadoop MapReduce with a new runtime engine designed for iterative and DAG-shaped programs. It offers a program optimizer that frees the programmer from low-level decisions, scales to large clusters and disk-resident data sets, and is programmable in Java and Scala (with more to come).
A next-generation Big Data platform is being developed in Berlin.
Help us shape the future of Stratosphere!
http://www.flickr.com/photos/andiearbeit/4354455624/lightbox/