putting the sting in hive page 1 alan f. gates @alanfgates
TRANSCRIPT
© 2013 Hortonworks
Stinger Overview
Page 2
•An initiative, not a project or product• Includes changes to Hive and a new project Tez•Two main goals
–Improve Hive performance 100x over Hive 0.10–Extend Hive SQL to include features needed for analytics
•Hive will support:–BI tools connecting to Hadoop–Analysts performing ad-hoc, interactive queries–Still excellent at the large batch jobs it is used for today
© 2013 Hortonworks
Stinger Mileposts
Page 3
Stinger Phase 3• Buffer Cache• Cost Based
Optimizer
Stinger Phase 2• YARN Resource Mgmnt• Hive on Apache Tez• Query Service• Vectorized Operators
Stinger Phase 1• Base Optimizations• SQL Analytics• ORCFile Format
1 2 Improve existing tools & preserve investments
Enable Hive to support interactive workloads
Released in Hive 0.11
Current Work
Roadmap
© 2013 Hortonworks
Hive Performance Gains in 0.11
Page 4
• Enable star joins by improving Hive’s map join (aka broadcast join)–Where possible do in single map only task–When not possible push larger tables to separate tasks
• Collapse adjacent jobs where possible–Hive has lots of M->MR type plans, collapse these to MR–Collapse adjacent jobs on sufficiently similar keys when
feasible– join followed by group– join followed by order– group followed by order
• Improvements in sort merge bucket (SMB) joins
© Hortonworks Inc. 2013
Improvements in Map Joins
• TPC-DS Query 27, Scale=200, 10 EC2 nodes (40 disks)
Query 270
200
400
600
800
1000
1200
1400
16001501.895
1176.479
631.026999999999
39.985
Text
Linear (Text)
RCFile
Partitioned RCFile
Partitioned RCFile + Optimizations
Page 5
© Hortonworks Inc. 2013
Improvements in SMB Joins
• TPC-DS Query 82, Scale=200, 10 EC2 nodes (40 disks)
Query 820
500
1000
1500
2000
2500
3000
35003257.692
2862.669
255.641
71.114
Text
RCFile
Partitioned RCFile
Partitioned RCFile + Optimizations
Page 8
© 2013 Hortonworks
New Technologies in Hive
Page 9
• All covered in depth in other talks– See Owen’s, Eric’s, and Jitendra’s talk ORC File & Vectorization at 4:25 today
• Tez – A new execution engine for relational tools such as Hive– No need to use MapReduce, instead provides general DAG execution– Data moved between tasks via socket, disk, or HDFS based on performance / re-
startability trade off– Provides standing service to greatly reduce query start time
• ORCFile – A rewrite of RCFile– Columnar– Tightly integrated with Hive’s type model, including support for nested types– Much better compression – Supports projection and filter push down
• Vectorization – Rewriting operators to take advantage of modern processors– Based on work done in MonetDB– Rewrite operators to radically reduce number of function calls, branch prediction
misses, and cache misses
© Hortonworks Inc. 2013
Standard Queries
Page 10
Query 27_x000d_Scale 200
Query 82_x000d_Scale 200
Query 27_x000d_Scale
1000
Query 82_x000d_Scale
1000
0
50
100
150
200
250
300
260
165
38
77
142
296
38 42
6780
Query 27 Star JoinQuery 82 Fact Table Join
Hive 0.10, RC File
Hive 0.11 CP, RC File
Hive 0.11 CP, ORC File
© Hortonworks Inc. 2013
Performance Trajectory
Page 11
Hive 10_x000d_T
ext
Hive 10_x000d_R
C
Hive 11_x000d_R
C
Hive 11_x000d_O
RC
Hive 11 CP_x000d_O
RC, Tez_x000d_Vectoriza-
tion
0X
5X
10X
15X
20X
25X
1X2X
12X11X
21X
Query 27 Speedup
Hive 10_x000d_
Text
Hive 10_x000d_
RC
Hive 11_x000d_
RC
Hive 11_x000d_
ORC
Hive 11 CP_x000d_ORC, Tez
0X
10X
20X
30X
40X
50X
60X
70X
80X
90X
1X
14X
44X
57X
78X
Query 82 Speedup
© Hortonworks Inc. 2013
Query 12 – Demonstrating MRR
Page 12
RC File _x000d_Scale 200
ORC File _x000d_Scale 200
RC File _x000d_Scale 1000
ORC File _x000d_Scale 1000
0
10
20
30
40
50
60
70
80
55 54
75
65
35 34
55
46
Query 12 - MRR Optimization
Traditional _x000d_Map-ReduceTez Map_x000d_Reduce Reduce
Elap
sed
Tim
e (s
econ
ds)
© 2013 Hortonworks
Hive Performance Up Next
Page 13
• Push down start up time - even for queries that spend less than a second running on the cluster, there is ~15 seconds of start up time– Tez service will remove Hadoop startup issues– Need to reduce time for the metadata access– Need intelligent file caching so that hot tables can be kept in memory
• Keep working on the optimizer– Y Smart work from Ohio State University– Start using statistics to make intelligent decisions about how many mappers and
reducers to spawn – maybe in Hive, maybe in Tez– Start using statistics to choose between competing plan options
• Buffer Cache– Coordinate with HDFS team to determine caching strategy
© 2013 Hortonworks
Extending Hive SQL in 0.11
Page 14
• DECIMAL data type – for fixed precision calculation (e.g. currency)• OVER clause
– PARTITION BY, ORDER BY, ROWS BETWEEN/FOLLOWING/PRECEDING
– Works with existing aggregate functions– New analytic and window functions added
– ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTILE, CUME_DIST, PERCENT_RANK
SELECT salesperson, AVG(salesprice) OVER (PARTITION BY region ORDER BY date
ROWS BETWEEN 10 PRECEEDING AND 10 FOLLOWING)FROM sales;
© 2013 Hortonworks
Extending Hive SQL Post 0.11
Page 15
• Subqueries in WHERE– Non-correlated first– [NOT] IN first, then extend to (in)equalities and EXISTS
• Datatype conformance – Hive has Java type model, add support for SQL types:– DATE– CHAR() and VARCHAR()– add precision and scale to decimal and float– aliases for standard SQL types (BLOB = binary, CLOB = string, integer =
int, real/number = decimal)
• Security– Add security checks to views, indices, functions, etc.– Secure GRANT and REVOKE