Hive on Steroids: Project Stinger

© Hortonworks Inc. 2013


Who Am I?

• Olivier Renault

• Hortonworks Solution engineer for EMEA
– Joined Hortonworks EMEA in Jan 2013

• Eucalyptus – Open source Cloud solution

• Red Hat – Solution engineer


What’s Hive?

• Uses HiveQL

• Hive translates SQL queries into MapReduce jobs

• De facto SQL interface in Hadoop

• Entry point for most BI tools
– ODBC

• HCatalog merged with Hive
– Metadata server

• Hive is able to query petabytes (PB) of data


Hive: Strength Through Community

Open Source
• Loyal open source community and real corporate interest/contributions
– Facebook, Teradata, SAP, Intel, Microsoft, Huawei, Yahoo, …

Vendors
• Dozens of vendors integrate with Hive
– Teradata, Microsoft, Microstrategy, Tableau, Karmasphere, Datameer, Information Builders, SAP, Oracle, Actuate, QlikView, SAS, arcplan, Pentaho, Jaspersoft, Tibco, Talend, Informatica, …

End Users
• Countless enterprises use Hive as the de facto SQL interface to Hadoop data


Problem: Hive was slow …

• Hive could interact with visualization tools, but you needed to be patient …

• In February 2013, Hortonworks launched the Stinger initiative. The aim is to improve Hive performance by 100x

• Bringing Hive into the interactive query world


Stinger Initiative

• Community initiative around Hive
• Enables Hive to support interactive workloads
• Enhances Hive’s standard SQL interface for Hadoop
• Improves existing tools & preserves investments

Query Planner (Hive) + Execution Engine (Tez) + File Format (ORC file) = 100X


Stinger Project (announced February 2013)

Batch AND Interactive SQL-IN-Hadoop

Stinger Initiative: A broad, community-based effort to drive the next generation of Hive

Goals:
• Speed: Improve Hive query performance by 100X to allow for interactive query times (seconds)
• Scale: The only SQL interface to Hadoop designed for queries that scale from TB to PB
• SQL: Support broadest range of SQL semantics for analytic applications running against Hadoop
…all IN Hadoop

Phase One (Delivered: Hive 0.11, HDP 1.3)
• Base Optimizations
• SQL Analytic Functions
• ORCFile, Modern File Format

Phase Two (Delivered: Hive 0.12, HDP 2.0)
• VARCHAR, DATE Types
• ORCFile predicate pushdown
• Advanced Optimizations
• Performance Boosts via YARN

Phase Three (Coming Soon)
• Hive on Apache Tez
• Query Service
• Buffer Cache
• Cost Based Optimizer (Optiq)
• Vectorized Processing


Hive: Base Optimization
New DAGs, analytics tools, ...


Hive Advanced Analytics

• Add OVER clause to support windowing queries
– With standard arguments
– Ranking functions: rank, ntile, row_number, dense_rank
– With analytics functions: cume_dist, first_value, lag, last_value, lead, percentile_cont, percentile_disc, percent_rank

• Add CUBE and ROLLUP
– Easily create summaries of your data

• Extend aggregation functions
– STDDEV, VAR
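To make the ranking functions listed above concrete, here is a minimal Python sketch of what row_number, rank and dense_rank compute within each partition of a windowing query. It is an illustration only, not Hive's implementation; the function name `window_ranks` and the sample sales rows are assumptions made up for this example.

```python
from itertools import groupby
from operator import itemgetter


def window_ranks(rows, partition_key, order_key):
    """Annotate each row with row_number, rank and dense_rank within its partition."""
    out = []
    # Sort by partition first, then by the ordering column inside each partition.
    ordered = sorted(rows, key=itemgetter(partition_key, order_key))
    for _, part in groupby(ordered, key=itemgetter(partition_key)):
        rank = dense_rank = 0
        previous = object()  # sentinel that never equals a real value
        for row_number, row in enumerate(part, start=1):
            if row[order_key] != previous:
                rank = row_number   # rank jumps ahead after a group of ties
                dense_rank += 1     # dense_rank never leaves gaps
                previous = row[order_key]
            out.append({**row, "row_number": row_number,
                        "rank": rank, "dense_rank": dense_rank})
    return out


# Hypothetical sample data: sales amount per store, partitioned by state.
sales = [
    {"state": "CA", "store": "s1", "amount": 10},
    {"state": "CA", "store": "s2", "amount": 10},
    {"state": "CA", "store": "s3", "amount": 30},
    {"state": "NY", "store": "s4", "amount": 5},
]
ranked = window_ranks(sales, "state", "amount")
for row in ranked:
    print(row["state"], row["store"], row["row_number"], row["rank"], row["dense_rank"])
```

The two tied CA stores share rank 1; the next store gets rank 3 but dense_rank 2, which is the difference between the two functions.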


Hive Data Type Conformance

• Extend Hive to support additional types from SQL
– Improves applications and interoperability between tools

• Specific additions
– Add fixed point NUMERIC and DECIMAL type (in progress)
– Add VARCHAR and CHAR types with limited field size
– Add DATETIME
– Add size ranges from 1 to 53 for FLOAT
– Add synonyms for compatibility: BLOB for BINARY, TEXT for STRING, REAL for FLOAT


SQL: Enhancing SQL Semantics

Hive SQL Datatypes
• INT
• TINYINT/SMALLINT/BIGINT
• BOOLEAN
• FLOAT
• DOUBLE
• STRING
• TIMESTAMP
• BINARY
• DECIMAL
• ARRAY, MAP, STRUCT, UNION
• DATE
• VARCHAR
• CHAR

Hive SQL Semantics
• SELECT, INSERT
• GROUP BY, ORDER BY, SORT BY
• JOIN on explicit join key
• Inner, outer, cross and semi joins
• Sub-queries in FROM clause
• ROLLUP and CUBE
• UNION
• Windowing Functions (OVER, RANK, etc.)
• Custom Java UDFs
• Standard Aggregation (SUM, AVG, etc.)
• Advanced UDFs (ngram, Xpath, URL)
• Sub-queries in WHERE, HAVING
• Expanded JOIN Syntax
• SQL Compliant Security (GRANT, etc.)
• INSERT/UPDATE/DELETE (ACID)

Legend: Hive 0.12, Available, Roadmap

SQL Compliance: Hive 12 provides a wide array of SQL datatypes and semantics so your existing tools integrate more seamlessly with Hadoop


Example Benchmark Spec

• The TPC-DS benchmark data + query set

• Query 27: big table (store_sales) joins lots of small tables
– A.K.A. Star Schema Join

• What does Query 27 do?
– For all items sold in stores located in specified states during a given year, find the average quantity, average list price, average list sales price, average coupon amount for a given gender, marital status, education and customer demographic.


SELECT col5, avg(col6)
FROM store_sales_fact ssf
  JOIN item_dim ON (ssf.col1 = item_dim.col1)
  JOIN date_dim ON (ssf.col2 = date_dim.col2)
  JOIN custdmgrphcs_dim ON (ssf.col3 = custdmgrphcs_dim.col3)
  JOIN store_dim ON (ssf.col4 = store_dim.col4)
GROUP BY col5
ORDER BY col5
LIMIT 100;

Query 27 - Star Schema Join

• Derived from TPC-DS Query 27

Table sizes shown in the figure: 41 GB, 58 MB, 11 MB, 80 MB, 106 KB


New Query Planner


Query 27 Execution Before Hive 11 - Text Format

Query spawned 5 MR jobs

The intermediate output of each job is written to HDFS

Query Response Time

179 total mappers were executed


Query 27 Execution With Hive 11 - Text Format

The query spawned 1 job with Hive 11, compared to 5 MR jobs with Hive 10

Job 1 of 1: Each mapper loads the 4 small dimension tables into memory and streams parts of the large fact table. The joins then occur in the mapper, hence the name MapJoin

Performance increased with Hive 11: query time went down from 21 minutes to about 4 minutes
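The MapJoin idea described above (small tables held in memory, large table streamed) can be sketched in plain Python. This is a toy model of the technique, not Hive code; the function `map_join`, the column names and the sample rows are all hypothetical.

```python
def map_join(fact_rows, dimensions):
    """Stream fact rows and join them against in-memory dimension tables.

    `dimensions` maps a fact-table join column to a dict {key: dimension_row},
    standing in for the small tables each mapper loads into memory.
    """
    for fact in fact_rows:
        joined = dict(fact)
        for join_col, table in dimensions.items():
            match = table.get(fact[join_col])  # hash lookup, no shuffle needed
            if match is None:
                break                          # inner join: drop unmatched rows
            joined.update(match)
        else:
            yield joined


# Hypothetical data: one streamed fact table and two small dimension tables.
fact = [
    {"item_id": 1, "store_id": 10, "quantity": 3},
    {"item_id": 2, "store_id": 10, "quantity": 4},   # item 2 has no dimension row
    {"item_id": 1, "store_id": 11, "quantity": 5},
]
dims = {
    "item_id": {1: {"item_name": "pen"}},
    "store_id": {10: {"state": "CA"}, 11: {"state": "NY"}},
}
result = list(map_join(fact, dims))
for row in result:
    print(row)
```

Because every lookup is a local dictionary access, the join needs no reduce phase, which is consistent with the drop from 5 jobs to 1 reported on the slide.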


Query 27 Execution With Hive 11 - RC Format

Conversion from Text to RC file format decreased the size of the dimension data set from 38 GB to 8.21 GB

A smaller file means less I/O, which decreased the query time from 246 seconds to 136 seconds


Query 27 Execution With Hive 11 - ORC Format

The ORC file type consolidates data more tightly than RCFile: the size of the dataset decreased from 8.21 GB to 2.83 GB

A smaller file means less I/O, which decreased the query time from 136 seconds to 104 seconds


Summary of Results

File Type                          Number of MR Jobs   Input Size   Mappers   Time
Text/Hive 10                       5                   43.1 GB      179       1260 seconds
Text/Hive 11                       1                   38 GB        151       246 seconds
RC/Hive 11                         1                   8.21 GB      76        136 seconds
ORC/Hive 11                        1                   2.83 GB      38        104 seconds
RC/Hive 11/Partitioned/Bucketed    1                   1.73 GB      19        104 seconds
ORC/Hive 11/Partitioned/Bucketed   1                   687 MB       27        79.62 seconds
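As a quick worked check of the summary figures, the speedup of each configuration over the Text/Hive 10 baseline can be computed directly from the reported times. The snippet below only does that arithmetic; it assumes the last row's 79.62 is in seconds like the other rows.

```python
# Query times in seconds, copied from the summary of results.
timings = {
    "Text/Hive 10": 1260,
    "Text/Hive 11": 246,
    "RC/Hive 11": 136,
    "ORC/Hive 11": 104,
    "ORC/Hive 11/Partitioned/Bucketed": 79.62,  # unit assumed to be seconds
}

baseline = timings["Text/Hive 10"]
# Speedup = baseline time divided by the configuration's time.
speedups = {name: round(baseline / seconds, 1) for name, seconds in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor}x faster than Text/Hive 10")
```

On these numbers the single-job MapJoin plan accounts for roughly a 5x gain, and the file format, partitioning and bucketing changes take the total to roughly 16x.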


ORC file format
Optimized RC File


ORCFile - Optimized Column Storage

• Make a better columnar storage file
– Evolve based on Google Dremel format

• Decompose complex row types into primitive fields
– Better compression and projection

• Only read bytes from HDFS for the required columns

• Store column level aggregates in the files
– Only need to read the file meta information for common queries
– Stored both for file and each section of a file
– Aggregates: min, max, sum, average, count
– Allows fast access by sorted columns

• Ability to add bloom filters for columns
– Enables quick checks for whether a value is present
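The benefit of storing min/max aggregates per section of a file can be illustrated with a small Python sketch: a reader consults the stored aggregates first and skips any section that cannot contain the requested value. This is a simplified model, not the ORC reader; the `Section` class and `scan_equal` function are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One section of a column: its values plus precomputed aggregates."""
    values: list

    def __post_init__(self):
        # Aggregates are computed once at write time and stored with the data.
        self.min = min(self.values)
        self.max = max(self.values)
        self.count = len(self.values)


def scan_equal(sections, target):
    """Return matching values and how many sections were actually read."""
    matches, sections_read = [], 0
    for section in sections:
        if target < section.min or target > section.max:
            continue  # aggregates prove the value cannot be here; skip the read
        sections_read += 1
        matches.extend(v for v in section.values if v == target)
    return matches, sections_read


# Hypothetical sorted column split into three sections.
sections = [Section([1, 3, 5]), Section([6, 8, 9]), Section([10, 12, 15])]
matches, sections_read = scan_equal(sections, 8)
print(matches, sections_read)
```

With sorted data the min/max ranges do not overlap, so only one of the three sections is read, which matches the "fast access by sorted columns" point above.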



ORCFile - File Layout



Interactive Query at Scale

Sustained Query Times: Apache Hive 0.12 provides sustained acceptable query times even at petabyte scale

Smaller Footprint: Better encoding with ORC in Apache Hive 0.12 reduces resource requirements for your cluster

File Size Comparison Across Encoding Methods
Dataset: TPC-DS Scale 500 Dataset

Encoded with Text: 585 GB (Original Size)
Encoded with RCFile: 505 GB (14% Smaller)
Encoded with Parquet (Impala): 221 GB (62% Smaller)
Encoded with ORCFile (Hive 12): 131 GB (78% Smaller)

• Larger Block Sizes

• Columnar format arranges columns adjacent within the file for compression & fast access


Moving Hadoop Beyond MapReduce

• Low level data-processing execution engine
• Built on YARN

• Enables pipelining of jobs
• Removes task and job launch times
• Does not write intermediate output to HDFS
– Much lighter disk and network usage

• New base for MapReduce, Hive, Pig, Cascading etc.
• Hive and Pig jobs no longer need to move to the end of the queue between steps in the pipeline
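The difference between writing every intermediate result out and pipelining the stages can be shown with a toy Python model. It is not the Tez API: the two runner functions, the stage functions and the write counter are all assumptions for illustration, with the counter standing in for writes of job output to HDFS.

```python
def run_materialized(stages, records):
    """MapReduce-style chain: each stage's full output is stored before the next starts."""
    hdfs_writes = 0
    data = list(records)
    for stage in stages:
        data = list(stage(data))  # full intermediate result is materialized
        hdfs_writes += 1          # stands in for one job writing its output to HDFS
    return data, hdfs_writes


def run_pipelined(stages, records):
    """Pipelined chain: stages are connected lazily; only the final output is stored."""
    stream = iter(records)
    for stage in stages:
        stream = stage(stream)    # chained generators, nothing materialized here
    return list(stream), 1        # a single write for the final result


# Hypothetical three-step pipeline.
def keep_even(rows):
    return (r for r in rows if r % 2 == 0)


def square(rows):
    return (r * r for r in rows)


def add_one(rows):
    return (r + 1 for r in rows)


stages = [keep_even, square, add_one]
materialized_result, materialized_writes = run_materialized(stages, range(6))
pipelined_result, pipelined_writes = run_pipelined(stages, range(6))
print(materialized_result, materialized_writes)
print(pipelined_result, pipelined_writes)
```

Both runners produce the same answer; the model only differs in how many times intermediate data is stored, which is the cost the bullets above say is removed.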


FastQuery: Beyond Batch with YARN

Tez Generalizes Map-Reduce: Simplified execution plans process data more efficiently

Always-On Tez Service: Low latency processing for all Hadoop data processing


Apache Tez as the new Primitive

HADOOP 1.0 (MapReduce as Base)
• Pig (data flow), Hive (sql), Others (cascading)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)

HADOOP 2.0 (Apache Tez as Base)
• Batch (MapReduce), Data Flow (Pig), SQL (Hive), Others (cascading), Real Time Stream Processing (Storm), Online Data Processing (HBase, Accumulo)
• Tez (execution engine)
• YARN (cluster resource management)
• HDFS2 (redundant, reliable storage)


Hive-on-MR vs. Hive-on-Tez

SELECT a.x, AVERAGE(b.y) AS avg
FROM a JOIN b ON (a.id = b.id)
GROUP BY a
UNION
SELECT x, AVERAGE(y) AS AVG
FROM c
GROUP BY x
ORDER BY AVG;

[Figure: execution plans for the same query, Hive – MR on one side and Hive – Tez on the other. Hive – MR runs a chain of separate map (M) and reduce (R) jobs with HDFS between them; Hive – Tez runs the M and R stages as a single graph. Stage labels: SELECT a.state; SELECT c.price; SELECT a.state, c.itemId; SELECT b.id; JOIN (a, c); JOIN (a, b); GROUP BY a.state; COUNT(*); AVERAGE(c.price).]

Tez avoids unneeded writes to HDFS


Speed: Interactive Query In Hadoop

Query             Hive 10   Hive 0.11 (Phase 1)   Trunk (Phase 3)   Improvement
TPC-DS Query 27   1400s     39s                   7.2s              190x
TPC-DS Query 82   3200s     65s                   14.9s             200x

Query 27: Pricing Analytics using Star Schema Join
Query 82: Inventory Analytics Joining 2 Large Fact Tables

All Results at Scale Factor 200 (Approximately 200 GB Data)

Test Cluster:
• 200 GB Data (ORCFile)
• 20 Nodes, 24 GB RAM each, 6x disk each


Thank You
hortonworks.com
hortonworks.com/sandbox
