Transactional SQL in Apache Hive

Transactional Operations in Hive Eugene Koifman June 2017

Upload: dataworks-summit

Post on 21-Jan-2018

Page 1: Transactional SQL in Apache Hive

Transactional Operations in Hive

Eugene Koifman

June 2017

Page 2: Transactional SQL in Apache Hive

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Agenda

Motivations/Goals

End user point of view

Design

Performance Improvements/Results

Roadmap

Page 3: Transactional SQL in Apache Hive

Motivations

Modifying existing data

– INSERT OVERWRITE TABLE Target SELECT * FROM Target WHERE …

• Delete – OK, Update - ?

• Concurrency

– Hope for the best (multiple updates)

– ZooKeeper lock manager S/X locks – restrictive

• Expensive to do repeatedly (write side)

Page 4: Transactional SQL in Apache Hive

Motivations

Continuously adding new data to Hive in the past

– ALTER TABLE Target ADD PARTITION (dt='2016-06-30')

• Lots of files – bad for performance

• Fewer files – users wait longer to see latest data

– INSERT INTO Target SELECT * FROM Staging

Page 5: Transactional SQL in Apache Hive

Merge Statement – SQL Standard 2011 (Hive 2.2)

Target (before):

ID   State  County   Value
1    CA     LA       19.0
2    MA     Norfolk  15.0
7    MA     Suffolk  50.15
16   CA     Orange   9.1

Source:

ID   State  Value
1           20.0
7           80.0
100  NH     6.0

MERGE INTO TARGET T
USING SOURCE S ON T.ID=S.ID
WHEN MATCHED THEN
  UPDATE SET T.Value=S.Value
WHEN NOT MATCHED THEN
  INSERT (ID,State,Value)
  VALUES(S.ID, S.State, S.Value)

Target (after):

ID   State  County   Value
1    CA     LA       20.0
2    MA     Norfolk  15.0
7    MA     Suffolk  80.0
16   CA     Orange   9.1
100  NH     null     6.0

Page 6: Transactional SQL in Apache Hive

Goals

Make above use cases easy and efficient

Key Requirement

– Long running analytics queries should run concurrently with update commands

NOT OLTP!!!

– Support slowly changing tables

– Not for 100s of concurrent queries trying to update the same partition

Page 7: Transactional SQL in Apache Hive

System at High Level

A new type of table that supports Insert/Update/Delete/Merge SQL operations

Concept of ACID transaction

– Atomic, Consistent, Isolated, Durable

Streaming Ingest API

– Write a continuous stream of events to Hive in micro batches with transactional semantics

Page 8: Transactional SQL in Apache Hive

User Point of View

CREATE TABLE T(a int, b int) CLUSTERED BY (b) INTO 8 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true');

Not all tables support transactional semantics

Table must be bucketed

Table cannot be sorted

Currently requires ORC, but in principle any file format that implements AcidInputFormat/AcidOutputFormat can be used

autoCommit=true

Transactions run at Snapshot Isolation

– Lock in the state of the DB as of the start of the query for the duration of the query

– Between Serializable and Repeatable Read
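A minimal sketch of what "lock in the state of the DB as of the start of the query" can mean in practice. This is a toy model, not Hive code: `TxnManager`, `snapshot`, and `visible_rows` are hypothetical names, and the snapshot is assumed to be simply the set of transaction IDs committed when the query starts.

```python
from dataclasses import dataclass, field


@dataclass
class TxnManager:
    """Toy transaction manager: hands out IDs and tracks committed ones."""
    next_id: int = 1
    committed: set = field(default_factory=set)

    def begin(self) -> int:
        txn_id = self.next_id
        self.next_id += 1
        return txn_id

    def commit(self, txn_id: int) -> None:
        self.committed.add(txn_id)

    def snapshot(self) -> frozenset:
        # Freeze the set of committed transactions at query start.
        return frozenset(self.committed)


def visible_rows(rows, snapshot):
    """Keep only values written by transactions inside the reader's snapshot."""
    return [value for txn_id, value in rows if txn_id in snapshot]


tm = TxnManager()
rows = []                       # (writer_txn_id, value) pairs

t1 = tm.begin()
rows.append((t1, "foo"))
tm.commit(t1)

snap = tm.snapshot()            # a long-running query starts here

t2 = tm.begin()                 # a concurrent writer commits afterwards
rows.append((t2, "bar"))
tm.commit(t2)

result_old = visible_rows(rows, snap)            # only t1's row
result_new = visible_rows(rows, tm.snapshot())   # a later query sees both
print(result_old, result_new)
```

The long-running reader keeps a stable view for its whole duration, while the writer is never blocked by it.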

Page 9: Transactional SQL in Apache Hive

Design

Page 10: Transactional SQL in Apache Hive

Design

Transaction Manager

– Begin transaction and obtain a transaction ID

Storage layer enhanced to support MVCC architecture

– Each row is tagged with unique ROW_ID (internal)

– Multiple versions of each row to allow concurrent readers and writers

– Result of each write is stored in a new Delta file

Page 11: Transactional SQL in Apache Hive

How ACID is implemented in Hive?

CREATE TABLE acidtbl (a INT, b STRING) CLUSTERED BY (a) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true');

ACID Metadata Columns

– original_transaction_id

– bucket_id

– row_id

– current_transaction_id

ACID_PK = { original_transaction_id, bucket_id, row_id }

User Columns

– col_1: a : INT

– col_2: b : STRING

Page 12: Transactional SQL in Apache Hive

How ACID is implemented in Hive?

INSERT INTO acidtbl (a,b) VALUES (100, "foo"), (200, "xyz"), (300, "bee");

delta_00001_00001/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 2 }  300   "bee"

Page 13: Transactional SQL in Apache Hive

How ACID is implemented in Hive?

UPDATE acidTbl SET b = "bar" where a = 300;

delta_00001_00001/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 2 }  300   "bee"

delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 1, 0, 2 }  300   "bar"

Page 14: Transactional SQL in Apache Hive

How ACID is implemented in Hive?

DELETE FROM acidTbl where a = 200;

delta_00001_00001/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 2 }  300   "bee"

delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 1, 0, 2 }  300   "bar"

delta_00003_00003/bucket_0000

ACID_PK      a     b
{ 1, 0, 1 }  null  null

Page 15: Transactional SQL in Apache Hive

How ACID is implemented in Hive?

SELECT * FROM acidtbl;

delta_00001_00001/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 2 }  300   "bee"

delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 1, 0, 2 }  300   "bar"

delta_00003_00003/bucket_0000

ACID_PK      a     b
{ 1, 0, 1 }  null  null

Rows from all deltas, merged in ACID_PK order, and the resulting query output:

ACID_PK      a     b          result
{ 1, 0, 0 }  100   "foo"  ->  100 "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 1 }  null  null
{ 1, 0, 2 }  300   "bee"
{ 1, 0, 2 }  300   "bar"  ->  300 "bar"
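The read path on this slide can be sketched as a sorted merge over the delta files in which the newest event for each ACID_PK wins. The following is an illustrative toy model only, not Hive's reader: the tuple layout, the `is_delete` flag, and the `read_table` helper are assumptions made for the example.

```python
import heapq

# Each event: (acid_pk, writer_txn, is_delete, a, b); every delta is sorted by acid_pk.
delta_1 = [((1, 0, 0), 1, False, 100, "foo"),
           ((1, 0, 1), 1, False, 200, "xyz"),
           ((1, 0, 2), 1, False, 300, "bee")]
delta_2 = [((1, 0, 2), 2, False, 300, "bar")]   # written by the UPDATE
delta_3 = [((1, 0, 1), 3, True, None, None)]    # written by the DELETE


def read_table(*deltas):
    """Merge sorted deltas and keep only the newest version of each row."""
    latest = {}
    ordered = heapq.merge(*deltas, key=lambda e: (e[0], e[1]))
    for pk, _txn, is_delete, a, b in ordered:
        # Events for one key arrive oldest first, so the last one wins.
        latest[pk] = None if is_delete else (a, b)
    return [row for row in latest.values() if row is not None]


result = read_table(delta_1, delta_2, delta_3)
print(result)
```

Every additional delta adds another input to this merge, which is why the number of delta files directly affects read cost.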

Page 16: Transactional SQL in Apache Hive

Design - Compactor

More operations = more delta files, which makes reads more expensive

Page 17: Transactional SQL in Apache Hive

Design - Compactor

ALTER TABLE acidTbl COMPACT 'MAJOR';

delta_00001_00001/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 2 }  300   "bee"

delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 1, 0, 2 }  300   "bar"

delta_00003_00003/bucket_0000

ACID_PK      a     b
{ 1, 0, 1 }  null  null

After major compaction:

base_00003/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 2 }  300   "bar"

Page 18: Transactional SQL in Apache Hive

Design - Compactor

Compactor rewrites the table in the background

– Minor compaction merges delta files into fewer deltas

– Major compaction merges deltas with base - more expensive

– This amortizes the cost of updates and self tunes the tables

• Makes ORC more efficient - larger stripes, better compression

Compaction can be triggered automatically or on demand

– There are various configuration options to control when the process kicks in.

– Compaction itself is a Map-Reduce job

Key design principle is that compactor does not affect readers/writers

Cleaner process – removes obsolete files
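The difference between minor and major compaction can be illustrated with a small toy model. This is a sketch under stated assumptions, not the Hive compactor (which, per the slide, runs as a Map-Reduce job): `minor_compact`, `major_compact`, and the event tuple layout are hypothetical.

```python
# Each event: (acid_pk, writer_txn, is_delete, a, b)
delta_1 = [((1, 0, 0), 1, False, 100, "foo"),
           ((1, 0, 1), 1, False, 200, "xyz"),
           ((1, 0, 2), 1, False, 300, "bee")]
delta_2 = [((1, 0, 2), 2, False, 300, "bar")]
delta_3 = [((1, 0, 1), 3, True, None, None)]


def minor_compact(deltas):
    """Merge several deltas into one delta, keeping every event."""
    merged = [event for delta in deltas for event in delta]
    merged.sort(key=lambda e: (e[0], e[1]))   # by ACID_PK, then transaction
    return merged


def major_compact(base, deltas):
    """Fold all deltas into a new base holding only the latest live rows."""
    latest = {pk: (a, b) for pk, a, b in base}
    for pk, _txn, is_delete, a, b in minor_compact(deltas):
        if is_delete:
            latest.pop(pk, None)
        else:
            latest[pk] = (a, b)
    return [(pk, a, b) for pk, (a, b) in sorted(latest.items())]


one_delta = minor_compact([delta_1, delta_2, delta_3])
new_base = major_compact([], [delta_1, delta_2, delta_3])
print(len(one_delta), new_base)
```

Minor compaction only reduces the number of files a reader must merge; major compaction also discards superseded and deleted row versions, which is why it costs more.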

Page 19: Transactional SQL in Apache Hive

Design - Concurrency

Transaction Manager

– manages transaction ID assignment

– keeps track of transaction state: open, committed, aborted

Lock Manager

– DDL operations acquire eXclusive locks

– Read operations acquire Shared locks

– Also locks non transactional tables – different logic

• hive.txn.strict.locking.mode

State of both persisted in Hive Metastore

Page 20: Transactional SQL in Apache Hive

Design - Concurrency

Write Set tracking to prevent Write-Write conflicts in concurrent transactions

Note that 2 Inserts are never in conflict since Hive does not enforce unique constraints.

You are allowed to read acid and non-acid tables in same query.

You cannot write to acid and non-acid tables at the same time (multi-insert statement)
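Write-set tracking can be sketched as follows. This is a simplified illustration rather than Hive's implementation: it assumes a first-committer-wins rule and tracks conflicts per (table, partition) key, and `WriteSetTracker` and `ConflictError` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when a committing transaction lost a write-write race."""


class WriteSetTracker:
    def __init__(self):
        self.clock = 0
        self.committed = []          # (commit_time, frozenset of written keys)

    def begin(self):
        self.clock += 1
        return {"start": self.clock, "writes": set()}

    def record(self, txn, op, key):
        # Inserts never conflict, so only update/delete enter the write set.
        if op in ("update", "delete"):
            txn["writes"].add(key)

    def commit(self, txn):
        for commit_time, keys in self.committed:
            # Conflict: someone committed an overlapping write after we started.
            if commit_time > txn["start"] and keys & txn["writes"]:
                raise ConflictError(sorted(keys & txn["writes"]))
        self.clock += 1
        self.committed.append((self.clock, frozenset(txn["writes"])))


tracker = WriteSetTracker()
part = ("acidtbl", "dt=2016-06-30")

t1, t2 = tracker.begin(), tracker.begin()
tracker.record(t1, "update", part)
tracker.record(t2, "delete", part)
tracker.commit(t1)                   # first committer wins
try:
    tracker.commit(t2)
    second_committed = True
except ConflictError:
    second_committed = False         # t2 must abort and retry

t3, t4 = tracker.begin(), tracker.begin()
tracker.record(t3, "insert", part)
tracker.record(t4, "insert", part)
tracker.commit(t3)
tracker.commit(t4)                   # two inserts never conflict
print(second_committed)
```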

Page 21: Transactional SQL in Apache Hive

Design - Streaming Ingest

Allows you to continuously write events to a Hive table

– Can commit periodically to make writes durable/visible

– Can also call abort to make writes since last commit/abort invisible.

– Optimized so that it can handle writing micro batches of events - every second.

• Multiple transactions are written to one file

– Only supports adding new data

Streaming tools like NiFi, Storm and Flume rely on this API to ingest data into Hive

This API is public so it can be used directly

Data written via Streaming API has the same transactional semantics as SQL side
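The commit/abort behaviour described above can be modelled in a few lines. This is a toy illustration of the visibility semantics only; it does not use or mirror the actual Hive Streaming API, and `MicroBatchWriter` is a hypothetical class.

```python
class MicroBatchWriter:
    """Toy append-only writer: events become visible only on commit."""

    def __init__(self):
        self.visible = []    # what concurrent readers can see
        self.pending = []    # written since the last commit/abort

    def write(self, event):
        self.pending.append(event)

    def commit(self):
        # Make everything written since the last commit/abort visible.
        self.visible.extend(self.pending)
        self.pending.clear()

    def abort(self):
        # Discard everything written since the last commit/abort.
        self.pending.clear()


writer = MicroBatchWriter()
writer.write("event-1")
writer.write("event-2")
writer.commit()              # first micro batch becomes visible

writer.write("event-3")
writer.abort()               # this batch is never seen by readers

writer.write("event-4")
writer.commit()
print(writer.visible)
```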

Page 22: Transactional SQL in Apache Hive

Merge Statement – SQL Standard 2011 (Hive 2.2)

Target (before):

ID   State  County   Value
1    CA     LA       19.0
2    MA     Norfolk  15.0
7    MA     Suffolk  50.15
16   CA     Orange   9.1

Source:

ID   State  Value
1           20.0
7           80.0
100  NH     6.0

MERGE INTO TARGET T
USING SOURCE S ON T.ID=S.ID
WHEN MATCHED THEN
  UPDATE SET T.Value=S.Value
WHEN NOT MATCHED THEN
  INSERT (ID,State,Value)
  VALUES(S.ID, S.State, S.Value)

Target (after):

ID   State  County   Value
1    CA     LA       20.0
2    MA     Norfolk  15.0
7    MA     Suffolk  80.0
16   CA     Orange   9.1
100  NH     null     6.0

Page 23: Transactional SQL in Apache Hive

SQL Merge

Target Right Outer Join Source ON T.ID=S.ID

delta_00002_00002/bucket_0000

ACID_PK      ID   State  County   Value
{ 1, 0, 1 }  1    CA     LA       20.0
{ 1, 0, 3 }  7    MA     Suffolk  80.0

delta_00002_00002_001/bucket_0000

ACID_PK      ID   State  County   Value
{ 2, 0, 1 }  100  NH              6.0
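The way a right outer join drives MERGE can be sketched in plain Python: every source row appears in the join output, matched rows become updates, and unmatched rows become inserts. This is an illustrative model using the example data from the slides; the `merge` function and the dictionary layout are assumptions, not Hive internals.

```python
target = {
    1:  {"ID": 1,  "State": "CA", "County": "LA",      "Value": 19.0},
    2:  {"ID": 2,  "State": "MA", "County": "Norfolk", "Value": 15.0},
    7:  {"ID": 7,  "State": "MA", "County": "Suffolk", "Value": 50.15},
    16: {"ID": 16, "State": "CA", "County": "Orange",  "Value": 9.1},
}
source = [
    {"ID": 1,   "State": None, "Value": 20.0},
    {"ID": 7,   "State": None, "Value": 80.0},
    {"ID": 100, "State": "NH", "Value": 6.0},
]


def merge(target, source):
    """Target RIGHT OUTER JOIN Source ON T.ID = S.ID, split by match status."""
    updates, inserts = [], []
    for s in source:                       # every source row is kept
        t = target.get(s["ID"])
        if t is not None:                  # WHEN MATCHED THEN UPDATE
            updates.append({**t, "Value": s["Value"]})
        else:                              # WHEN NOT MATCHED THEN INSERT
            inserts.append({"ID": s["ID"], "State": s["State"],
                            "County": None, "Value": s["Value"]})
    result = dict(target)
    for row in updates + inserts:
        result[row["ID"]] = row
    return updates, inserts, result


updates, inserts, merged = merge(target, source)
print([r["ID"] for r in updates], [r["ID"] for r in inserts])
```

Both branches are produced by a single pass over the join output, in contrast to issuing one statement per changed row.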

Page 24: Transactional SQL in Apache Hive

W/o MERGE – much less efficient

UPDATE Target set Value= 20.0 where ID = 1;

UPDATE Target set Value = 80.0 where ID = 7;

INSERT INTO Target (ID, State, Value) VALUES(100, 'NH', 6.0);

Page 25: Transactional SQL in Apache Hive

Performance

Page 26: Transactional SQL in Apache Hive

Work-In-Progress

Split an update into combination of delete and insert

UPDATE acidTbl SET b = "bar" where a = 300;

Before (update written as a new version of the row):

delta_00001_00001/bucket_0000

ACID_PK      a     b
{ 1, 0, 0 }  100   "foo"
{ 1, 0, 1 }  200   "xyz"
{ 1, 0, 2 }  300   "bee"

delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 1, 0, 2 }  300   "bar"

With split update (delete + insert):

delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 2, 0, 0 }  300   "bar"

delete_delta_00002_00002/bucket_0000

ACID_PK      a     b
{ 1, 0, 2 }  null  null

Enabled PPD

Splits for Delta files
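A rough sketch of the split-update idea: the update writes a delete event for the old ACID_PK plus a freshly keyed insert, and a reader filters existing rows against the set of deleted keys. The helpers `split_update` and `read_rows` are hypothetical, and the bucket ID is assumed to be 0 as in the slides.

```python
delta_1 = [
    {"pk": (1, 0, 0), "a": 100, "b": "foo"},
    {"pk": (1, 0, 1), "a": 200, "b": "xyz"},
    {"pk": (1, 0, 2), "a": 300, "b": "bee"},
]


def split_update(rows, predicate, new_values, txn_id):
    """Turn an UPDATE into delete events plus newly keyed inserts."""
    insert_delta, delete_delta = [], []
    for row in rows:
        if predicate(row):
            delete_delta.append(row["pk"])            # delete the old version
            new_pk = (txn_id, 0, len(insert_delta))   # new row ID in this txn
            insert_delta.append({**row, **new_values, "pk": new_pk})
    return insert_delta, delete_delta


def read_rows(existing, insert_delta, delete_delta):
    """Inserted rows are read as-is; deleted keys are filtered out."""
    deleted = set(delete_delta)
    return [(r["a"], r["b"]) for r in existing + insert_delta
            if r["pk"] not in deleted]


insert_delta, delete_delta = split_update(
    delta_1, lambda r: r["a"] == 300, {"b": "bar"}, txn_id=2)
current = read_rows(delta_1, insert_delta, delete_delta)
print(insert_delta, delete_delta, current)
```

Because insert deltas then contain only complete, current rows, they can be read like ordinary data files, with the delete events applied as a separate filter.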

Page 27: Transactional SQL in Apache Hive

Benefits

Improved PPD

Better Network Utilization

Better Memory Utilization

Full Vectorization of Reads

Updating bucket/partition columns

Page 28: Transactional SQL in Apache Hive

Performance

TPC-H Benchmark

– 10 node cluster at Scale Factor 1000 (1 TB of data)

– 11 delta files with 90 GB data each

Page 29: Transactional SQL in Apache Hive

Future Work

Multi statement transactions, i.e. BEGIN TRANSACTION/COMMIT/ROLLBACK

Performance

– Smarter Compaction

Finer grained concurrency management/conflict detection

Read Committed w/Lock Based scheduling

Better Monitoring/Alerting

LOAD DATA … support

Optional bucketing

SMB support – user defined sort order

Page 30: Transactional SQL in Apache Hive

Further Reading

Page 31: Transactional SQL in Apache Hive

Etc

Documentation

– https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions

– https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest

Follow/Contribute

– https://issues.apache.org/jira/browse/HIVE-14004?jql=project%20%3D%20HIVE%20AND%20component%20%3D%20Transactions

[email protected]

[email protected]

Page 32: Transactional SQL in Apache Hive

Thank You