OmidEfficient Transaction Management and
Incremental Processing for HBase
Copyright © 2013 Yahoo! All rights reserved. No reproduction or distribution allowed without express written permission.
Daniel Gómez Ferro
21/03/2013
2Omid - Yahoo! Research
About me
Research engineer at Yahoo!
Main Omid developer
Apache S4 committer
Contributor to:• ZooKeeper• HBase
3Omid - Yahoo! Research
Outline
Motivation
Goal: transactions on HBase
Architecture
Examples
Future work
4Omid - Yahoo! Research
Motivation
Traditional DBMS:• SQL• Transactions• Scalable
NoSQL stores:• SQL• Transactions• Scalable
Some use cases require • Scalability• Consistent updates
5Omid - Yahoo! Research
Motivation
Requested feature• Support for transactions in HBase is a common request
Percolator• Google implemented transactions on BigTable
Incremental Processing• MapReduce latencies measured in hours• Reduce latency to seconds• Requires transactions
6Omid - Yahoo! Research
Incremental Processing
State0 State1SharedState
OfflineMapReduce
OnlineIncremental Processing
Fault toleranceScalability
Fault toleranceScalability
Shared state
Using Percolator, Google dramatically reduced crawl-to-index time
Transactions on HBase
8Omid - Yahoo! Research
Omid
‘Hope’ in Persian
Optimistic Concurrency Control system• Lock free• Aborts transactions at commit time
Implements Snapshot Isolation
Low overhead
http://yahoo.github.com/omid
9Omid - Yahoo! Research
Snapshot Isolation
Transactions read from their own snapshot
Snapshot:• Immutable• Identified by creation time• Values committed before creation time
Transactions conflict if two conditions apply:A) Overlap in time
B) Write to the same row
10Omid - Yahoo! Research
Transaction conflicts
r1 r2 r3 r4Time
Transaction B overlaps with A and C• B ∩ A = ∅• B ∩ C = { r4 }• B and C conflict
Transaction D doesn’t conflict
Write set:
r3 r4
r2 r4
r1 r3
Id:
A
B
C
D
Time overlap:
Omid Architecture
12
OmidClientsOmid
Clients
Architecture
1. Centralized server
2. Transactional metadata replicated to clients
3. Store transaction ids in HBase timestamps
HBaseRegionServers
Status Oracle (SO)
HBaseRegionServers
HBaseRegionServers
HBaseRegionServers
(1) (2)
(3)
Omid - Yahoo! Research
OmidClients
SO
13Omid - Yahoo! Research
Status Oracle
Limited memory• Evict old metadata• … maintaining consistency guarantees!
Keep LowWatermark • Updated on metadata eviction• Abort transactions older than LowWatermark• Necessary for Garbage Collection
14Omid - Yahoo! Research
Transactional Metadata
Status Oracle
Row LastCommitTS
r1 T20
… …
StartTS CommitTS
T10 T20
… …
Aborted
T2 T18 …
LowWatermark
T4
15Omid - Yahoo! Research
Metadata Replication
Transactional metadata is replicated to clients
Status Oracle not in critical path• Minimize overhead
Clients guarantee Snapshot Isolation• Ignore data not in their snapshot• Talk directly to HBase• Conflicts resolved by the Status Oracle
Expensive, but scalable up to 1000 clients
16
OmidClientsOmid
Clients
Architecture recap
HBaseRegionServers
HBaseRegionServers
HBaseRegionServers
HBaseRegionServers
Omid - Yahoo! Research
OmidClients
SO
Row LastCommitTS
r1 T20
… …
StartTS CommitTS
T10 T20
… …
Aborted
T2 T18 …
LowWatermark
T4
Status Oracle
17Omid - Yahoo! Research
Architecture + Metadata
Status Oracle
R LC
r1 20
ST CT
10 20
Abort
…
LowW
…
ST CT
10 20
Abort
…
LowW
…
Client
Conflict detection(not replicated)
HBase
row TS val
r1 10 hi
HBase values tagged with Start TimeStamp
18Omid - Yahoo! Research
Comparison with Percolator
Omid
ClientsHBase
StatusOracle
Percolator
ClientsBigTable
TS Server
19Omid - Yahoo! Research
Simple API
TransactionManager:Transaction begin();
void commit(Transaction t) throws RollbackException;
void rollback(Transaction t);
TTable:Result get(Transaction t, Get g);
void put(Transaction t, Put p);
ResultScanner getScanner(Transaction t, Scan s);
Example Transaction
21Omid - Yahoo! Research
ST CT
10 20
Abort
…
LowW
…
Client
Transaction Example
Status Oracle
R LC
r1 20
ST CT
10 20
Abort
…
LowW
…
HBase
row TS val
r1 10 hi
>> |
22Omid - Yahoo! Research
Status Oracle
ST CT
10 20
Abort
…
LowW
…
Client
Transaction Example
R LC
r1 20
ST CT
10 20
Abort
…
LowW
…
>> t = TM.begin()Transaction { ST: 25 }>>
t ST: 25
|
HBase
row TS val
r1 10 hi
23Omid - Yahoo! Research
Status Oracle
ST CT
10 20
Abort
…
LowW
…
Client
Transaction Example
R LC
r1 20
ST CT
10 20
Abort
…
LowW
…
>> t = TM.begin()Transaction { ST: 25 }>> TTable.get(t, ‘r1’)‘hi’>>
t ST: 25
|
HBase
row TS val
r1 10 hi
24Omid - Yahoo! Research
Status Oracle
ST CT
10 20
Abort
…
LowW
…
Client
Transaction Example
R LC
r1 20
ST CT
10 20
Abort
…
LowW
…
>> t = TM.begin()Transaction { ST: 25 }>> TTable.get(t, ‘r1’)‘hi’>> TTable.put(t, ‘r1’, ‘bye’)>>
t ST: 25R: {r1}
|
HBase
row TS val
r1 10 hi
r1 25 bye
25Omid - Yahoo! Research
Status Oracle
ST CT
10 20
Abort
…
LowW
…
Client
Transaction Example
R LC
r1 30
ST CT
25 30
Abort
…
LowW
20
>> TTable.get(t, ‘r1’)‘hi’>> TTable.put(t, ‘r1’, ‘bye’)>> TM.commit(t)Committed{ CT: 30 }>>
t ST: 25R: {r1}
|
HBase
row TS val
r1 10 hi
r1 25 bye
26Omid - Yahoo! Research
Status Oracle
ST CT
10 20
Abort
…
LowW
…
Client
Transaction Example
Abort
15
>> t_oldTransaction{ ST: 15 }>> TT.put(t_old, ‘r1’, ‘yo’)>> TM.commit(t_old)Aborted>>
t_oldST: 15R: {r1}HBase
row TS val
r1 15 yo
r1 25 bye
R LC
r1 30
ST CT
25 30
LowW
…
|
Centralized Approach
27Omid - Yahoo! Research
28Omid - Yahoo! Research
Omid Performance
Centralized server = bottleneck?• Focus on good performance• In our experiments, it never became the bottleneck
512 rows/t 1K TPS10 ms
2 rows/t100K TPS
5 ms
29Omid - Yahoo! Research
Fault Tolerance
Omid uses BookKeeper for fault tolerance• BookKeeper is a distributed Write Ahead Log
Recovery: reads log’s tail (bounded time)
Downtime < 25 s
Example Use Case
News Recommendation System
31Omid - Yahoo! Research
News Recommendation System
Model-based recommendations• Similar users belong to a cluster• Each cluster has a trained model
New article• Evaluate against clusters• Recommend to users
User actions• Train model• Split cluster
32Omid - Yahoo! Research
Transactions Advantages
All operations are consistent• Updates and queries
Needed for a recommendation system?• We avoid corner cases• Implementing existing algorithms is straightforward
Problems we solve• Two operations concurrently splitting a cluster• Query while cluster splits• …
33Omid - Yahoo! Research
Example overview
Index
Articles & User actions
Queries
IndexDataStoreWor
kers
Updates
34Omid - Yahoo! Research
Other use cases
Update indexes incrementally• Original Percolator use case
Generic graph processing
Adapt existing transactional solutions to HBase
Future Work
36Omid - Yahoo! Research
Current Challenges
Load isn’t distributed automatically
Complex logic requires big transactions• More likely to conflict• More expensive to retry
API is low level for data processing
37Omid - Yahoo! Research
Future work
Framework for Incremental Processing
• Simpler API
• Trigger-based, easy to decompose operations
• Auto scalable
Improve user friendliness• Debug tools, metrics, etc
Hot failover
Questions?
All time contributorsBen Reed
Flavio Junqueira
Francisco Pérez-Sorrosal
Ivan Kelly
Matthieu Morel
Maysam Yabandeh
Partially supported byCumuloNimbo project (ICT-257993)
funded by the European Commission
Try it!yahoo.github.com/omid
Backup
39Omid - Yahoo! Research
40Omid - Yahoo! Research
Commit Algorithm
Receive commit request for txn_i
Set CommitTimestamp(txn_i) <= new timestamp
For each modified row:• if LastCommit(row) > StartTimestamp(txn_i) => abort
For each modified row:• Set LastCommit(row) <= CommitTimestamp(txn_i)