![Page 1: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/1.jpg)
Do Relational Databases Belong in the Cloud?
Michael Stiefel
www.reliablesoftware.com
![Page 2: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/2.jpg)
How do you model data in the cloud?
![Page 3: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/3.jpg)
Relational Model
A query operation on a relation (table) produces another relation (table).
Based on the relational algebra and calculus, a query engine can produce provably correct results.
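A minimal sketch of the closure property, using Python's built-in `sqlite3` module. The `events` table and its rows are hypothetical, invented only for illustration. The inner query produces a relation, and the outer query consumes it exactly as it would a stored table.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (artist TEXT, city TEXT, tickets_left INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("Ada", "Boston", 10), ("Ada", "New York", 0), ("Grace", "Boston", 5)],
)

# The inner SELECT yields a relation; the outer SELECT treats it like any table.
rows = conn.execute(
    """
    SELECT city, COUNT(*) AS shows
    FROM (SELECT artist, city FROM events WHERE tickets_left > 0) AS available
    GROUP BY city
    ORDER BY city
    """
).fetchall()
```

The query states what is wanted, not how to compute it, which leaves the engine free to choose the execution plan.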
![Page 4: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/4.jpg)
Declarative Language Allows Optimization
![Page 5: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/5.jpg)
Architectural Assumption:
Data Outlasts Implementation
Data Separate From Code
![Page 6: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/6.jpg)
Consistency Required
Transactional consistency
  No specification of insert, update or delete.
  Non clustered indices consistent with data
Design consistency
  Denormalized data must be kept consistent
  Lossless join decompositions
![Page 7: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/7.jpg)
Transactional Consistency Means Holding Database Locks
![Page 8: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/8.jpg)
Holding Locks Interferes With Availability and Scalability
![Page 9: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/9.jpg)
Do Availability and Consistency Conflict?
![Page 10: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/10.jpg)
Laws of Physics
Technology Limits
Economics
![Page 11: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/11.jpg)
Laws of Physics
![Page 12: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/12.jpg)
Latency Exists
Speed of light in fiber optic cable: 124,000 miles per second
Ideal ping Japan to Boston takes 100 ms.
Fetch 10 images for a web site: 1 second
Ignores latency of the operation
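The figures on this slide can be checked with back-of-the-envelope arithmetic. The sketch below assumes a distance of roughly 6,700 miles between Japan and Boston, which is not stated in the slides. It also assumes the ten images are fetched one after another, one round trip each.

```python
# Back-of-the-envelope latency arithmetic for the slide's figures.
FIBER_SPEED_MILES_PER_S = 124_000  # from the slide
# Assumption (not from the slides): roughly 6,700 miles from Japan to Boston.
JAPAN_BOSTON_MILES = 6_700


def round_trip_seconds(distance_miles: float,
                       speed: float = FIBER_SPEED_MILES_PER_S) -> float:
    """Ideal ping: there and back, ignoring routers and processing time."""
    return 2 * distance_miles / speed


def sequential_fetch_seconds(count: int, distance_miles: float) -> float:
    """Time for `count` requests issued one after another, one round trip each."""
    return count * round_trip_seconds(distance_miles)


ping = round_trip_seconds(JAPAN_BOSTON_MILES)
ten_images = sequential_fetch_seconds(10, JAPAN_BOSTON_MILES)
print(f"ideal ping: {ping * 1000:.0f} ms, ten sequential fetches: {ten_images:.2f} s")
```

Under these assumptions the ping comes out near the slide's 100 ms and the ten fetches near one second, before any server-side work is counted.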
![Page 13: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/13.jpg)
Bandwidth is Not Cheap
Shannon's Law: C = B log2(1 + S/N)
C: capacity (bits per second)
B: bandwidth (hertz)
Roughly 5 times the S/N to double capacity at a given bandwidth (the exact factor depends on the starting S/N)
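A short sketch of why doubling capacity is expensive at fixed bandwidth. Doubling C means squaring (1 + S/N), so the required S/N grows much faster than the capacity. The 1 MHz bandwidth and starting S/N of 3 are assumed example values, chosen because they reproduce the fivefold figure. In general the multiplier is S/N + 2.

```python
import math


def shannon_capacity(bandwidth_hz: float, snr: float) -> float:
    """Channel capacity in bits per second: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr)


def snr_to_double_capacity(snr: float) -> float:
    """S/N needed to double capacity at fixed bandwidth.

    Doubling C requires 1 + S'/N = (1 + S/N) ** 2.
    """
    return (1 + snr) ** 2 - 1


# Assumed example values: a 1 MHz channel with a linear S/N of 3.
base = shannon_capacity(1e6, 3)
needed = snr_to_double_capacity(3)      # 15, which is five times the original S/N
doubled = shannon_capacity(1e6, needed)
```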
![Page 14: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/14.jpg)
Latency is Not Bandwidth
Size of the shovel vs. how fast you can shovel
Infinite shovel capacity (bandwidth) is limited by how fast one can shovel (latency).
![Page 15: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/15.jpg)
Great Bandwidth Terrible Latency
Buy a two terabyte disk drive
Drive with it from Boston to New York
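The disk-in-a-car example can be put into numbers. The sketch assumes a four-hour drive and decimal terabytes. Neither figure comes from the slides.

```python
# Assumptions (not from the slides): a four-hour drive, decimal terabytes.
DISK_BYTES = 2 * 10**12
DRIVE_SECONDS = 4 * 3600


def effective_bandwidth_bits_per_s(payload_bytes: int, transit_seconds: float) -> float:
    """Bits delivered per second of transit time."""
    return payload_bytes * 8 / transit_seconds


bandwidth = effective_bandwidth_bits_per_s(DISK_BYTES, DRIVE_SECONDS)
latency = DRIVE_SECONDS  # the first byte arrives only when the car does
print(f"{bandwidth / 1e9:.2f} Gbit/s with a latency of {latency / 3600:.0f} hours")
```

The throughput is over a gigabit per second, yet nothing is readable until the whole trip is finished.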
![Page 16: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/16.jpg)
You can only move data so fast
You can only move so much data
![Page 17: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/17.jpg)
Technology Limits
![Page 18: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/18.jpg)
Connectivity is Not Always Available
Cell phone
Data Center Outages
Equipment Upgrades
Data redundancy to improve reliability
Offline mode on client for availability
![Page 19: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/19.jpg)
Expensive to Move Data
Data naturally lives in multiple places
Computational Power gets cheaper faster than network bandwidth
Cheaper to compute where data is instead of moving it
"Distributed Computing Economics", Jim Gray
![Page 20: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/20.jpg)
Economics Dictate Scale Out, Not Up
Cheap, commodity hardware argues for spreading load across multiple servers
Relational Databases were not designed to be run on clusters (shared disk subsystem)
![Page 21: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/21.jpg)
Wind up Building a Distributed System
![Page 22: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/22.jpg)
Can the relational database scale?
![Page 23: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/23.jpg)
Traditionally, focus was on optimizing specific problems
![Page 24: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/24.jpg)
Optimize Insert/Update or Read?
Data intensive relational applications:
  frequent small read / writes
  large size reads, but infrequent writes
Problems:
  Heavy workloads with frequent writes
  Scanning over large indices for queries
  Dirty reads can mean inconsistent data
![Page 25: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/25.jpg)
What does it mean to scale?
Large Number of Users
Geographic Distribution
Huge Amounts of Data
![Page 26: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/26.jpg)
To Scale a Distributed System Focus on Data, Not Just Computation
![Page 27: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/27.jpg)
CAP Theorem
Can Have Any Two
Eric Brewer
UC Berkeley, Founder Inktomi
Consistency
Availability
Tolerance to Network Partitioning
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
![Page 28: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/28.jpg)
Consistency and Availability
Single site Database
Database Cluster
LDAP
Two phase commit
Validate Cache
![Page 29: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/29.jpg)
Consistency and Partitioning
Distributed Database
Distributed Locking
Pessimistic Locking
Minority Partitions invalid
![Page 30: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/30.jpg)
Availability and Partitioning
Forfeit Consistency
Google Bigtable
Amazon SimpleDB
Azure Storage Tables
Optimistic Locking
Can Denormalize
![Page 31: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/31.jpg)
CAP Does Not Imply:
Never give up on Durability
Atomicity within a Partition
Inconsistency should be the exception
Partition Everywhere
No ACID within a Partition
Give up on Declarative Languages such as SQL
![Page 32: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/32.jpg)
Then…
If we give up Consistency, how do we Partition?
If we Partition how do we recover system invariants?
![Page 33: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/33.jpg)
Classic Ways to Partition
![Page 34: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/34.jpg)
Distributed Objects
Distributed Objects Fail
  Separate Address Space
  Disparate Lifetimes
  Location is Not Transparent
RPC Model Fails
  Cannot Hide Network
![Page 35: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/35.jpg)
Distributed Transactions
Relational Model works with single node / cluster
  Complexity of relations
  Query plans with hundreds of options which query analyzer evaluates at runtime
  Normalization
  ACID Transactions
Quick hardware scale up difficult
Two Phase Commit works with infinite time
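A toy sketch of two-phase commit, to show where the time dependence comes from. The `Participant` class and `two_phase_commit` function are hypothetical names and model no real database API. Failures and timeouts are reduced to a single `can_commit` flag.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote, then hold locks until the coordinator decides.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list[Participant]) -> bool:
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect every vote
    decision = all(votes)
    for p in participants:                       # phase 2: broadcast the decision
        if decision:
            p.commit()
        else:
            p.abort()
    return decision
```

The coordinator cannot decide until every vote has arrived, so one slow or unreachable participant stalls all the others while they hold their locks.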
![Page 36: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/36.jpg)
Better Ways to Partition
Non-Relational Approach
  Key Value / Tuple Store
  Document Store
  Column Family Store
  Graph Store
Relational Approach
  Sharding
NewSQL
![Page 37: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/37.jpg)
For Better Partitioning, Look at Data Model
![Page 38: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/38.jpg)
Relational: Given the structure of the data, what kind of questions can I ask?
Non Relational: Given the questions I want to ask, how do I structure the data?
![Page 39: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/39.jpg)
Model Application Specific Questions
![Page 40: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/40.jpg)
The aggregate is the unit of atomicity in a NoSQL Data Model
![Page 41: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/41.jpg)
Relational vs. Aggregate
![Page 42: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/42.jpg)
Prioritized Query Restrictions
1. How many tickets are left for an event? (date, location, event)
2. What events occur on which date? (date, artist, location)
3. When is a particular artist coming to town? (artist, location)
4. When can I get a ticket for a type of event? (genre)
5. Which artists are coming to town? (artist, location)
![Page 43: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/43.jpg)
Query Analysis
Most common combination: artist or date / location
Most common query: event / date / location
Partition based on location or venue
Allows for geographic sensitivity
Partitioning may or may not imply denormalization
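One way to picture the result of this query analysis is to partition event aggregates by location and key each aggregate by date and event. The sketch uses plain Python dictionaries as stand-ins for partitions. All field names and sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical event aggregates; field names are illustrative only.
events = [
    {"event": "Jazz Night", "artist": "Ada", "genre": "jazz",
     "date": "2014-03-01", "location": "Boston", "tickets_left": 12},
    {"event": "Rock Fest", "artist": "Grace", "genre": "rock",
     "date": "2014-03-01", "location": "New York", "tickets_left": 0},
    {"event": "Jazz Brunch", "artist": "Ada", "genre": "jazz",
     "date": "2014-03-02", "location": "Boston", "tickets_left": 3},
]

# Partition by location, then key each aggregate by (date, event),
# matching the most common query: event / date / location.
partitions = defaultdict(dict)
for e in events:
    partitions[e["location"]][(e["date"], e["event"])] = e


def tickets_left(location: str, date: str, event: str) -> int:
    """Query 1: touches exactly one partition and one aggregate."""
    return partitions[location][(date, event)]["tickets_left"]


def artists_in_town(location: str) -> set[str]:
    """Query 5: a scan confined to a single partition."""
    return {e["artist"] for e in partitions[location].values()}
```

The lower-priority genre query would have to visit every partition, which is the cost accepted for keeping the common queries local.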
![Page 44: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/44.jpg)
Each NoSQL Data Model Treats Aggregates Differently
![Page 45: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/45.jpg)
In general…
Code has integrity constraints
Code handles joined queries
No standard among vendors (lock in)
![Page 46: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/46.jpg)
Key-Value treats the aggregate as opaque
Might have an opaque set of attributes
Key is the index to the aggregate
Ordered Key-Value allows for range queries
Only the application knows the schema
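A minimal in-memory sketch of an ordered key-value store. `OrderedKVStore` is a hypothetical class and not any vendor's API. The store sees only keys and opaque bytes, and the sorted key list is what makes range queries possible.

```python
import bisect


class OrderedKVStore:
    """Toy ordered key-value store; values are opaque bytes to the store."""

    def __init__(self) -> None:
        self._keys: list[str] = []          # kept sorted for range scans
        self._values: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key: str) -> bytes:
        return self._values[key]

    def range(self, start: str, end: str) -> list[tuple[str, bytes]]:
        """All pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]
```

Decoding a value, for example from JSON, is left entirely to the application, which is the sense in which only the application knows the schema.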
![Page 47: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/47.jpg)
Column Family is a Two Level Aggregate
Keys are first level
Aggregates are the second level
Aggregate is composed of other aggregates
Reads are common, Writes rare
![Page 48: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/48.jpg)
Column Family Data Model (Cassandra)
[Diagram: a Row Key identifies a Super Column Family, which contains Super Column 1 and Super Column 2. Each super column holds columns (Column 1 through Column 4) with their values (Value 1 through Value 4).]
![Page 49: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/49.jpg)
Example
[Figure: example data labeled with Key, Super Column Family, Super Column, and Column, illustrating a Flexible Schema]
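The nesting in this data model can be approximated with nested dictionaries. This is a sketch of the shape only, not Cassandra's API. The row keys are hypothetical, and the assignment of columns to super columns is an assumption, since the diagram labels do not make it explicit.

```python
# Hypothetical data; the nesting mirrors the slide labels:
# row key -> super column -> column -> value
super_column_family = {
    "row-key-1": {
        "Super Column 1": {"Column 1": "Value 1", "Column 2": "Value 2"},
        "Super Column 2": {"Column 3": "Value 3", "Column 4": "Value 4"},
    },
    # Flexible schema: another row may carry entirely different columns.
    "row-key-2": {
        "Super Column 1": {"Column 9": "Value 9"},
    },
}


def read_column(row_key: str, super_column: str, column: str) -> str:
    """Walk the two aggregate levels below the row key."""
    return super_column_family[row_key][super_column][column]
```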
![Page 50: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/50.jpg)
Document Database has aggregate of arbitrary complexity with an index on attribute data.
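A toy sketch of the idea: documents may nest arbitrarily, while a secondary index on chosen attributes lets the store find them without scanning. `DocumentStore` is a hypothetical class and not a real product's API. For simplicity it indexes only top-level attributes whose values are hashable.

```python
from collections import defaultdict


class DocumentStore:
    """Toy document store with secondary indexes on top-level attributes."""

    def __init__(self, indexed_fields: list[str]) -> None:
        self._docs: dict[str, dict] = {}
        self._indexed = list(indexed_fields)
        # (field, value) -> ids of the documents holding that value
        self._index: dict[tuple, set[str]] = defaultdict(set)

    def _entries(self, doc: dict) -> list[tuple]:
        return [(f, doc[f]) for f in self._indexed if f in doc]

    def put(self, doc_id: str, doc: dict) -> None:
        old = self._docs.get(doc_id)
        if old is not None:                      # drop stale index entries
            for entry in self._entries(old):
                self._index[entry].discard(doc_id)
        self._docs[doc_id] = doc
        for entry in self._entries(doc):
            self._index[entry].add(doc_id)

    def find(self, field: str, value) -> list[dict]:
        ids = self._index.get((field, value), set())
        return [self._docs[i] for i in sorted(ids)]
```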
![Page 51: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/51.jpg)
Mechanics of Relational Database Partitioning
![Page 52: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/52.jpg)
Find Independent Units of Data
![Page 53: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/53.jpg)
Separate Transactions From Queries
Read
Create
Update
Delete
![Page 54: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/54.jpg)
Transactional Units Across Databases
[Diagram: a Partitioning Function splits the key range A-Z across three databases: A-H, H-P, and P-Z]
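A range-based partitioning function for the A-H, H-P, P-Z split might look like the sketch below. The database names are hypothetical. Treating H and P as the start of the next range is an assumption, since the ranges in the diagram overlap at their boundaries.

```python
import bisect

# Upper-exclusive boundaries for the ranges A-H, H-P, P-Z.
# Assumption: H and P belong to the range that starts with them.
BOUNDARIES = ["H", "P"]
DATABASES = ["db_a_h", "db_h_p", "db_p_z"]  # hypothetical database names


def partition_for(name: str) -> str:
    """Map a sharding property (here, a name) to a database."""
    first = name[:1].upper()
    return DATABASES[bisect.bisect_right(BOUNDARIES, first)]
```

Range partitioning keeps neighboring keys together, which helps range queries but can leave one database busier than the others if the keys are unevenly distributed.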
![Page 55: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/55.jpg)
Partitioning Mechanisms
Horizontal Partitioning
  Divide table rows across databases
Vertical Partitioning
  Divide table columns across databases
  Different tables in different databases
  Reference data can be copied
  Queries scan less data
![Page 56: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/56.jpg)
Horizontal Partitioning
Each table contains identical columns
Data is partitioned into different databases.
Each part is referred to as a shard.
Table is a single logical entity for updates and queries
Indices for a shard must be in the same shard
Sharding strategy based on use or query patterns
![Page 57: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/57.jpg)
Implementing Horizontal Partitions
Function that converts sharding property into a database location
Primary keys unique across all shards
  Shards hand out distinct ranges
  Shard id is part of primary key
  Pool hands out unique identifiers
No secondary keys across shards
No distributed transactions across databases
May need to UNION query results
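The pieces listed on this slide can be sketched with in-memory shards. The sketch shows a hash-based location function, primary keys that embed the shard id, and an application-side UNION across shards. The shard count, class, and function names are all hypothetical. A CRC32 hash stands in for whatever partitioning function a real system would use.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(sharding_property: str) -> int:
    """Stable hash; Python's built-in hash() of a str varies between processes."""
    return zlib.crc32(sharding_property.encode("utf-8")) % NUM_SHARDS


class Shard:
    def __init__(self, shard_id: int) -> None:
        self.shard_id = shard_id
        self.rows: dict[tuple[int, int], dict] = {}
        self._next_local_id = 0

    def insert(self, row: dict) -> tuple[int, int]:
        # Primary key = (shard id, local counter): unique across all shards.
        key = (self.shard_id, self._next_local_id)
        self._next_local_id += 1
        self.rows[key] = row
        return key


shards = [Shard(i) for i in range(NUM_SHARDS)]


def insert(row: dict, sharding_column: str) -> tuple[int, int]:
    """Route the row to the shard chosen by its sharding property."""
    return shards[shard_for(row[sharding_column])].insert(row)


def query_all(predicate) -> list[dict]:
    """Fan out to every shard and UNION the partial results in the application."""
    return [row for s in shards for row in s.rows.values() if predicate(row)]
```

A query that includes the sharding property can go straight to one shard. Only queries without it pay for the fan-out.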
![Page 58: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/58.jpg)
Vertical Partitioning
Divide table columns across databases
Primary key identical for a given "row"
Data may or may not be normalized
A join across the partitions recreates the "row"
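A small sketch of recreating a "row" from two vertical partitions. Each dictionary stands in for a separate database. The database names, columns, and values are hypothetical.

```python
# Two hypothetical databases, each holding a subset of the columns,
# keyed by the same primary key.
profile_db = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
billing_db = {1: {"card_last4": "4242"}, 2: {"card_last4": "1881"}}


def recreate_row(primary_key: int) -> dict:
    """Application-side join across partitions on the shared primary key."""
    return {"id": primary_key, **profile_db[primary_key], **billing_db[primary_key]}
```

A query that needs only the name never touches the billing partition, which is how vertical partitioning reduces the data scanned.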
![Page 59: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/59.jpg)
Vertical Partitioning Strategy
Columns used in different queries go in different partitions
Different business processes "own" a table.
Leads to service oriented approach
Design business processes to avoid cross table joins
Transactions within service boundary
![Page 60: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/60.jpg)
Implementing Vertical Partitions
Primary or foreign keys may be used to recreate the row
No secondary keys across databases
Secondary indices in different partitions might diverge
Normalize columns not frequently used
No distributed transactions
![Page 61: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/61.jpg)
NewSQL
![Page 62: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/62.jpg)
New Relational Database Architectures
Examples:
In-memory databases
Google Spanner
![Page 63: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/63.jpg)
In-memory Data Model
equivalent to relational
short lived transactions
index look ups (no table scans)
repeated queries with different parameters
![Page 64: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/64.jpg)
Google Spanner
Globally distributed relational database
Synchronizes with atomic and GPS clocks
Uses Paxos protocol for consensus
![Page 65: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/65.jpg)
Availability or Consistency?
![Page 66: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/66.jpg)
What is the Cost of an Apology?
Amazon
Airline reservations
Stock Trades
Deposit of a Bank Check
Deleting a photo from Flickr or Facebook
![Page 67: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/67.jpg)
Sometimes the cost is too high
Authentication
SAML tokens expire
Launching a nuclear weapon
![Page 68: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/68.jpg)
Businesses Apologize Anyway
Vendor drops the last crystal vase
Check bounces
Double-entry bookkeeping requires compensation
  dates to at least the 13th century
Eventually make consistent (partition healing)
![Page 69: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/69.jpg)
Software State ≠ State of the World
Software approximates the state of the world
Best guess possible
Could be wrong
Other computers might disagree
![Page 70: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/70.jpg)
How consistent?
Business Decision
What is the cost to get it absolutely right?
What is the cost of lost business?
Computers can remember their guesses
Can replicate to share guesses
May be cheaper to forget, and reconcile later
![Page 71: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/71.jpg)
Design For Eventual Consistency
Decouple unrelated application functionality
Focus on atomic or invariant business operations, not database reads or writes.
No distributed transactions
Asynchronous processing
![Page 72: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/72.jpg)
Eventual Consistency
Different computations might come to different conclusions
Define message based workflows for ultimate reconciliation and replication of results
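A sketch of reconciliation after a partition heals, using the scenario where two replicas each sell the last item. The `Order` type and `reconcile` function are hypothetical. The rule that earlier orders win is an assumed business policy chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    timestamp: float
    quantity: int


def reconcile(stock: int, *replica_logs: list[Order]) -> tuple[list[Order], list[Order]]:
    """Merge order logs once the partition heals.

    Returns (fulfilled, to_apologize). Earlier orders win; this
    policy is an assumption made for the sketch.
    """
    # The set removes orders that were already replicated to both sides.
    merged = sorted({o for log in replica_logs for o in log},
                    key=lambda o: (o.timestamp, o.order_id))
    fulfilled, to_apologize = [], []
    for order in merged:
        if order.quantity <= stock:
            stock -= order.quantity
            fulfilled.append(order)
        else:
            to_apologize.append(order)  # compensate: refund, back-order, ...
    return fulfilled, to_apologize
```

Each replica stayed available and accepted an order during the partition. The inconsistency is resolved afterward by a compensating action instead of being prevented by a lock.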
![Page 73: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/73.jpg)
Not the Whole Story
Databases are not the best integration technology
Object-Relational Mismatch
Certain problems match other data models
![Page 74: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/74.jpg)
Services, not Data, Outlast Implementation
![Page 75: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/75.jpg)
Application or Service Specific Databases
![Page 76: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/76.jpg)
Case Study: Amazon Four Day Outage
![Page 77: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/77.jpg)
Facts
April 21, 2011
One Day of Stabilization, Three Days of Recovery
Problems: EC2, EBS, Relational Database Service
Affected: Quora, Hootsuite, Foursquare, Reddit
Unaffected: Netflix, Twilio
![Page 78: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/78.jpg)
Netflix Explicitly Architected For Failure
![Page 79: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/79.jpg)
Despite more errors and higher latency, there was no increase in customer service calls or in customers unable to find or start movies.
![Page 80: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/80.jpg)
Key Architectural Decisions
Stateless Services
Data stored across isolation zones
Could switch to hot standby
Had Excess Capacity (N + 1)
Handle large spikes or transient failures
Used relational databases only where needed
Could partition data
Degraded Gracefully
![Page 81: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/81.jpg)
Data Architecture
Separate databases: User, Accounts, Feedback, Transactions
Split by primary access path
No business logic in database
CPU-intensive work in service tier: referential integrity, joins, sorting
Avoids deadlock
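The sketch below illustrates this style of data architecture under stated assumptions: the shard count, the CRC32 routing function, and the in-memory dictionaries and lists standing in for separate databases are all hypothetical. Data is split by its primary access path (here, the user ID), while the referential integrity check, the join, and the sort run in service code rather than in the database.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count

# Hypothetical stand-ins for separate user and transaction databases.
user_shards = [dict() for _ in range(NUM_SHARDS)]
transaction_shards = [list() for _ in range(NUM_SHARDS)]


def shard_for(user_id):
    """Route by the primary access path: everything is looked up by user."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


def add_user(user_id, name):
    user_shards[shard_for(user_id)][user_id] = {"name": name}


def add_transaction(user_id, amount):
    shard = shard_for(user_id)
    # Referential integrity enforced in the service tier, not by a foreign key.
    if user_id not in user_shards[shard]:
        raise KeyError(f"unknown user: {user_id}")
    transaction_shards[shard].append({"user_id": user_id, "amount": amount})


def statement(user_id):
    """Join and sort in service code; the stores only do keyed reads."""
    shard = shard_for(user_id)
    user = user_shards[shard][user_id]
    amounts = sorted(
        (t["amount"] for t in transaction_shards[shard] if t["user_id"] == user_id),
        reverse=True,
    )
    return {"name": user["name"], "amounts": amounts}


add_user("u1", "Ann")
add_transaction("u1", 5)
add_transaction("u1", 20)
result = statement("u1")
```

Because a user's rows all live on one shard, no request needs locks on more than one store, which is one reading of why this layout avoids deadlock.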
![Page 82: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/82.jpg)
Degraded Gracefully
Fail Fast, Aggressive Timeouts
Can degrade to lower quality service
Example: no personalized movie list, but users can still get the list of available movies
Non-critical features can be removed
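A minimal sketch of the fail-fast pattern, assuming a hypothetical `personalized_list` call whose latency is simulated with `time.sleep`; the timeout value and movie titles are illustrative and this is not Netflix's actual code. The caller waits only briefly for the personalized result and otherwise falls back to the generic list of available movies.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

executor = ThreadPoolExecutor(max_workers=4)
AVAILABLE_MOVIES = ["Movie A", "Movie B", "Movie C"]  # hypothetical catalog


def personalized_list(user_id, delay):
    """Stand-in for a recommendation service; `delay` simulates its latency."""
    time.sleep(delay)
    return ["Movie C", "Movie A"]


def movie_list(user_id, delay, timeout=0.1):
    """Fail fast: wait at most `timeout` seconds, then serve the fallback."""
    future = executor.submit(personalized_list, user_id, delay)
    try:
        return {"personalized": True, "movies": future.result(timeout=timeout)}
    except FutureTimeout:
        # Best effort only: a call that is already running cannot be interrupted.
        future.cancel()
        return {"personalized": False, "movies": AVAILABLE_MOVIES}


healthy = movie_list("u1", delay=0.0)    # dependency answers in time
degraded = movie_list("u1", delay=0.5)   # dependency is slow: lower quality, still useful
```

The degraded response costs the user personalization but not the ability to find a movie, which is the trade the slide describes.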
![Page 83: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/83.jpg)
Suggested Reading
"Life Beyond Distributed Transactions: An Apostate's View" by Pat Helland
![Page 84: Do Relational Databases Belong In the Cloudreliablesoftware.com/presentations/Do Relational... · 2014. 1. 6. · Relational Databases were not designed to be run on clusters (shared](https://reader033.vdocument.in/reader033/viewer/2022060903/609f26beaf843d006e267822/html5/thumbnails/84.jpg)
Conclusions
Scalability means Users, Bandwidth, Geography
Partitioning Changes the Data Model
Service Orientation Changes the Data Model
Design for Eventual Consistency
If there is no need for scalability or service orientation, the Relational Model works
A Unified Data Model makes it hard to meet rapid change