cs165 section niv dayan - harvard universitydaslab.seas.harvard.edu/.../s11_acid_intro.pdf · nosql...
TRANSCRIPT
![Page 1: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/1.jpg)
CS165 Section
Niv Dayan
Demystifying the Zoo of Contemporary Database Systems
1
![Page 2: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/2.jpg)
Introduction
1980 1990 2000 2010
2Time
![Page 3: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/3.jpg)
Introduction
1980 1990 2000 2010
3Time
![Page 4: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/4.jpg)
Introduction
• Different architectures
– Performance
– Data integrity
– User interface
4
![Page 5: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/5.jpg)
Introduction
• Different architectures
– Performance
– Data integrity
– User interface
5
![Page 6: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/6.jpg)
Introduction
• Theme: any trend in database technology can be traced to a trend in hardware
• Claim: The new database technologies are adaptations to changes in hardware
Database designer Hardware
6
![Page 7: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/7.jpg)
DBHistory
7
![Page 8: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/8.jpg)
History
• 3 goals of database design
– Speed
– Affordability
– Resilience to system failure
• How you achieve them depends on hardware
8
![Page 9: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/9.jpg)
History
• Two storage media:
Main Memory Fast, expensive, volatile
DiskSlow, cheap, non-volatile
9
![Page 10: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/10.jpg)
History
• How should data be stored across them?
• Main memory is volatile and expensive
Frequently accessed data is here
All data is here
10
![Page 11: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/11.jpg)
History
• To make a system fast, address bottleneck
• Disk is extremely slow
Fetch Retrieve
Main memory
Disk11
![Page 12: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/12.jpg)
History
• To make a system fast, address bottleneck
• Disk is extremely slow
Fetch Retrieve
Pluto
Earth
Fetch Retrieve
Main memory
Disk12
![Page 13: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/13.jpg)
History
• Why so slow?
• Two questions:– Question 1: How to minimize disk access?
– Question 2: What to do during a disk access?
Disk hand moving
13
![Page 14: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/14.jpg)
History
• Problem: How to minimize disk accesses?
• Solution: Store data that is frequently co-accessed at the same physical location
• Consolidates many disk accesses to one
Data item 1
Data item 2
14
![Page 15: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/15.jpg)
ID Name Balance
1 Bob 1002 Sara 1003 Will 450
History
• Example: Bank
• Co-locate all information about each customer
• Customer Sara deposits $100
Main Memory
Disk
Database
2 Sara 100
Add 1002 Sara 200
2 disk accesses, since data about sara is co-located
15
![Page 16: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/16.jpg)
History
• What to do during a disk access?
• Start running the next operation(s)
• Improves performance
• But data can get corrupted16
![Page 17: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/17.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 0
17
![Page 18: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/18.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 0
balance = 0
Fetch
Retrieve
18
![Page 19: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/19.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 0
balance = 0
19
![Page 20: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/20.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 0
balance = 0 balance = 0
Fetch
Retrieve
20
![Page 21: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/21.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 0
balance = 0 balance = 0
21
![Page 22: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/22.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 0
balance = 100 balance = 100
22
![Page 23: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/23.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 100
balance = 100 balance = 100
write write
23
![Page 24: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/24.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 100
24
![Page 25: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/25.jpg)
History
• A couple, Bob and Sara, share a bank account
• Both deposit $100 at same time
balance = 100
Account balance should be 200!Bob and Sara lost money.
25
![Page 26: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/26.jpg)
History
• Question: how to achieve concurrency while maintaining data integrity?
• Insight: transactions can be concurrent, as long as they don’t modify the same data
• Solution: locking– Bob locks data, modifies it, releases lock
– Sara waits until lock is released
• Downside: – transactions may need to wait for locks.
26
![Page 27: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/27.jpg)
History
• 3 goals of database design
– Speed
– Affordability
– Resilience to system failure
27
![Page 28: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/28.jpg)
History
• 3 goals of database design
– Speed
–Affordability
– Resilience to system failure
28
![Page 29: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/29.jpg)
History
• Disk was cheap, but not so cheap
• 1 gigabyte for $10000 in 1980
• Avoid storing replicas of same data
ID name account-ID balance
1 Bob 1 100
2 Sara 1 100
3 Trudy 2 450
29
![Page 30: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/30.jpg)
History
• Solution: “Normalization”. Break tables.
ID name account-ID balance
1 Bob 1 100
2 Sara 1 100
3 Trudy 2 450ID name account-ID
1 Bob 1
2 Sara 1
3 Trudy 2
ID balance
1 100
2 450
Customers Accounts
• Bonus: Easier to maintain data integrity 30
![Page 31: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/31.jpg)
History
• Normalization: – Saves storage space
– Easier to maintain data integrity
• Downside: reads are more expensive
– Need to join tables
ID name account-ID
1 Bob 1
2 Sara 1
3 Trudy 2
ID balance
1 100
2 450
Customers Accounts
31
![Page 32: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/32.jpg)
History
• Data is decomposed accross tables
• Query Language: SQL
– select balance from Customers c, Accounts a where c.account-ID = a.ID and c.name = “Bob”
ID name account-ID
1 Bob 1
2 Sara 1
3 Trudy 2
ID balance
1 100
2 450
Customers Accounts
32
![Page 33: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/33.jpg)
History
• 3 goals of database design
– Speed
–Affordability
– Resilience to system failure
33
![Page 34: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/34.jpg)
History
• 3 goals of database design
– Speed
– Affordability
–Resilience to system failure
34
![Page 35: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/35.jpg)
History
• Many things can go wrong
– Power failure
– Hardware failure
– Natural disaster
• Data is precious (e.g. bank)
• Provide recovery mechanism
35
![Page 36: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/36.jpg)
Sara’sbalance = 100Sara’sbalance = 100
History
• Example: Sara transfers $100 to Anna
• Power stops in the middle
Anna’s balance = 450
Sara’sbalance = 0
36
![Page 37: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/37.jpg)
Sara’sbalance = 100
History
• Example: Sara transfers $100 to Anna
• Power stops in the middle
Anna’s balance = 450
Sara’sbalance = 0
37
![Page 38: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/38.jpg)
Anna’s balance = 450Anna’sbalance = 450
Sara’sbalance = 0
History
• Example: Sara transfers $100 to Anna
• Power stops in the middle
Anna’sbalance = 550
38
![Page 39: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/39.jpg)
Anna’s balance = 450Anna’s balance = 450
Sara’sbalance = 0
History
• Example: Sara transfers $100 to Anna
• Power stops in the middle
Anna’sbalance = 550
At this point, power fails
39
![Page 40: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/40.jpg)
History
• Transaction: a sequence of operations all takes place, or none take place.
• Transactions should be atomic
40
![Page 41: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/41.jpg)
History
• Problem: how to guarantee atomicity?
• Solution: use a log (on disk )
• All data changes are recorded in the log
• After power failure, examine log
• Undo changes by unfinished transactions
Data Log
41
![Page 42: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/42.jpg)
History
• Data integrity– Concurrency (fix with locking)
– System failure (fix with logging)
• ACID– Atomicity
– Consistency
– Isolation
– Durability
42
![Page 43: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/43.jpg)
History
• Summary– Speed
– Affordability
– Resilience to system failure
• Relational databases:– Normalize data into multiple tables
– ACID (locking & logging)
– SQL
• Design decisions are motivated by hardware43
![Page 44: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/44.jpg)
Today
44
![Page 45: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/45.jpg)
Today
• What changed in hardware?
• How does it affect database design?
45
![Page 46: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/46.jpg)
Today
• Disk is 107 times cheaper
• Main memory is 106 times cheaper
46
![Page 47: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/47.jpg)
Today
• Disk is now dirt cheap
• Organizations keep all historical data
• Business intelligence
• E.g. Amazon
– revenues from product X on date Y
– which products are bought together
47
![Page 48: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/48.jpg)
Today
• Traditional system architecture:
End usersTransactional
DatabaseBusiness
Intelligence
transactionsAnalytical
queries
48
![Page 49: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/49.jpg)
Today
• Problem:
–Analytical queries are expensive • Touch a lot of data
• Disk access
• Locks
–They slow down transactions.
–End-users wait longer
49
![Page 50: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/50.jpg)
Today
• Solution: split database
End usersTransactional
DatabaseData
warehouseBusiness
Intelligence
50
![Page 51: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/51.jpg)
Today
• Different workloads
• Different internal design
Transactional Database Data
warehouse51
![Page 52: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/52.jpg)
Today
• Example analytical queries
– How long is delivery? (2 columns, all rows)
– Revenue from product X? (2 columns, all rows)
• Problem:
– Data is stored row by row
orderid
custid
productid
price orderdate
receiptdate
priority status comment
... ... ... ... ... ... ... ... ...
52
![Page 53: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/53.jpg)
• Solution: column-store – Each column is stored separately
– Good for analytical queries
– Changes entire architecture
– Examples: Vertica, Vectorwise, Greenplum, etc.
orderid
custid
status price orderdate
receiptdate
priority clerk comment
... ... ... ... ... ... ... ... ...
Today
53
![Page 54: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/54.jpg)
Today
• How are transactional databases affected by hardware changes?
Transactional Database
54
![Page 55: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/55.jpg)
Today
• Main memory is cheaper
• Terabytes are affordable
• Enough to store all transactional data
• E.g. Amazon
– Products list
– User accounts
55
![Page 56: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/56.jpg)
Today
• Main memory was expensive
• Now it’s cheaper
Data Log
56
![Page 57: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/57.jpg)
Today
• Transactional databases are main memory databases
• Bottleneck used to be disk access
• The new bottleneck is ACID (logging, locking)
Useful work
ACID
Disk access
Useful work
ACID
57
![Page 58: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/58.jpg)
Today• More challenges
• Due to internet, 100% availability is key
• Data is replicated
…
…
58
![Page 59: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/59.jpg)
Today
• Joins become more expensive
ID name account-ID
1 Bob 1
2 Sara 1
3 Trudy 2
ID balance
1 100
2 450
Customers Accounts
Joins
59
![Page 60: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/60.jpg)
Today
• Replication and locks become more expensive
Accounts Accounts
Replication, locks
ID balance
1 100
2 450
ID balance
1 100
2 450
60
![Page 61: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/61.jpg)
Today
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
61
![Page 62: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/62.jpg)
Today
• NoSQL and NewSQL address these
–NoSQL simplifies
–NewSQL engineers
62
![Page 63: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/63.jpg)
NoSQL(MongoDB)
63
![Page 64: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/64.jpg)
NoSQL(MongoDB)
64http://db-engines.com/en/ranking
![Page 65: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/65.jpg)
NoSQL
• Name popularized in 2009
• Conference on “open source distributed non-relational databases”
• NoSQL was a hashtag
65
![Page 66: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/66.jpg)
NoSQL
• Different types
– Document stores
– Column-oriented
– Key-value-stores
– Graph databases
Similar
Different
66
![Page 67: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/67.jpg)
NoSQL
• MongoDB - Main decisions
1. No joins
• Aggregate related data into “documents”• Reduces network traffic
• Data modeling is harder
2. No ACID
• Faster
• Concurrency & system failure can corrupt data
67
![Page 68: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/68.jpg)
NoSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
68
![Page 69: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/69.jpg)
NoSQL
• To avoid joins, data is de-normalized
ID name account-ID
1 Bob 1
2 Sara 1
3 Trudy 2
ID balance
1 100
2 450
Customers Accounts
ID name account-ID balance
1 Bob 1 100
2 Sara 1 100
3 Trudy 2 450 69
![Page 70: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/70.jpg)
NoSQL
• In MongoDB
• db.customers.find(name:“Sara”)
Collection of customer Documents
{ name: “Bob”,account-ID: 1,balance: 100 }
{name: “Sara”,account-ID: 1,balance: 100 }
{ name: “Trudy”,account-ID: 2,balance: 450 }
70
![Page 71: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/71.jpg)
NoSQL
• Documents are flexible
{ name: “Bob”,account-ID: 1,balance: 100, favorite-color: “red”credit-score: 3.0
}
{name: “Sara”,account-ID: 1,balance: 100 hobbies: [“rowing”, “running”]
}
71
![Page 72: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/72.jpg)
NoSQL
• Main point: no need for joins
• All related data is in one place
72
{ name: “Bob”,account-ID: 1,balance: 100, favorite-color: “red”credit-score: 3.0
}
![Page 73: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/73.jpg)
NoSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
73
![Page 74: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/74.jpg)
NoSQL
• MongoDB does not lock
• Recall Bob and Sara
• Deposit $100 at same time to shared account
• Overwrite each other’s update
No general way to prevent this
74
![Page 75: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/75.jpg)
NoSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
75
![Page 76: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/76.jpg)
NoSQL
• Eventual consistency
• Different operation order across replicas
• E.g. concurrent addition and multiplication
DepositAdd 100
Interest Multiply by 1.1
{…balance: 100
... }
{…balance: 100
... }76
![Page 77: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/77.jpg)
NoSQL
• Eventual consistency
• Different operation order across replicas
• E.g. concurrent addition and multiplication
DepositAdd 100
Interest Multiply by 1.1
{…balance: 100
... }
{…balance: 100
... }77
![Page 78: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/78.jpg)
NoSQL
• Eventual consistency
• Different operation order across replicas
• E.g. concurrent addition and multiplication
DepositAdd 100
Interest Multiply by 1.1
Inconsistent replicas
{…balance: 220
... }
{…balance: 210
... }78
![Page 79: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/79.jpg)
NoSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
79
![Page 80: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/80.jpg)
NoSQL
• When to use MongoDB?
• Non-interacting entities
– No sharing (e.g. bank account)
– No exchanging (e.g. money transfers)
• Commutative operations on data
• You need a flexible data model
80
![Page 81: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/81.jpg)
NewSQL
81
![Page 82: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/82.jpg)
NewSQL
• ACID & good performance
• Redesign internal architecture.
82
![Page 83: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/83.jpg)
NewSQL
• Data is normalized into multiple tables
• Tables partitioned and replicated across machines
ID name account-ID
1 Bob 1
2 Sara 1
3 Trudy 2
ID balance
1 100
2 450
Customers Accounts
83
![Page 84: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/84.jpg)
NewSQL
• Single machine bottlenecks:
• Multiple machines bottlenecks:
Replication, locks, joins
Logging & locking
84
![Page 85: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/85.jpg)
NewSQL
• History: concurrent transactions were introduced since disk was slow
• Today: Now all data is in main memory
• Transactions in main memory are fast
• Less need for concurrency
• VoltDB removes concurrency
• Thus, no need for locking
85
![Page 86: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/86.jpg)
NewSQL
• Recall Bob and Sara
• Deposit $100 at same time to shared account
• Overwrite each other’s update
In VoltDB, this cannot happen
86
![Page 87: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/87.jpg)
NewSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
87
![Page 88: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/88.jpg)
NewSQL
• History: log introduced for recovery
• Today: it takes too long to recover from log
• Instead, replicate data across machines
• If one machine fails, others continue working
• Simplifies logging
88
![Page 89: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/89.jpg)
NewSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
89
![Page 90: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/90.jpg)
NewSQL
• Try to avoid joins across machines
• Store data that is commonly accessed at same time on same machine
ID name account-ID
1 Bob 1
2 Sara 1
ID balance
1 100
Customers
ID name account-ID
3 Trudy 2
ID balance
2 450
Customers
AccountsAccounts
90
![Page 91: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/91.jpg)
NewSQL
ID name account-ID
1 Bob 1
2 Sara 1
ID balance
1 100
Customers
ID name account-ID
3 Trudy 2
ID balance
2 450
Customers
AccountsAccounts
• Alleviates problem
• Does not solve it (e.g. money transfer)91
![Page 92: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/92.jpg)
NewSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
92
![Page 93: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/93.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 100
ID balance
... 100
93
![Page 94: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/94.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 100
ID balance
... 100
94
![Page 95: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/95.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 200
ID balance
... 200
95
![Page 96: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/96.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 200
ID balance
... 200
96
![Page 97: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/97.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 200
ID balance
... 200
97
![Page 98: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/98.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 220
ID balance
... 220
98
![Page 99: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/99.jpg)
NewSQL
• Tables are replicated
• Enforce operation order across replicas
DepositAdd 100
Interest Multiply by 1.1
ID balance
... 220
ID balance
... 220
99
![Page 100: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/100.jpg)
NewSQL
• Single machine bottlenecks:
• Multiple machine bottlenecks:
Replication, locks, joins
Logging & locking
100
![Page 101: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/101.jpg)
NewSQL
• When to use VoltDB?
– run at scale
–You need 100% availability
–You need ACID
101
![Page 102: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/102.jpg)
Conclusion
102
![Page 103: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/103.jpg)
Conclusion
• Hardware is cheaper
103
![Page 104: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/104.jpg)
Conclusion
Transactional Database
Data warehouses
104Row-store Column-store
![Page 105: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/105.jpg)
Conclusion
• Single machine bottlenecks:
• Multiple machines bottlenecks:
Replication, locks, joins
Logging & locking
105
![Page 106: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/106.jpg)
Conclusion
• NoSQL adapts by simplifying
– No ACID
– No joins
• NewSQL adapts by reengineering
– ACID
– Removes concurrency
– Simplifies logging
– Smart but limited partitioning across servers
106
![Page 107: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/107.jpg)
Conclusion
• Disclaimer: there is much more
– Scientific databases:
– Time series databases:
– Graph databases
• Caveat: rapid changes
• But hopefully now you have reasoning tools
107
![Page 108: CS165 Section Niv Dayan - Harvard Universitydaslab.seas.harvard.edu/.../S11_ACID_intro.pdf · NoSQL • MongoDB - Main decisions 1. No joins • Aggregate related data into “documents”](https://reader036.vdocument.in/reader036/viewer/2022071213/60418d65330d9056fb190422/html5/thumbnails/108.jpg)
Conclusion
• Thanks!
108