(part 3) database programming

64
Database Programming (Part 3) Copyright © 2021 by Robert M. Dondero, Ph.D. Princeton University 1

Upload: others

Post on 05-Dec-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: (Part 3) Database Programming

Database Programming(Part 3)

Copyright © 2021 byRobert M. Dondero, Ph.D.

Princeton University

1

Page 2: (Part 3) Database Programming

Objectives

Continued from Part 2…

[see slide]

Objectives

• We will cover:– Databases (DBs) and database

management systems (DBMSs)…– With a focus on relational DBs and DBMSs…– With a focus on the SQLite DBMS…– With a focus on programming with SQLite

2

Page 3: (Part 3) Database Programming

Agenda

• Relational DB transactions: atomicity• Relational DB transactions: locking• Relational DB design

3

Page 4: (Part 3) Database Programming

DB Transactions: Atomicity

• Problem: DBMS must preserve DB consistency in the presence of HW/SW failures

4

Page 5: (Part 3) Database Programming

DB Transactions: Atomicity

• Example:– Customer 222 has purchased 1 copy of book

123– Must change ORDERS:

• Add 1 to quantity of appropriate row (20 -> 21)– Must change BOOKS:

• Subtract 1 from quantity of appropriate row (500 -> 499)

5

Page 6: (Part 3) Database Programming

One logical update consists of 2 UPDATE stmtsFailure occurs between execution of stmts

=> DB is corrupted

UPDATE orders SET quantity=quantity+1 WHERE isbn=123 AND custid=222;

Time HW/SW failure

DB Transactions: Atomicity

6

UPDATE books SET quantity=quantity-1 WHERE isbn=123;

Page 7: (Part 3) Database Programming

DB Transactions: Atomicity

• Solution: – DBMS must execute each logical DB update

atomically

• Atomically: either completely, or not at all

7

Page 8: (Part 3) Database Programming

DB Transactions: Atomicity

• Problem: How to implement atomicity?• Solution: Transactions

• Transaction– Defines an atomic unit of work– DBMS executes a transaction either

completely or not at all

8

Page 9: (Part 3) Database Programming

DB Transactions: Atomicity

• See purchase.py– The job:

• Accept isbn and custid as command-line arguments

• In ORDERS table increment quantity of row with given isbn and custid

• In BOOKS table decrement quantity of row with given isbn

– Note:• Two updates must be executed atomically

9

Page 10: (Part 3) Database Programming

DB Transactions: Atomicity

• See purchase.py (cont.)

10

Relational DB Transactions: Atomicity

[see slide]

Code notes:First update implicitly begins a transactionconnection.commit()

Commits and ends transactionconnection.rollback()

Rolls back and ends transaction

Page 11: (Part 3) Database Programming

DB Transactions: Atomicity

• Do SQLite transactions in Python really handle HW/SW errors?

• Do they really enforce atomicity?

11

Page 12: (Part 3) Database Programming

DB Transactions: Atomicity

• See recovery.py– The job (artificial):

• Repeatedly, 20 times…• In orders table increment quantity of row with isbn 123 and custid 222

• In books table decrement quantity of row with isbn 123

• “Failure” may occur between the updates• Is DB consistency preserved?

12

Page 13: (Part 3) Database Programming

Relational DB Transactions: Atomicity

Code notes:

“HW/SW failure” occurs at pseudo-random times between updatesIs DB consistency preserved?

[see slide]

500 + 20 == 483 + 37Provides evidence that DB consistency is preserved!SQLite transactions are atomic

DB Transactions: Atomicity

• See recovery.py (cont.)

13

Page 14: (Part 3) Database Programming

Agenda

• Relational DB transactions: atomicity• Relational DB transactions: locking• Relational DB design

14

Page 15: (Part 3) Database Programming

DB Transactions: Locking

• Problem: DBMS must preserve DB consistency in the presence of concurrent updates

15

Page 16: (Part 3) Database Programming

Increment ORDERS (21)Time

Process P1 Process P2

BEGIN

Decrement BOOKS (499)

DB Transactions: Locking

16

COMMIT

Increment ORDERS (21)

Decrement BOOKS (499)

One increment/decrement pair is lost!!!

Without locking:

Relational DB Transactions: Locking

Without locking, consider this sequence

[see slide]

P1 starts a transaction

P1 increments the quantity in ORDERS, thus changing it to 21But the increment occurs within a transaction, and so is cached

P2 increments the quantity in ORDERS, thus changing it to 21

P1 decrements the quantity in BOOKS, thus changing it to 499But the decrement occurs within a transaction, and so is cached

P2 decrements the quantity in BOOKS, thus changing it to 499

P1 ends/commits the transactionThe quantity in ORDERS is changed to 21The quantity in BOOKS is changed to 499

One increment/decrement pair is lost!

Page 17: (Part 3) Database Programming

DB Transactions: Locking

• Solution: Locking within transactions– Process P1 asks DBMS to execute

transaction involving update of row X• DBMS locks row X

– Concurrently, process P2 asks DBMS to update row X

• DBMS forces P2 to wait, or rejects P2’s request, until transaction completes

17

Page 18: (Part 3) Database Programming

Increment orders (21)

Time

Process P1 Process P2

BEGIN

Decrement books (499)

DB Transactions: Locking

18

COMMITIncrement orders (22)

Decrement books (498)

Process P2 updates are postponed or rejected

Updates disallowed

Relational DB Transactions: Locking

With locking, consider this similar sequence

[see slide]

P1 starts a transaction

During the transaction, P2’s attempts to update are disallowed (postponed or rejected)

P1 increments the quantity in ORDERS, thus changing it to 21But the increment occurs within a transaction, and so is cached

P1 decrements the quantity in BOOKS, thus changing it to 499But the decrement occurs within a transaction, and so is cached

P1 ends/commits the transactionThe quantity in ORDERS is changed to 21The quantity in BOOKS is changed to 499

P2 increments the quantity in ORDERS, thus changing it to 22

P2 decrements the quantity in BOOKS, thus changing it to 498

Page 19: (Part 3) Database Programming

No increments/decrements are lost!

Page 20: (Part 3) Database Programming

Relational DB Transactions: Locking

Important note:

Full-featured DBMSs (PostgreSQL, Oracle, SQLServer, MySQL, …), do row-level locking

SQLite does DB-level locking

A big reason why SQLite isn’t used in production applications

More precisely

SQLite often is used in production applications as a rich alternative to simple flat filesThunderbird email client uses a SQLite database

SQLite is not used in production applications as a weak alternative to a full-fledged DBMS (PostgreSQL, Oracle, etc.)

DB Transactions: Locking

• Important note:– PostgreSQL, Oracle, SQLServer, MySQL,...

• Row-level locking– SQLite

• DB-level locking

19

Page 21: (Part 3) Database Programming

DB Transactions: Locking

$ sqlite3 bookstore.sqlitesqlite> BEGIN;sqlite> UPDATE books SET quantity = 11111 WHERE isbn = 123;sqlite>

$ python purchase.py 123 222database is locked$

20

Window 1:

Window 2:

Noticeable delay

Page 22: (Part 3) Database Programming

DB Transactions: Locking

$ sqlite3 bookstore.sqlitesqlite> BEGIN;sqlite> UPDATE books SET quantity = 11111 WHERE isbn = 123;sqlite> COMMIT;sqlite>

$ python purchase.py 123 222database is locked$ python purchase.py 123 222Transaction committed.$

21

Window 1 (cont):

Window 2 (cont):

Page 23: (Part 3) Database Programming

Transaction Summary

• DBMSs use transactions to:– Recover from HW/SW errors

• Transactions implement atomicity– Handle concurrent updates

• Transactions implement locking

22

Page 24: (Part 3) Database Programming

Relational DB Design

So far:How to use a relational DB

Now:How to design a relational DB

Agenda

• Relational DB transactions: atomicity• Relational DB transactions: locking• Relational DB design

23

Page 25: (Part 3) Database Programming

Relational DB Design

• Relational DB normal forms– Rules that DBAs can apply to:

• Determine if DB design is wrong• Make it right

24

Relational DB Design

To design a proper relational DB, you must know about the relational DB normal forms

Developed by Codd

Rules that DBAs can apply to:Determine if DB design is wrongMake it right

There are many of themWe’ll cover the first 3…

Page 26: (Part 3) Database Programming

Relational DB Design: DB0

Let’s consider a series of “bookstore” relational databasesWe’ll start with a very flawed versionWe’ll incrementally apply normal formsThe final version in the series will be the database that we used in previous lectures

Consider DB0…

BOOKS:There is a book whose isbn is 123Whose title is “The Practice of Programming” Whose authors are “Kernighan” and “Pike”And there are 500 copies of it available in our warehouse

ORDERS:There is a customer whose custid is 222That customer is the Harvard bookstoreIt’s located at 1256 Mass Ave in Cambridge, MA 02138That customer ordered 20 copies of the book with isbn 123

BOOKS isbn title authors quantity123 The Practice of Programming {Kernighan, Pike} 500234 The C Programming Language {Kernighan, Ritchie} 800345 Algorithms in C {Sedgewick} 650

ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30

Relational DB Design: DB0

25

DB0:

Page 27: (Part 3) Database Programming

Relational DB Design: DB0

• Def: A table is in first normal form iff all underlying domains contain atomic values only– All modern relational DBMSs enforce first

normal form

26

Page 28: (Part 3) Database Programming

Relational DB Design: DB0

Recall: A table is in first normal form iff all underlying domains contain atomic values only

DB0 has cells that contain non-atomic values

ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30

Relational DB Design: DB0

27

BOOKS isbn title authors quantity123 The Practice of Programming {Kernighan, Pike} 500234 The C Programming Language {Kernighan, Ritchie} 800345 Algorithms in C {Sedgewick} 650

DB0 is not in first normal form

DB0:

Page 29: (Part 3) Database Programming

BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650

AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick

ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30

DB1:

Relational DB Design: DB1

28

Relational DB Design: DB1

The solution: split BOOKS into BOOKS and AUTHORS

Yields DB1

[see slide]

DB1 is in first normal form

Page 30: (Part 3) Database Programming

BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650

AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick

ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30

DB1:

Relational DB Design: DB1

29DB1 design seems wrong

Relational DB Design: DB1

[see slide]

Intuitively: Still not quite right

The location of the Harvard bookstore is stored redundantlySuppose the Harvard bookstore moves?

Page 31: (Part 3) Database Programming

Relational DB Design: DB1

• Somewhat informally...• Def: The primary key for a table is the

minimal set of columns that uniquely identifies any particular row of that table

30

Page 32: (Part 3) Database Programming

BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650

AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick

ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30

Primary keys in DB1:

Relational DB Design: DB1

31

Relational DB Design: DB1

[see slide]

In BOOKS the primary key is isbn

In ORDERS the primary key is isbn+custid

In AUTHORS the primary key is isbn+authorDegenerate case

Page 33: (Part 3) Database Programming

Relational DB Design: DB1

• Def: A column C2 of a table is functionally dependent on a column C1 iff, for each row in the table, the value of C1 determines the value of C2

• In DB1...

32

Page 34: (Part 3) Database Programming

AUTHORS

ORDERS

BOOKS

isbn

Relational DB Design: DB1

33

title

quantity authorisbn

quantity

custname

street

zipcode

city

statecustid

isbn

Functional dependencies in DB1:

Relational DB Design: DB1

[see slide]

BOOKS:title and quantity are functionally dependent upon isbn

isbn uniquely identifies a rowIf we know the isbn, then we can determine the title and quantity

AUTHORS: degenerate case

ORDERS:Quantity is functionally dependent upon isbn+custidCustname, street, and zipcode are functionally dependent upon custidCity and state are functionally dependent upon zipcode

Page 35: (Part 3) Database Programming

Relational DB Design: DB1

• Somewhat informally…• A table is in second normal form iff:

– It is in first normal form, and– No non-primary-key column is dependent on

any proper subset of the primary key

34

Page 36: (Part 3) Database Programming

Relational DB Design: DB1

Recall: A table is in second normal form iff:It is in first normal form, andNo non-primary-key column is dependent on any proper subset of the primary key

In ORDERS:Custname, street, and zipcode are functionally dependent upon a proper subset of the isbn+custid primary key

AUTHORS

ORDERS

BOOKS

isbn

Relational DB Design: DB1

35

title

quantity authorisbn

quantity

custname

street

zipcode

city

statecustid

isbn

DB1 is not in second normal form

Functional dependencies in DB1:

Page 37: (Part 3) Database Programming

Relational DB Design: DB2

Recall: A table is in second normal form iff:It is in first normal form, andNo non-primary-key column is dependent on any proper subset of the primary key

Solution: Split ORDERS into ORDERS and CUSTOMERS

Yields DB2

[see slide]

BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650

AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick

CUSTOMERScustid custname street city state zipcode111 Princeton 114 Nassau St Princeton NJ 08540 222 Harvard 1256 Mass Ave Cambridge MA 02138333 MIT 292 Main St Cambridge MA 02142

ORDERSisbn custid quantity123 222 20345 222 100123 111 30

Relational DB Design: DB2

36

DB2:

Page 38: (Part 3) Database Programming

Relational DB Design: DB2

Recall: A table is in second normal form iff:It is in first normal form, andNo non-primary-key column is dependent on any proper subset of any primary key

Note that DB2 is in second normal form

AUTHORS

ORDERS

BOOKS

isbn

Relational DB Design: DB2

37

title

quantity authorisbn

quantitycustname

street

zipcode

city

state

custid

isbn

custid

CUSTOMERS

Functional dependencies in DB2:

DB2 is in second normal form

Page 39: (Part 3) Database Programming

Relational DB Design: DB2

[see slide]

Intuitively: Still not quite right.The fact that zip code 08540 maps to Princeton, NJ could be stored multiple times

CUSTOMERScustid custname street city state zipcode111 Princeton 114 Nassau St Princeton NJ 08540 222 Harvard 1256 Mass Ave Cambridge MA 02138333 MIT 292 Main St Cambridge MA 02142

Relational DB Design: DB2

38

BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650

AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 SedgewickORDERS

isbn custid quantity123 222 20345 222 100123 111 30

DB2:

Design of DB2 seems wrong

Page 40: (Part 3) Database Programming

Relational DB Design: DB2

• Somewhat informally...• A table is in third normal form iff:

– It is in second normal form, and– Every non-primary-key column is

non-transitively dependent on the primary key

39

Page 41: (Part 3) Database Programming

Relational DB Design :DB2

Recall: A table is in third normal form iff:It is in second normal form, andEvery non-candidate-key column is non-transitively dependent on the candidate keys

City and state are functionally dependent on zipcodeZipcode is functionally dependent on custidSo city and state are transitively functionally dependent upon custid

AUTHORS

ORDERS

BOOKS

isbn

Relational DB Design: DB2

40

title

quantity authorisbn

quantitycustname

street

zipcode

city

state

custid

isbn

custid

CUSTOMERS

Functional dependencies in DB2:

DB2 is not in third normal form

Page 42: (Part 3) Database Programming

Relational DB Design: DB3

Recall: A table is in third normal form iff:It is in second normal form, andEvery non-candidate-key column is non-transitively dependent on the candidate keys

The solution: Split CUSTOMERS into CUSTOMERS and ZIPCODES

[see slide]

Yields DB3

BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650

AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick

CUSTOMERScustid custname street zipcode111 Princeton 114 Nassau St 08540 222 Harvard 1256 Mass Ave 02138333 MIT 292 Main St 02142

ORDERSisbn custid quantity123 222 20345 222 100123 111 30

ZIPCODESzipcode city state08540 Princeton NJ02138 Cambridge MA02142 Cambridge MA

Relational DB Design: DB3

41

DB3:

Page 43: (Part 3) Database Programming

Relational DB Design: DB3

Recall: A table is in third normal form iff:It is in second normal form, andEvery non-candidate-key column is non-transitively dependent on the candidate keys

[see slide]

Note that DB3 is in third normal form

We’ve now incrementally designed the DB that we used in previous lectures

AUTHORSBOOKS

isbn

Relational DB Design: DB3

42

title

quantity

authorisbn

custname

street

zipcodecity

state

custid

CUSTOMERS

ORDERS

quantitycustid

isbn

zipcode

ZIPCODES

Functional dependenciesin DB3:

DB3 is in third normal form

Page 44: (Part 3) Database Programming

Relational DB Design

• Some additional points...– There are more normal forms– Database designers routinely violate normal

forms– DBMS can enforce additional consistency

constraints– There is a substantial mathematical theory of

relational database design

44

Relational DB Design

Some additional points to wrap up…

There are more normal formsSee Date book

Database designers routinely violate normal formsBut a good one does so:

Only with a purposeKnowing the consequences

DBMS can enforce additional consistency constraints; e.g.:Primary key values cannot be NULLPrimary key values must be unique within a tableWithin a table, foreign key values must correspond to primary key values in another table

E.g. orders.isbn is a foreign key with respect to books.isbn

There is a substantial mathematical theory of relational database designMore precise definitions of normal formsSeveral additional normal formsRelational algebra, relational calculusForeign keys, integrity rules

Page 45: (Part 3) Database Programming

...See An Introduction to Database Systems (C. J. Date)

Page 46: (Part 3) Database Programming

Summary

In Database Programming (Part 3) we have covered:

[see slide]

Summary

• We have covered:– Relational DB transactions: atomicity– Relational DB transactions: locking– Relational DB design

44

Page 47: (Part 3) Database Programming

Summary

In the 3 Database Programming lectures we have covered…

[see slide]

Summary

• We have covered:– Databases (DBs) and database

management systems (DBMSs)…– With a focus on relational DBs and

DBMSs…– With a focus on the SQLite DBMS…– With a focus on programming with SQLite

• See also...

45

Page 48: (Part 3) Database Programming

See Also

• Appendices– Appendix 1: Before relational DBs– Appendix 2: After relational DBs

• Optional lecture slide decks– PostgreSQL– SQLAlchemy– MongoDB

See Also

AppendicesBefore relational DBs

A look backward through time at the DB fieldAfter relational DBs

An overview of some more recent developments in the DB field

Optional database lecture slide decks

Available via Topics web pageCover some popular database-related technologies that are:

Not used in COS 333 assignmentsOften used in COS 333 projects

PostgreSQLA full-fledged relational DBMSIn contrast with SQLite…

Runs as a serverProvides authenticationRow-level lockingPreferred by the Heroku web service

Good choice for many COS 333 projects

Page 49: (Part 3) Database Programming

SQLAlchemyAn object-relational mapper (ORM)Allows you to use a relational DBMS without using SQLMaps relational DB tables to classes, and rows to objects of that classUsed by many COS 333 projects

MongoDBA popular non-relational (noSQL) DBMSUsed by some COS 333 projects

GoalGive you enough information about those technologies to:

Help you decide if you want to use them in your projectHelp you get started with them

SuggestionsView quickly now

To understand the general conceptsSo you know what material is given, in case you need it later

View thoroughly laterAs needed

Page 50: (Part 3) Database Programming

Appendix 1:Before Relational DBs

Page 51: (Part 3) Database Programming

Before Relational DBs

• Before relational DBs, there were…

• Navigational DBs– Data are linked into graph structure

48

Before Relational DBs

Before relational DBs, there were…

Navigational DBsData are linked into graph structureClient is given “root” nodeClient follows links to desired data

Example…

Page 52: (Part 3) Database Programming

Before Relational DBs

[see slide]

The root of the graph points to 0 or more BOOK recordsEach BOOK record contains an isbn, a title, and a quantityEach BOOK record points to 0 or more ORDER recordsEach ORDER record contains a customerid and a quantityEach ORDER record points to one CUSTOMER record Each CUSTOMER record contains a custname, street, and zipcodeEach CUSTOMER record points to one ZIPCODE record Each ZIPCODE record contains a city and stateEach BOOK record points to one or more AUTHOR nodeEach AUTHOR record contains an author (name)

isbn title quantity

author

custname street zipcode

city state

custid quantity

BOOK

ORDER AUTHOR

CUSTOMER

ZIPCODE

Before Relational DBs

49

root

*

* +

1

1

Example Navigational DB

Page 53: (Part 3) Database Programming

Princeton | 114 Nassau St | 08540

111 | 30222 | 20ORDER(custid, quantity)

Harvard | 1256 Mass Ave | 02138

CUSTOMER(custname, street,zipcode)

Which customers purchased the book whose ISBN is 123?

123 | The Practice of Programming | 500BOOK(isbn, title, quantity)

Before Relational DBs

50

root

Before Relational DBs

[see slide]

Given a book, it’s easy to find the customers who purchased that book

However, given a customer, it would be hard to find the books purchased by that customer

Page 54: (Part 3) Database Programming

Before Relational DBs

• Navigational DBs– Queries are biased– DB designer must anticipate queries

• Relational DBs– Queries are unbiased– DB designer need not anticipate queries– However, can create indices

51

Before Relational DBs

Navigational DBsQueries are biasedDB designer must anticipate queries to create appropriate links

Relational DBsQueries are unbiased

That was Codd’s original motivationDB designer need not anticipate queries

However, can create indicesDBA can create indices, based upon anticipated usage patternsDBMS can create indices, based upon observed usage patterns

Page 55: (Part 3) Database Programming

Appendix 2:After Relational DBs

Page 56: (Part 3) Database Programming

After Relational DBs

• For some apps:– Relational DBMSs are more complex than

necessary– The relational DB model is a poor fit

53

After Relational DBs

Observations:For some apps, relational DBMSs are more complex than necessary

Many apps don’t need query neutrality, transactions, …For some apps, the relational DB model (tables, rows) is a poor fit

Example: apps that use data that is hierarchical to undefined levels

It’s not the case that one size fits all

One size does not fit some

Page 57: (Part 3) Database Programming

After Relational DBs

Data Model

Relational Non-Relational(NoSQL)

Key-ValueStores

DocumentStores

Wide-Column Stores

54

After Relational DBs

Those observations motivate the non-relational (“NoSQL”) data models

Note that there are three kinds…

Page 58: (Part 3) Database Programming

After Relational DBs

Data Model

Relational Non-Relational(NoSQL)

Key-ValueStores

DocumentStores

Wide-Column Stores

55

Page 59: (Part 3) Database Programming

After Relational DBs

• Key-value store– Values: arbitrary bytes– Data structure: key-value pairs– Access: by key– Examples: Redis, Memcached, Microsoft

Azure Cosmos DB, Hazelcast, Ehcache

56

After Relational DBs

Key-value storeValues: Arbitrary bytesData structure: key-value pairsAccess: by keyExamples: Redis, Memcached, Microsoft Azure Cosmos DB, Hazelcast, EhcacheValue Stores

Essentially a key-value store database is a persistent associative array (hash table or search tree)Lookup is strictly by key

Very simple; very fast

Page 60: (Part 3) Database Programming

After Relational DBs

Data Model

Relational Non-Relational(NoSQL)

Key-ValueStores

DocumentStores

Wide-Column Stores

57

Page 61: (Part 3) Database Programming

After Relational DBs

• Document store– Values: documents with internal structure

(e.g., JSON)– Data structure: key-value pairs– Access: by key or content– Examples: MongoDB, Amazon DynamoDB,

Couchbase, CouchDB, MarkLogic

58

After Relational DBs

Document storeValues: Documents with internal structure (e.g. JSON) which DBMS can processData structure: key-value pairs

Can access document by key or sometimes by contentAccess: by key or contentExamples:

MongoDB, Amazon DynamoDB, Couchbase, CouchDB, MarkLogic

Page 62: (Part 3) Database Programming

After Relational DBs

Data Model

Relational Non-Relational(NoSQL)

Key-ValueStores

DocumentStores

Wide-Column Stores

59

Page 63: (Part 3) Database Programming

After Relational DBs

• Wide-column store– Values: Arbitrary bytes– Data structure: Multidimensional associative

array– Examples: Cassandra, HBase, Microsoft

Azure Cosmos DB

60

After Relational DBs

Wide-column storeValues: Arbitrary bytesData structure: Multidimensional associative array

Typically sparsely populatedExamples:

Cassandra, HBase, Microsoft Azure Cosmos DB

I haven’t used

Origin: Google BigTable

Page 64: (Part 3) Database Programming

After Relational DBs

[see slide]

NotesRelational data model still dominates, by a wide marginPostgreSQL is the preferred DBMS when deploying on HerokuSQLite is in the top 10!

After Relational DBs

Rank DBMS DB Data Model Score1 Oracle Relational 12632 MySQL Relational 12283 Microsoft SQL Server Relational 9824 PostgreSQL Relational 5775 MongoDB Document Store 4966 Redis Key-Value Store 1687 DB2 Relational 1659 SQLite Relational 13010 Cassandra Wide-Column Store 114

Popular DBMSs, according to https://db-engines.com/en/ranking as of July 2021:

61