CP344 – Databases
Open Notes Chapter 10: Document-Based Databases
Tech News!
● Human embryos can be grown in lab for longer than 14 days
● Medical error third leading cause of death
Hackers' Tip of the Day: Create indexes in MySQL
CREATE INDEX color_index ON Jeep(color);
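A minimal sketch of what the index does for a lookup. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made so the example runs without a server. The columns of Jeep other than color, and the sample rows, are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Jeep (id INTEGER PRIMARY KEY, model TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO Jeep (model, color) VALUES (?, ?)",
    [("Wrangler", "red"), ("Cherokee", "blue"), ("Compass", "red")],
)

# Same statement as in the tip above.
conn.execute("CREATE INDEX color_index ON Jeep(color)")

# Ask SQLite how it would run a lookup on the indexed column.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT model FROM Jeep WHERE color = ?", ("red",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)

# The lookup itself returns the same rows with or without the index.
red_models = sorted(
    row[0] for row in conn.execute("SELECT model FROM Jeep WHERE color = ?", ("red",))
)
print(plan_text)
print(red_models)
```

The query plan should name color_index, meaning the database searches the index instead of scanning every row of the table.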
Table/Indexing Updates?
Main Weakness of Key/Value Stores
● Data only has a manually defined structure
● Redis stores data structures (sets, lists, etc.)
● These data structures cannot easily be queried
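The weakness above can be sketched with a plain Python dict standing in for a key/value store (an assumption; no Redis server is used). The keys, names, and cities are hypothetical. Lookup by key is direct, but a question about the contents of the values means decoding and scanning every entry in application code.

```python
import json

# A plain dict standing in for a key/value store (assumption).
# The store sees each value as an opaque string.
store = {
    "user:1": json.dumps({"name": "Ann", "city": "Spokane"}),
    "user:2": json.dumps({"name": "Bob", "city": "Seattle"}),
    "user:3": json.dumps({"name": "Cy", "city": "Spokane"}),
}

# Lookup by key is a single direct access.
user2 = json.loads(store["user:2"])


def find_by_city(kv, city):
    """Scan every value, decode it, and filter in application code."""
    matches = []
    for key, raw in kv.items():
        record = json.loads(raw)
        if record.get("city") == city:
            matches.append(key)
    return sorted(matches)


print(user2["name"])
print(find_by_city(store, "Spokane"))
```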
Document Store
Key Value
id1 doc1
id2 doc2
id3 doc3
id4 doc4
id5 doc5
id6 doc6
What's a document?
XML Document
● Human and machine readable.
● Built-in schema.
● User-defined tags.
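A small sketch of reading an XML document with user-defined tags, using Python's standard xml.etree.ElementTree module. The document itself is hypothetical; its tag names reuse the dog fields that appear elsewhere in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; the tag names are chosen by the author of the data.
xml_text = """
<dog>
  <dogName>Mr. Paws</dogName>
  <breed>Golden-pointer</breed>
  <favPastime>Barking</favPastime>
</dog>
""".strip()

root = ET.fromstring(xml_text)

# Each child element becomes one field: tag name -> text content.
fields = {child.tag: child.text for child in root}

print(root.tag)
print(fields["breed"])
```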
JSON Document
● Human and machine readable.
● Flexible schema.
● Subset of JavaScript.
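To illustrate the flexible schema, here is a short sketch with Python's standard json module. Both documents are hypothetical: they share one field and differ in the rest, and nothing in JSON itself objects to that.

```python
import json

# Two hypothetical documents with different fields.
text_a = '{"dogName": "Mr. Paws", "breed": "Golden-pointer"}'
text_b = '{"dogName": "Rex", "favToys": ["ball", "rope"]}'

doc_a = json.loads(text_a)
doc_b = json.loads(text_b)

# Code reading the documents has to allow for missing fields.
breed_a = doc_a.get("breed", "unknown")
breed_b = doc_b.get("breed", "unknown")

print(breed_a)
print(breed_b)
print(json.dumps(doc_b, sort_keys=True))
```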
YAML Document
● Human and machine readable.
● Easy to read with whitespace.
RESTful APIs (Representational State Transfer)
Normal HTTP GET request:
http://www.dog.com/search?q="dog"
JSON:
{
  "dogName": "Mr. Paws",
  "breed": "Golden-pointer",
  "favBed": "Paw Palace",
  "favPastime": "Barking"
}
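The request/response pair above can be sketched in Python as follows. No request is actually sent: the response body is the sample JSON from the slide, and build_search_url is a hypothetical helper name introduced only for this illustration.

```python
import json
from urllib.parse import urlencode


def build_search_url(base, term):
    """Return a GET URL with the search term encoded as the q parameter."""
    return base + "?" + urlencode({"q": term})


url = build_search_url("http://www.dog.com/search", "dog")

# Sample response body taken from the slide; nothing is fetched here.
response_body = (
    '{"dogName": "Mr. Paws", "breed": "Golden-pointer", '
    '"favBed": "Paw Palace", "favPastime": "Barking"}'
)
dog = json.loads(response_body)

print(url)
print(dog["favBed"])
```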
Exercise: Access Freebase
Example request:
https://www.googleapis.com/freebase/v1/search?query=dog
Python pseudocode:
import json
import urllib2
Download text from a query
Parse text using json library
Print out result number 3
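One possible shape for the three steps of the exercise, written for Python 3 (urllib.request takes the place of urllib2). The Freebase endpoint may no longer be reachable, so the sketch runs on a hypothetical sample payload. The "result" key and the "name" fields in that payload are assumptions about the response layout.

```python
import json
from urllib.request import urlopen


def fetch_text(url):
    """Download the response body of a GET request as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def third_result(json_text):
    """Parse the JSON text and return result number 3 (list index 2)."""
    data = json.loads(json_text)
    return data["result"][2]


# Hypothetical payload standing in for fetch_text(<search URL>).
sample = json.dumps(
    {"result": [{"name": "Dog"}, {"name": "Dog breed"}, {"name": "Hot dog"}]}
)
print(third_result(sample))
```

With a live service, the call would be third_result(fetch_text(url)) for the request URL shown above.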
BSON Documents
● Binary JSON
● Values are stored in binary instead of plain text
● Save space
● Faster to read
● Not human-readable
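A rough illustration of text versus binary storage for a single value, using Python's standard struct module. This is not a BSON encoder; it only shows the idea for one integer, packed as a 4-byte little-endian signed value. The number is arbitrary.

```python
import struct

value = 1234567890

# Text encoding, as in JSON: one byte per decimal digit.
as_text = str(value).encode("ascii")

# Binary encoding: a fixed 4-byte little-endian signed integer.
as_binary = struct.pack("<i", value)

# Reading the binary form back needs no digit-by-digit parsing.
decoded = struct.unpack("<i", as_binary)[0]

print(len(as_text), len(as_binary), decoded)
```

The saving depends on the value: a small number such as 7 takes one byte as text but still four bytes in this fixed-width form. The fixed width is what makes the binary form quick to read.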
Document Stores
● Pros
  ● Documents have built-in schema
  ● Fast key/value lookup
  ● Easy to split up across machines
● Cons
  ● Code must keep track of schema of each doc.
  ● No overall database structure
  ● Some queries are hard to write
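The first two cons can be sketched with a list of Python dicts standing in for one collection (an assumption; no database is involved). The documents and field names are hypothetical. Since nothing forces the documents to share fields, the reading code has to check each one.

```python
# Hypothetical documents in one collection; each has its own set of fields.
docs = [
    {"_id": "id1", "type": "dog", "dogName": "Mr. Paws", "breed": "Golden-pointer"},
    {"_id": "id2", "type": "dog", "dogName": "Rex"},
    {"_id": "id3", "type": "jeep", "color": "red"},
]


def breeds(documents):
    """Collect breed values, skipping documents that lack the field."""
    return [d["breed"] for d in documents if "breed" in d]


def field_names(documents):
    """Return the union of all field names seen across the documents."""
    names = set()
    for d in documents:
        names.update(d)
    return sorted(names)


print(breeds(docs))
print(field_names(docs))
```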
Document Stores
● MongoDB
● CouchDB
● Terrastore
MongoDB Examples
Exercise: Insert Freebase Articles to MongoDB
Pseudocode:
Download JSON from Freebase
Insert each result as a separate MongoDB document
Run an example MongoDB query that searches documents based on one attribute
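A hedged sketch of the last two steps of the exercise. The helper names insert_results and find_by_attribute are hypothetical. They rely only on the insert_one and find methods of a pymongo collection, and the collection is passed in, so the block can be read without a running server. The "name" field used in the usage comment is an assumption about the downloaded data.

```python
def insert_results(coll, results):
    """Insert each result as its own document and return the new ids."""
    return [coll.insert_one(dict(r)).inserted_id for r in results]


def find_by_attribute(coll, key, value):
    """Return all documents whose field `key` equals `value`."""
    return list(coll.find({key: value}))


# Usage against a local MongoDB server (not executed here):
#   import pymongo
#   client = pymongo.MongoClient('localhost', 27017)
#   coll = client['test_database']['test_collection']
#   insert_results(coll, results)             # results: list of dicts from the JSON
#   find_by_attribute(coll, "name", "Dog")    # "name" is an assumed field
```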
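pymongo example:
import pymongo

# Connect to a local MongoDB server (MongoClient replaces the older pymongo.Connection)
conn = pymongo.MongoClient('localhost', 27017)
db = conn['test_database']
coll = db['test_collection']

# Insert one document (insert_one replaces the older coll.insert) and keep its id
doc = {"Name": "Benny", "Password": "Pancake"}
docID1 = coll.insert_one(doc).inserted_id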
Final Project