1/17
Miscellanea Assignment #05 DB comparisons Views Misc. APIs Summary Conclusion References
CS-695 NoSQL Database
CouchDB (part 2 of 2)
Dr. Chuck Cartledge
29 Oct. 2015
2/17
Table of contents I
1 Miscellanea
2 Assignment #05
3 DB comparisons
4 Views
5 Misc. APIs
6 Summary
7 Conclusion
8 References
3/17
Corrections and additions since last lecture.
Assignment #05 is available
Mid-term (grading still in progress)
4/17
Words of explanation.
The full text is available at: http://www.cs.odu.edu/~ccartled/Teaching/2015-Fall/NoSQL/Assignments/05/
In general terms:
1 Parse data
2 Create document database (possibly two, or three)
3 Update documents based on number of movies filmed per year, and number of trivia facts per movie
4 Query database
5 Create list of movies per year (two different types of lists)
6 Plot graphic
7 Create a list of number of trivia facts by movie
8 Plot graphic
5/17
How different DBs compare to a RDBMS
We have some terms to compare now [3]. A "-" means there is no direct equivalent.

RDBMS         K/V         Columnar    Doc.
-----------   ---------   ---------   ----------
DB instance   cluster     cluster     instance
database      -           namespace   -
table         bucket      table       collection
row           key-value   row         document
rowid         key         -           id
col.          -           col. fam.   -
schema        -           -           database
join          -           -           DBRef
6/17
To view or not to view, . . .
Concept: views are the primary tool used for querying and reporting on CouchDB documents (see footnote 1). (1 of 2)
Permanent: stored inside special documents called design documents, and accessed via an HTTP GET request to the URI /{dbname}/{docid}/{viewname}, where docid has the prefix _design/ so that CouchDB recognizes the document as a design document, and {viewname} has the prefix _view/.
Temporary: not stored in the database, but executed on demand. To execute a temporary view, make an HTTP POST request to the URI /{dbname}/_temp_view, where the body of the request contains the code of the view function and the Content-Type header is set to application/json.
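The two kinds of request can be sketched in Python. This is a minimal sketch that only builds the requests and does not send them. The host and port, the database name "movies", the design document "stats", and the view "by_year" are all made-up names for illustration. Temporary views belong to the CouchDB 1.x line this lecture covers; later releases dropped the _temp_view endpoint.

```python
import json
import urllib.request

BASE = "http://localhost:5984"  # assumption: a local CouchDB on its default port


def permanent_view_request(db, design_doc, view):
    """GET request for a view stored in a design document."""
    url = f"{BASE}/{db}/_design/{design_doc}/_view/{view}"
    return urllib.request.Request(url, method="GET")


def temporary_view_request(db, map_source):
    """POST request that ships the map function source in the request body."""
    body = json.dumps({"map": map_source}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/{db}/_temp_view",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical names, used only to show the shape of each request.
perm = permanent_view_request("movies", "stats", "by_year")
temp = temporary_view_request("movies", "function(doc) { emit(doc.year, 1); }")
print(perm.get_method(), perm.full_url)
print(temp.get_method(), temp.full_url)
```

Passing either object to urllib.request.urlopen would send it to a running server; the map function itself is JavaScript source carried as a JSON string.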
Views rely on the MapReduce paradigm.
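To make the MapReduce idea concrete, here is a small Python imitation of what a view does: a map step emits (key, value) rows per document, the rows are kept sorted by key, and a reduce step collapses the values that share a key. In CouchDB the map and reduce functions are written in JavaScript; the Python below is only an illustration, and the sample documents and field names are invented.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical documents; the field names are invented for illustration.
docs = [
    {"_id": "m1", "title": "A", "year": 1999},
    {"_id": "m2", "title": "B", "year": 2001},
    {"_id": "m3", "title": "C", "year": 1999},
]


def map_doc(doc):
    """Map step: emit zero or more (key, value) rows per document."""
    if "year" in doc:
        yield doc["year"], 1


def reduce_rows(values):
    """Reduce step: collapse the values that share a key."""
    return sum(values)


def run_view(documents):
    """Run map over every document, sort rows by key, then reduce per key."""
    rows = sorted(
        (row for d in documents for row in map_doc(d)),
        key=itemgetter(0),
    )
    return {
        key: reduce_rows(value for _, value in group)
        for key, group in groupby(rows, key=itemgetter(0))
    }


print(run_view(docs))  # {1999: 2, 2001: 1}
```

Emitting a count of 1 per document and summing in the reduce step is the usual way to get a per-key count, such as the number of movies per year.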
Footnote 1: https://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
7/17
To view or not to view, . . .
Concept: views are the primary tool used for querying and reporting on CouchDB documents. (2 of 2)
Image from [1].
8/17
To view or not to view, . . .
NOTE!!
“Temporary views are only good during development. Final code should not rely on them as they are very expensive to compute each time they get called and they get increasingly slower the more data you have in a database. If you think you can’t solve something in a permanent view that you can solve in an ad-hoc view, you might want to reconsider.”
CouchDB Wiki Staff [4]
9/17
To view or not to view, . . .
Some gotchas
“Note that by default views are not created and updated when a document is saved, but rather, when they are accessed. As a result, the first access might take some time depending on the size of your data while CouchDB creates the view. If preferable the views can also be updated when a document is saved using an external script that calls the views when updates have been made. . . . Note that all views in a single design document get updated when one of the views in that design document gets queried.”
CouchDB Wiki Staff [4]
10/17
To view or not to view, . . .
Some details about views [2]
“. . . the B-tree that backs the key-sorted view result is built only once, when you first query a view, and all subsequent queries will just read the B-tree instead of executing the map function for all documents again.”
When deleting a document: “. . . marks them invalid so that they no longer show up in view results.”
When updating a document: “. . . If a document got updated, the new document is run through the map function and the resulting new lines are inserted into the B-tree.”
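The behaviour quoted above can be imitated with a toy index. This is a minimal sketch and not how CouchDB is implemented: a sorted Python list stands in for the B-tree, deleted rows are simply dropped instead of being marked invalid, and the class name, method names, and sample documents are all hypothetical. The counter shows that the map function runs once per document on the first build and once more per changed document afterwards.

```python
import bisect


class ViewIndex:
    """Toy stand-in for a view's B-tree: a sorted list of (key, doc_id) rows."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.rows = []       # kept sorted by (key, doc_id)
        self.map_calls = 0   # how many times the map function has run

    def _remove(self, doc_id):
        self.rows = [row for row in self.rows if row[1] != doc_id]

    def _insert(self, doc):
        self.map_calls += 1
        for key in self.map_fn(doc):
            bisect.insort(self.rows, (key, doc["_id"]))

    def build(self, docs):
        """First query: run the map function over every document once."""
        for doc in docs:
            self._insert(doc)

    def update(self, doc):
        """Changed document: drop its old rows, then map only that document."""
        self._remove(doc["_id"])
        self._insert(doc)

    def delete(self, doc_id):
        """Deleted document: its rows no longer show up in results."""
        self._remove(doc_id)

    def query(self):
        """Later queries read the stored rows; the map function is not called."""
        return list(self.rows)


def by_name(doc):
    # Hypothetical map function: index documents by their "name" field.
    if "name" in doc:
        yield doc["name"]


index = ViewIndex(by_name)
index.build([{"_id": "a", "name": "Zed"}, {"_id": "b", "name": "Amy"}])
first = index.query()
index.query()                               # repeated reads cost no map calls
index.update({"_id": "a", "name": "Bob"})   # one extra map call
index.delete("b")
print(first)                                # [('Amy', 'b'), ('Zed', 'a')]
print(index.query(), index.map_calls)       # [('Bob', 'a')] 3
```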
11/17
To view or not to view, . . .
An example [2]:
music: database
_design: constant
artists: design document
_view: constant
by_name: the MapReduce pair that creates the view
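The five parts above fit together as a design document plus a request path. The sketch below assumes the two constants are written with a leading underscore (_design, _view), as CouchDB URLs do, and that artist documents carry a "name" field; that field and the body of the map function are assumptions for illustration.

```python
import json

# The map function is JavaScript source stored as a string in the design document.
# Assumption: artist documents have a "name" field.
MAP_BY_NAME = "function(doc) { if (doc.name) { emit(doc.name, null); } }"

design_doc = {
    "_id": "_design/artists",          # constant prefix + design document name
    "views": {
        "by_name": {"map": MAP_BY_NAME},
    },
}


def view_path(db, design_doc_id, view_name):
    """Assemble the request path: database, design document, constant, view."""
    return f"/{db}/{design_doc_id}/_view/{view_name}"


print(json.dumps(design_doc, indent=2))
print(view_path("music", design_doc["_id"], "by_name"))
# /music/_design/artists/_view/by_name
```

A design document may hold several views under "views"; each entry can also carry a "reduce" function next to "map".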
12/17
Other CURL related places
Some CouchDB APIs (see footnote 2)
Different APIs:
_changes: can provide a message when the database changes (polling, long polling, or continuous monitoring)
_changes will give everything/anything; a filter allows you to control change notifications
_compact: data compaction
_bulk_docs: upload many documents at once
_view_cleanup: clean up old view data
Footnote 2: https://wiki.apache.org/couchdb/API_Cheatsheet
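A rough sketch of how these endpoints are addressed, written as Python helpers that only build URLs and request bodies. The host, the database name, and the filter name "app/recent" are hypothetical; the endpoint names carry the leading underscore CouchDB uses for its special paths. The bulk, compaction, and cleanup endpoints are called with HTTP POST.

```python
import json
from urllib.parse import urlencode

BASE = "http://localhost:5984"  # assumption: a local CouchDB on its default port


def changes_url(db, feed="normal", since=None, filter_name=None):
    """URL for the _changes feed; feed is normal, longpoll, or continuous."""
    params = {"feed": feed}
    if since is not None:
        params["since"] = since
    if filter_name is not None:
        params["filter"] = filter_name  # form: "designdoc/filtername"
    return f"{BASE}/{db}/_changes?{urlencode(params, safe='/')}"


def bulk_docs_body(docs):
    """JSON body for a POST to /{db}/_bulk_docs."""
    return json.dumps({"docs": list(docs)})


def maintenance_urls(db):
    """POST targets for data compaction and view cleanup."""
    return {
        "compact": f"{BASE}/{db}/_compact",
        "view_cleanup": f"{BASE}/{db}/_view_cleanup",
    }


print(changes_url("movies", feed="longpoll", since=42, filter_name="app/recent"))
print(bulk_docs_body([{"_id": "m1", "year": 1999}, {"_id": "m2", "year": 2001}]))
print(maintenance_urls("movies"))
```

Uploading through _bulk_docs takes one request for many documents, which is usually far cheaper than one PUT per document.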
13/17
Misc. other things
Things that CouchDB does but we didn’t cover
Lots of stuff:
Replication of data (peer-to-peer network)
No sharding
Resolution of conflicting updates
Growth or contraction of the hardware suite
Load balancing
14/17
Strengths and weaknesses
Good and not so good
Strengths:
Designed with unreliable networks in mind
Wide range of deployment environments (smart phones to data centers)
Almost as much an API into a database as a database itself
Weaknesses:
Relies on MapReduce paradigm
Ad hoc queries shouldn’t be run on production systems
No sharding of data; replication is all or nothing. New nodes only increase I/O.
15/17
Applicabilities
Good for, and not so good for
Good fit:
Stands up well in uncertain environments
Standard ReST/JSON interface makes integration easier
Handling huge amounts of data by replication and horizontal scaling
Very flexible data model
Ease of use (object oriented bent)
Not so good fit:
Discourages normalization
Items can be inserted anywhere (lack of schema)
May require large infrastructure
Ad-hoc queries can be time consuming (MapReduce model of execution)
16/17
What have we covered?
Reviewed Assignment #05
Remember: Assignment #05 is due before next class
Next time: Neo4J
17/17
References I
[1] J. Chris Anderson, Jan Lehnardt, and Noah Slater, View cookbook for SQL jockeys, http://guide.couchdb.org/draft/cookbook.html, 2015.
[2] J. Chris Anderson, Jan Lehnardt, and Noah Slater, CouchDB: The definitive guide, O’Reilly Media, Inc., 2010.
[3] Eric Redmond and Jim R. Wilson, Seven databases in seven weeks, Pragmatic Bookshelf, 2012.
[4] CouchDB Wiki Staff, Introduction to CouchDB views, https://wiki.apache.org/couchdb/Introduction_to_CouchDB_views, 2015.