Back to Basics Webinar 3 - Thinking in Documents
TRANSCRIPT
Code JoeD gets you a 25% discount off the list price
Early Bird Registration Ends May 13, 2016
Back to Basics 2016 : Webinar 3
Thinking in Documents
Joe Drumgoole
Director of Developer Advocacy, EMEA
@jdrumgoole
V1.1
Review
• Webinar 1: Introduction to NoSQL
  – Types of NoSQL database
  – MongoDB is a document database
  – Replica Sets and Shards
• Webinar 2
  – Building a basic application
  – Adding indexes
  – Using Explain to measure database operations
Thinking in Documents
• Documents in MongoDB are JavaScript Objects (JSON)
• Actually they are encoded as BSON
• BSON is “Binary JSON”
• BSON allows efficient encoding and decoding of JSON
• Required for efficient transmission and storage on disk
• Eliminates the need to “text parse” all the sub objects
• Full spec is online at http://bsonspec.org/
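To make the “Binary JSON” idea concrete, here is a minimal sketch that hand-encodes a flat document of string fields using the byte layout described at bsonspec.org. It assumes Node.js (for `Buffer`). `encodeStringDoc` is a hypothetical helper written for illustration, not part of any MongoDB driver, and it handles string values only.

```javascript
// Hypothetical helper: encode a flat document whose values are all strings.
// Layout per bsonspec.org: int32 total size, elements, trailing 0x00.
function encodeStringDoc(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const keyBytes = Buffer.from(key + "\0", "utf8");   // field name as a cstring
    const valBytes = Buffer.from(value + "\0", "utf8"); // string bytes plus terminator
    const valLen = Buffer.alloc(4);
    valLen.writeInt32LE(valBytes.length);               // length prefix counts the terminator
    parts.push(Buffer.from([0x02]), keyBytes, valLen, valBytes); // 0x02 = UTF-8 string
  }
  const body = Buffer.concat(parts);
  const total = Buffer.alloc(4);
  total.writeInt32LE(4 + body.length + 1);              // size field + elements + trailing 0x00
  return Buffer.concat([total, body, Buffer.from([0x00])]);
}

const encoded = encodeStringDoc({ hello: "world" });
console.log(encoded.toString("hex"));
// prints 160000000268656c6c6f0006000000776f726c640000
```

Because every element carries its type and length up front, a reader can skip over fields it does not need without parsing their text.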
Example Document
{
  first_name: 'Paul',
  surname: 'Miller',
  cell: 447557505611,
  city: 'London',
  location: [45.123, 47.232],
  Profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
Slide callouts:
• Fields
• Typed field values: String, Number, Geo-Location
• Fields can contain arrays
• Fields can contain an array of sub-documents
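As a rough illustration of those callouts, the same document can be written as a plain JavaScript object and navigated directly. This is only an in-memory sketch with no database involved; the fields elided with “…” on the slide are left out.

```javascript
// The example document from the slide as a plain JavaScript object.
// Fields the slide elides with "…" are simply omitted here.
const person = {
  first_name: "Paul",
  surname: "Miller",
  cell: 447557505611,
  city: "London",
  location: [45.123, 47.232],
  Profession: ["banking", "finance", "trader"],
  cars: [
    { model: "Bentley", year: 1973, value: 100000 },
    { model: "Rolls Royce", year: 1965, value: 330000 },
  ],
};

// Typed values: strings stay strings, numbers stay numbers.
console.log(typeof person.first_name, typeof person.cell); // string number

// Arrays and arrays of sub-documents are walked like any other object.
const totalCarValue = person.cars.reduce((sum, car) => sum + car.value, 0);
console.log(person.Profession.length, totalCarValue); // 3 430000
```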
Data Stores – Key Value
[Diagram: three keys, each mapped to a single value]
Data Stores - Relational
[Diagram: a table with one row per key (Key 1 to Key 4), each row holding four values]
Data Stores - Document
[Diagram: keys arranged as a tree; a key holds either a value or a nested group of keys and values]
In Document Form
{ "key1" : "value 1" }
{ "key1" : { "key2" : "value 1", "key3" : { "key4" : "value 3", "key5" : "value 4" } } }
{ "key1" : { "key6" : "value 5", "key7" : "value 6" } }
Some Example Queries
// finds the first document (exact match on a top-level value)
db.demo.find( { "key1" : "value 1" } )
// finds the second document by nested value
db.demo.find( { "key1.key3.key4" : "value 3" } )
// finds the third document
db.demo.find( { "key1.key6" : "value 5" } )
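The dotted-path lookups can be mimicked in plain JavaScript to see which documents an equality match selects. This is a simplified in-memory sketch, not MongoDB's matcher: `resolvePath` and `findByPath` are hypothetical helpers, and they cover nested objects only (no array traversal or query operators).

```javascript
// The three documents from the "In Document Form" slide.
const demo = [
  { key1: "value 1" },
  { key1: { key2: "value 1", key3: { key4: "value 3", key5: "value 4" } } },
  { key1: { key6: "value 5", key7: "value 6" } },
];

// Walk a dotted path such as "key1.key3.key4" through nested objects.
// Returns undefined as soon as a step is missing or is not an object.
function resolvePath(doc, path) {
  let current = doc;
  for (const step of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = current[step];
  }
  return current;
}

// Hypothetical stand-in for find(): equality match on a single dotted path.
function findByPath(docs, path, expected) {
  return docs.filter((doc) => resolvePath(doc, path) === expected);
}

console.log(findByPath(demo, "key1", "value 1").length);                   // 1
console.log(findByPath(demo, "key1.key3.key4", "value 3")[0] === demo[1]); // true
console.log(findByPath(demo, "key1.key6", "value 5")[0] === demo[2]);      // true
```

Note that a top-level match on `key1` selects only the first document: in the other two, `key1` holds a sub-document rather than a string.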
Modelling and Cardinality
• One to One
  – Title to blog post
• One to Many
  – Blog post to comments
• One to Millions
  – Blog post to site views (e.g. Huffington Post)
One To One
{
  "Title" : "This is a blog post",
  "Body" : "This is the body text of a very short blog post",
  …
}
We can index on “Title” and “Body”.
One to Many
{
  "Title" : "This is a blog post",
  "Body" : "This is the body text",
  "Comments" : [
    {
      "name" : "Joe Drumgoole",
      "email" : "[email protected]",
      "comment" : "I love your writing style"
    },
    {
      "name" : "John Smith",
      "email" : "[email protected]",
      "comment" : "I hate your writing style"
    }
  ]
}
Where we expect a small number of comments we can embed them in the main document
Key Concerns
• What are the write patterns?
  – Comments are added more frequently than posts
  – Comments may have images, tags, large bodies of text
• What are the read patterns?
  – Comments may not be displayed
  – May be shown in their own window
  – People rarely look at all the comments
Approach 2 – Separate Collection
• Keep all comments in a separate comments collection
• Add references to comments as an array of comment IDs
• Requires two queries to display a blog post and its associated comments
• Requires two writes to create a comment
{
  _id : ObjectID( "AAAA" ),
  name : "Joe Drumgoole",
  email : "[email protected]",
  comment : "I love your writing style",
}
{
  _id : ObjectID( "AAAB" ),
  name : "John Smith",
  email : "[email protected]",
  comment : "I hate your writing style",
}
{
  "_id" : ObjectID( "ZZZZ" ),
  "Title" : "A Blog Title",
  "Body" : "A blog post",
  "comments" : [ ObjectID( "AAAA" ), ObjectID( "AAAB" ) ]
}
{
  "_id" : ObjectID( "ZZZZ" ),
  "Title" : "A Blog Title",
  "Body" : "A blog post",
  "comments" : []
}
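The cost of the separate-collection approach (two writes per comment, two reads per page) can be sketched in memory. The two `Map` objects below are assumed stand-ins for the posts and comments collections, and `addComment` and `loadPostWithComments` are hypothetical helper names; plain strings replace ObjectIDs.

```javascript
// In-memory stand-ins for the two collections (assumption: Maps keyed by _id).
const comments = new Map();
const posts = new Map();

posts.set("ZZZZ", { _id: "ZZZZ", Title: "A Blog Title", Body: "A blog post", comments: [] });

// Creating a comment takes two writes, one per collection.
function addComment(postId, comment) {
  comments.set(comment._id, comment);           // write 1: the comment itself
  posts.get(postId).comments.push(comment._id); // write 2: the reference on the post
}

// Displaying a post takes two reads: the post, then its comments by ID.
function loadPostWithComments(postId) {
  const post = posts.get(postId);                               // read 1
  const resolved = post.comments.map((id) => comments.get(id)); // read 2
  return { ...post, comments: resolved };
}

addComment("ZZZZ", { _id: "AAAA", name: "Joe Drumgoole", comment: "I love your writing style" });
addComment("ZZZZ", { _id: "AAAB", name: "John Smith", comment: "I hate your writing style" });

const page = loadPostWithComments("ZZZZ");
console.log(page.comments.map((c) => c.name)); // [ 'Joe Drumgoole', 'John Smith' ]
```

The post document itself stays small however long the comments grow, since it holds only IDs.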
Approach 3 – A Hybrid Approach
{
  "_id" : ObjectID( "ZZZZ" ),
  "Title" : "A Blog Title",
  "Body" : "A blog post",
  "comments" : [
    {
      "_id" : ObjectID( "AAAA" ),
      "name" : "Joe Drumgoole",
      "email" : "[email protected]",
      "comment" : "I love your writing style"
    },
    {
      "_id" : ObjectID( "AAAB" ),
      "name" : "John Smith",
      "email" : "[email protected]",
      "comment" : "I hate your writing style"
    }
  ]
}
{
  "_post_id" : ObjectID( "ZZZZ" ),
  "comments" : [
    {
      "_id" : ObjectID( "AAAA" ),
      "name" : "Joe Drumgoole",
      "email" : "[email protected]",
      "comment" : "I love your writing style"
    },
    {...}, {...}, {...}, {...}, {...}, {...}, {...}, {...}, {...}, {...}
  ]
}
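One way to read the hybrid approach is: keep a handful of comments embedded in the post and let the rest spill into a separate document keyed by the post. The sketch below shows that routing in memory. The cap `EMBEDDED_LIMIT`, the `addComment` helper, and the `_post_id` key name are assumptions made for illustration; the slide does not state a limit or a routing rule.

```javascript
// Hypothetical cap on how many comments stay embedded in the post itself.
const EMBEDDED_LIMIT = 3;

const post = { _id: "ZZZZ", Title: "A Blog Title", Body: "A blog post", comments: [] };
// Overflow document that points back at its post (key name assumed).
const overflow = { _post_id: "ZZZZ", comments: [] };

// Embed while there is room; otherwise append to the overflow document.
function addComment(comment) {
  if (post.comments.length < EMBEDDED_LIMIT) {
    post.comments.push(comment);
  } else {
    overflow.comments.push(comment);
  }
}

for (let i = 1; i <= 5; i++) {
  addComment({ _id: `C${i}`, comment: `comment ${i}` });
}

console.log(post.comments.length, overflow.comments.length); // 3 2
```

Reading the post alone is then enough to render the page with its first few comments; the overflow document is fetched only when a reader asks for more.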
What About One to A Million
• What if we were tracking mouse position for heat tracking?
  – Each user will generate hundreds of data points per visit
  – Thousands of data points per post
  – Millions of data points per blog site
• Reverse the model
  – Store a blog ID per event
{
  "post_id" : ObjectID("ZZZZ"),
  "timestamp" : ISODate("2005-01-02T00:00:00Z"),
  "location" : [24, 34],
  "click" : false
}
But – Finite number of events per second
{
  post_id : ObjectID( "ZZZZ" ),
  timeStamp : ISODate("2005-01-02T00:00:00Z"),
  events : {
    0 : { 0 : { <Info> }, 1 : { <Info> }, … 99 : { <Info> } },
    1 : { 0 : { <Info> }, 1 : { <Info> }, … 99 : { <Info> } },
    2 : { 0 : { <Info> }, 1 : { <Info> }, … 99 : { <Info> } },
    3 : { 0 : { <Info> }, 1 : { <Info> }, … 99 : { <Info> } },
    ...
    59 : { 0 : { <Info> }, 1 : { <Info> }, … 99 : { <Info> } }
  }
}
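A sketch of how events might be filed into such a bucket: the second of the event's timestamp picks the outer key, and the next free slot within that second picks the inner key. `newBucket`, `recordEvent`, and the constant `MAX_PER_SECOND` are hypothetical names; the limit of 100 is read off the slide's 0 to 99 slots, and treating one document as one minute is an assumption based on its 0 to 59 keys.

```javascript
// Assumed from the slide's inner keys 0..99.
const MAX_PER_SECOND = 100;

// Build an empty bucket for one post (field names follow the slide).
function newBucket(postId, bucketStart) {
  return { post_id: postId, timeStamp: bucketStart, events: {} };
}

// File an event under events[second][slot] and return the dotted path used,
// or null when that second already holds MAX_PER_SECOND events.
function recordEvent(bucket, when, info) {
  const second = when.getUTCSeconds();
  if (!bucket.events[second]) bucket.events[second] = {};
  const perSecond = bucket.events[second];
  const slot = Object.keys(perSecond).length; // next free position in this second
  if (slot >= MAX_PER_SECOND) return null;
  perSecond[slot] = info;
  return `events.${second}.${slot}`;
}

const bucket = newBucket("ZZZZ", new Date("2005-01-02T00:00:00Z"));
const first = recordEvent(bucket, new Date("2005-01-02T00:00:07.100Z"), { location: [24, 34], click: false });
const second = recordEvent(bucket, new Date("2005-01-02T00:00:07.900Z"), { location: [25, 36], click: true });
console.log(first, second); // events.7.0 events.7.1
```

The returned string has the same dotted-path shape used in the earlier example queries, so one event can be addressed without rewriting the rest of the bucket.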
Guidelines
• Embed objects for one-to-one relationships
• Look at read and write patterns to determine when to break out data
• Don’t get stuck in “one record per item” thinking
• Embrace the hierarchy
• Think about cardinality
• Grow your data by adding documents, not by increasing document size
• Think about your indexes
• Document updates are transactions (a write to a single document is atomic)