Schema Design
Software Engineer, MongoDB
Craig Wilson
#MongoDBDays
@craiggwilson
All application development is Schema Design
Success comes from a Proper Data Structure
Terminology
RDBMS ➜ MongoDB
Database ➜ Database
Table ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedding & Linking
Working with Documents
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O'Reilly Media",
    founded: "1980",
    location: "CA"
  }
}
What is a Document?
Traditional Schema Design
Focus on Data Storage
Document Schema Design
Focus on Data Usage
Traditional Schema Design
What answers do I have?
Document Schema Design
What questions do I have?
Schema Design By Example
Library Management Application
• Patrons/Users
• Books
• Authors
• Publishers
Question: What is a Patron’s Address?
> patron = db.patrons.findOne({ _id: "joe" })
{ _id: "joe", name: "Joe Bookreader" }
> address = db.addresses.findOne({ _id: "joe" })
{
  _id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}
A Patron and their Address
> patron = db.patrons.findOne({ _id: "joe" })
{
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}
A Patron and their Address
One-to-One Relationships
• “Belongs to” relationships are often embedded.
• Holistic representation of entities with their embedded attributes and relationships.
• Optimized for read performance
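As a rough illustration of the embedding described above, the sketch below folds a separate address document into its patron. Plain JavaScript arrays stand in for the two collections, and the helper name `embedAddress` is an assumption made for this example, not part of MongoDB.

```javascript
// Hypothetical in-memory stand-ins for the two collections on the slides.
const patrons = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { _id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
];

// embedAddress is a hypothetical helper: it copies the address fields
// (minus the shared _id) into a new patron document.
function embedAddress(patron, addressList) {
  const match = addressList.find((a) => a._id === patron._id);
  if (!match) return { ...patron }; // no address on file, nothing to embed
  const { _id, ...address } = match;
  return { ...patron, address };
}

const joe = embedAddress(patrons[0], addresses);
console.log(JSON.stringify(joe, null, 2));
```

After embedding, a single lookup of the patron returns the address as well, which is where the read-performance benefit comes from.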
Question: What are a Patron’s Addresses?
> patron = db.patrons.findOne({ _id: "bob" })
{
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", … },
    { street: "52 Main St.", city: "Boston", … }
  ]
}
A Patron and their Addresses
> patron = db.patrons.findOne({ _id: "bob" })
{
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", … },
    { street: "52 Main St.", city: "Boston", … }
  ]
}
> patron = db.patrons.findOne({ _id: "joe" })
{
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", … }
}
A Patron and their Addresses
Migration Possibilities
• Migrate all documents when the schema changes.
• Migrate On-Demand
  – As we pull up a patron’s document, we make the change.
  – Any patrons that never come into the library never get updated.
• Leave it alone
  – As long as the application knows about both types…
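The last two options can be sketched as a small normalizing step applied whenever a patron document is read. This is only an illustration in plain JavaScript; `normalizePatron` is a hypothetical helper name, and the documents are trimmed versions of the ones on the earlier slides.

```javascript
// normalizePatron is a hypothetical helper, not a MongoDB API.
// It accepts either patron shape and always returns the newer one.
function normalizePatron(patron) {
  if (Array.isArray(patron.addresses)) {
    return patron; // already in the new shape, nothing to do
  }
  const { address, ...rest } = patron;
  return { ...rest, addresses: address ? [address] : [] };
}

// Old shape: a single embedded address.
const oldJoe = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon" },
};

// New shape: an array of addresses.
const newBob = {
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [{ street: "1 Vernon St.", city: "Newton" }],
};

const migratedJoe = normalizePatron(oldJoe);
console.log(migratedJoe.addresses.length); // 1
console.log("address" in migratedJoe); // false
```

With on-demand migration the application would write the normalized document back to the collection; with "leave it alone" it would only normalize in memory and keep both shapes on disk.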
Question: Who is the publisher of this book?
Book
MongoDB: The Definitive Guide,
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English
Publisher: O’Reilly Media, CA
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O'Reilly Media",
    founded: "1980",
    location: "CA"
  }
}
Book with embedded Publisher
Book with embedded Publisher
• Optimized for read performance of Books
• Other queries become difficult
Question: Who are all the publishers in the system?
> publishers = db.publishers.find()
{
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: "1980",
  location: "CA"
}
{
  _id: "penguin",
  name: "Penguin",
  founded: "1983",
  location: "CA"
}
All Publishers
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  publisher_id: "oreilly",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}
> db.publishers.findOne({ _id: book.publisher_id })
{
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: "1980",
  location: "CA"
}
Book with linked Publisher
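The two-step lookup above can be mimicked in plain JavaScript to make the cost of linking visible: one read for the book, then a second read for the publisher it points to. The arrays and the `findById` helper are assumptions for this sketch and do not talk to a real database.

```javascript
// In-memory stand-ins for the publishers and books collections.
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", founded: "1980", location: "CA" },
];
const books = [
  { _id: "123", publisher_id: "oreilly", title: "MongoDB: The Definitive Guide" },
];

// findById is a hypothetical stand-in for a single-document lookup by _id.
function findById(collection, id) {
  return collection.find((doc) => doc._id === id) || null;
}

// First read: the book. Second read: the publisher it links to.
const book = findById(books, "123");
const publisher = book ? findById(publishers, book.publisher_id) : null;

console.log(publisher ? publisher.name : "no publisher found");
```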
Question: What are all the books a publisher has published?
> publisher = db.publishers.findOne({ _id: "oreilly" })
{
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: "1980",
  location: "CA",
  books: [ "123", … ]
}
> books = db.books.find({ _id: { $in: publisher.books } })
Publisher with linked Books
Question: Who are the authors of a given book?
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  authors: [ "kchodorow", "mdirolf" ]
}
> authors = db.authors.find({ _id: { $in: book.authors } })
{ _id: "kchodorow", name: "Kristina Chodorow", hometown: … }
{ _id: "mdirolf", name: "Mike Dirolf", hometown: … }
Books with linked Authors
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  authors: [
    { id: "kchodorow", name: "Kristina Chodorow" },
    { id: "mdirolf", name: "Mike Dirolf" }
  ]
}
Books with linked Authors
Question: What are all the books an author has written?
> author = db.authors.findOne({ _id: "kchodorow" })
{
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [
    { id: "123", title: "MongoDB: The Definitive Guide" }
  ]
}
Authors with linked Books
> author = db.authors.findOne({ _id: "kchodorow" })
{
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [
    { id: "123", title: "MongoDB: The Definitive Guide" }
  ]
}
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { id: "kchodorow", name: "Kristina Chodorow" },
    { id: "mdirolf", name: "Mike Dirolf" }
  ]
}
Links on both Authors and Books
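When the relationship is stored on both sides, every change has to touch two documents. The sketch below shows one way an application might keep them in step; `linkBookToAuthor` is a hypothetical helper operating on plain objects, not a MongoDB feature.

```javascript
// linkBookToAuthor is a hypothetical helper; it updates both documents
// so the two copies of the relationship stay consistent.
function linkBookToAuthor(book, author) {
  if (!book.authors.some((a) => a.id === author._id)) {
    book.authors.push({ id: author._id, name: author.name });
  }
  if (!author.books.some((b) => b.id === book._id)) {
    author.books.push({ id: book._id, title: book.title });
  }
}

const author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [],
};
const book = { _id: "123", title: "MongoDB: The Definitive Guide", authors: [] };

linkBookToAuthor(book, author);
linkBookToAuthor(book, author); // calling twice does not duplicate entries

console.log(JSON.stringify(author.books));
console.log(JSON.stringify(book.authors));
```

Against a real database these would be two separate writes, so the application has to decide what happens if only one of them succeeds.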
Linking vs. Embedding
• Embedding
  – Great for read performance
  – Writes can be slow
  – Data integrity needs to be managed
• Linking
  – Flexible
  – Data integrity is built-in
  – Work is done during reads
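The trade-off on writes can be made concrete by counting how many documents change when a publisher is renamed. This is a plain JavaScript sketch with assumed data: the second book ("456") and both helper functions are made up for the illustration.

```javascript
// Embedded design: each book carries its own copy of the publisher.
const embeddedBooks = [
  { _id: "123", title: "MongoDB: The Definitive Guide", publisher: { name: "O'Reilly Media" } },
  { _id: "456", title: "A second, made-up title", publisher: { name: "O'Reilly Media" } },
];

// Linked design: books hold only the publisher's _id.
const linkedPublishers = [{ _id: "oreilly", name: "O'Reilly Media" }];
const linkedBooks = [
  { _id: "123", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: "456", title: "A second, made-up title", publisher_id: "oreilly" },
];

// Both helpers are hypothetical and return how many documents they modified.
function renameEmbedded(books, oldName, newName) {
  let modified = 0;
  for (const b of books) {
    if (b.publisher.name === oldName) {
      b.publisher.name = newName;
      modified += 1;
    }
  }
  return modified;
}

function renameLinked(publishers, id, newName) {
  let modified = 0;
  for (const p of publishers) {
    if (p._id === id) {
      p.name = newName;
      modified += 1;
    }
  }
  return modified;
}

const embeddedWrites = renameEmbedded(embeddedBooks, "O'Reilly Media", "O'Reilly");
const linkedWrites = renameLinked(linkedPublishers, "oreilly", "O'Reilly");

console.log(embeddedWrites); // 2: every book that embeds the publisher
console.log(linkedWrites); // 1: only the publisher document
```

The embedded count grows with the number of books, and any copy that is missed leaves the data inconsistent, which is the integrity burden the slide refers to.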
Question: What are all the books about databases?
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  category: "MongoDB"
}
> category = db.categories.findOne({ _id: "MongoDB" })
{
  _id: "MongoDB",
  parent: "Databases"
}
Categories as Documents
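With categories stored as documents, answering the question means walking the parent links. The following plain JavaScript sketch shows that walk; only the "MongoDB" category comes from the slide, while the "Databases" and "Programming" entries and the `isUnder` helper are assumptions added so the tree has a root.

```javascript
// Category documents; only "MongoDB" is from the slide, the rest are assumed.
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Programming", parent: null },
];
const books = [
  { _id: "123", title: "MongoDB: The Definitive Guide", category: "MongoDB" },
];

// isUnder is a hypothetical helper: it follows parent links upward from
// categoryId and reports whether ancestorId is reached along the way.
function isUnder(categoryId, ancestorId) {
  let current = categoryId;
  const seen = new Set(); // guards against accidental cycles
  while (current && !seen.has(current)) {
    if (current === ancestorId) return true;
    seen.add(current);
    const doc = categories.find((c) => c._id === current);
    current = doc ? doc.parent : null;
  }
  return false;
}

const databaseBooks = books.filter((b) => isUnder(b.category, "Databases"));
console.log(databaseBooks.map((b) => b.title));
```

Each step up the tree would be another query against a real collection, so deep hierarchies make this design expensive to read.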
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  categories: [ "MongoDB", "Databases", "Programming" ]
}
> db.books.find({ categories: "Databases" })
Categories as an Array
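An equality match on an array field matches a document when any element of the array equals the value. The sketch below imitates that behaviour in plain JavaScript; the second book and the `booksInCategory` helper are made up for the example.

```javascript
const books = [
  {
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    categories: ["MongoDB", "Databases", "Programming"],
  },
  { _id: "456", title: "A made-up cookbook", categories: ["Cooking"] },
];

// booksInCategory is a hypothetical helper that keeps every book whose
// categories array contains the requested value.
function booksInCategory(allBooks, category) {
  return allBooks.filter((b) => b.categories.includes(category));
}

const matches = booksInCategory(books, "Databases").map((b) => b._id);
console.log(matches);
```

The price of this design is that every ancestor category has to be written into each book.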
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  category: "Programming/Databases/MongoDB"
}
> db.books.find({ category: /^Programming\/Databases\// })
Categories as a Path
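Storing the category as a path turns the question into a prefix match. Here is a small sketch of that match in plain JavaScript, using an anchored regular expression; the second book is invented for contrast.

```javascript
const books = [
  {
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    category: "Programming/Databases/MongoDB",
  },
  { _id: "456", title: "A made-up title", category: "Programming/Languages/Go" },
];

// Anchored at the start, so only paths beginning with this prefix match.
const underDatabases = /^Programming\/Databases\//;

const matches = books
  .filter((b) => underDatabases.test(b.category))
  .map((b) => b._id);

console.log(matches);
```

Because the pattern ends with a slash, a book filed directly under "Programming/Databases" with no subcategory would not match; dropping the trailing slash from the pattern would include it.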
Conclusion
• Schema design is different in MongoDB
• Basic data design principles stay the same
• Focus on how an application accesses/manipulates data
• Evolve the schema to meet requirements as they change