03 .NET Saturday: Anton Samarskyy, "Document-Oriented Databases for the .NET Platform"
Document-Oriented Databases for the .NET platform
Anton Samarskyy
Agenda
• Challenges of relational databases
• NoSQL: not only SQL
• Document store concept
• Document-oriented databases
• RavenDB
• RavenDB demo
• MapReduce (optional)
Relational database properties
• ACID: Atomic, Consistent, Isolated, Durable
• Relational: based on relational algebra and Codd's work
• Table / row based
• Rich querying capabilities
• Foreign keys
• Schema
What do our apps need?
• Need to scale horizontally
• Partitioning and replication
• Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP)
• Web 2.0
• Performance, performance, performance
• Flexibility
• Big, even huge, datasets
http://www.graph-database.org
Not only SQL philosophy
• Non-relational, distributed, cloud-ready
• Open source
• Horizontally scalable: easy replication support
• Schema-free
• Simple API
• BASE (not ACID): Basically Available, Soft state, Eventual consistency
• Huge amounts of data
NoSQL Pros
+ Cheap, easy to implement
+ Removes impedance mismatch between objects and tables
+ Quickly process large amounts of data
+ Data modeling flexibility
+ Command Query Responsibility Segregation (CQRS), Event Sourcing
NoSQL Cons
- New technologies
- Data is generally duplicated, potential for inconsistency
- No standard language or format for queries
- Depends on application layer to enforce data integrity
- Reporting
NoSQL types
Common
• Wide Column Store / Column Families
• Key Value / Tuple Store
• Document Store
• Graph Databases
• Object Databases
Other
• Grid & Cloud Database Solutions
• XML Databases
• Multivalue Databases
• File Databases
CAP
• Consistency: every client has the same view of the data
• Availability: all clients can always read and write
• Partition tolerance: the system works well across network partitions
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
You can pick only two!
Who is using NoSQL?
Document-oriented databases are
• A collection of independent documents: XML, JSON, YAML
• Non-relational, i.e. they do not store data in tables with uniformly sized fields for each record
• Not limited in the number of fields or their length
• Usually accessible via a RESTful HTTP/JSON API
• Horizontally scalable
• Can be distributed
• Fault-tolerant
Why a document store?
• Schema-free
• User-generated content
• Storing full, complex object graphs
• Low overhead: you usually operate on a single document
  - One read, one write
• Fast
• A known format means the database can do interesting things with it…
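To make the "one read, one write" point concrete, here is a minimal sketch in Python. The in-memory `store`, the `put`/`get` helpers, and the sample documents are all hypothetical and exist only for illustration; they are not the API of any database named in this talk.

```python
import json

# Hypothetical in-memory "document store": document id -> serialized JSON.
store = {}

def put(doc_id, document):
    """One write: the whole object graph is serialized as a single document."""
    store[doc_id] = json.dumps(document)

def get(doc_id):
    """One read: the whole object graph comes back from a single lookup."""
    return json.loads(store[doc_id])

# Two documents with different shapes live side by side (schema-free).
put("orders/1", {
    "customer": {"name": "Alice", "city": "Kyiv"},
    "lines": [
        {"product": "keyboard", "quantity": 1, "price": 40.0},
        {"product": "mouse", "quantity": 2, "price": 15.0},
    ],
})
put("comments/1", {"author": "Bob", "text": "Nice talk!", "tags": ["nosql"]})

# The order, its customer and its lines arrive together, with no joins.
order = get("orders/1")
total = sum(line["quantity"] * line["price"] for line in order["lines"])
print(total)  # 70.0
```

In a relational model the same order would typically be spread over several tables and reassembled with joins; here the nested structure is stored and loaded as one unit.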
Indexing
• Order in a schema-free world
• Materialized views
• Built in the background
• Allow stale reads
• Don't slow down CRUD operations
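The interplay of background indexing and stale reads can be sketched as follows. This is a toy model in Python under stated assumptions: the `Store` class, its `by_city` index, and the explicit `run_indexer` call (standing in for a real background task) are hypothetical, and updates and deletes are ignored for brevity.

```python
from collections import deque

class Store:
    """Toy document store whose index is maintained off the write path."""

    def __init__(self):
        self.docs = {}          # doc_id -> document
        self.pending = deque()  # ids written but not yet indexed
        self.by_city = {}       # index: city -> set of doc ids

    def put(self, doc_id, doc):
        # A write only stores the document and queues indexing work,
        # so it never waits for the index to be updated.
        self.docs[doc_id] = doc
        self.pending.append(doc_id)

    def run_indexer(self):
        # Stand-in for a background task: drain the queue into the index.
        while self.pending:
            doc_id = self.pending.popleft()
            city = self.docs[doc_id]["city"]
            self.by_city.setdefault(city, set()).add(doc_id)

    def query(self, city):
        # Results may lag behind writes; the flag tells the caller so.
        is_stale = bool(self.pending)
        return sorted(self.by_city.get(city, ())), is_stale

store = Store()
store.put("users/1", {"name": "Ann", "city": "Kyiv"})
before = store.query("Kyiv")  # queried before the indexer has run
store.run_indexer()
after = store.query("Kyiv")   # queried after the indexer has caught up
print(before, after)
```

The first query returns an empty, stale result and the second returns the document with the stale flag cleared, which is the trade-off the slide describes: fast writes in exchange for reads that may briefly lag.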
Index concept
{
  "name": "ayende",
  "twitter": "@ayende",
  "projects": ["rhino mocks", "nhibernate", "raven db"]
}
from doc in docs
from prj in doc.projects
select new { Project = prj, Name = doc.Name }
http://ayende.com/blog/4459/that-no-sql-thing-document-databases
GET /indexes/ProjectAndName?query=Project:raven
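What this index does can be approximated in a few lines of Python. The sketch below is only an illustration: `map_projects` mirrors the index definition above, while `query_project` and its whitespace-based word matching are assumptions standing in for whatever text analysis the real index applies, which the slide does not describe.

```python
docs = [
    {"name": "ayende", "twitter": "@ayende",
     "projects": ["rhino mocks", "nhibernate", "raven db"]},
]

def map_projects(documents):
    """Mirror of the index definition: one entry per (document, project)."""
    for doc in documents:
        for prj in doc["projects"]:
            yield {"Project": prj, "Name": doc["name"]}

def query_project(entries, term):
    """Return entries whose Project field contains `term` as a whole word.

    Assumption: the field is split into lowercase words on whitespace,
    a rough stand-in for a full-text analyzer.
    """
    term = term.lower()
    return [e for e in entries if term in e["Project"].lower().split()]

entries = list(map_projects(docs))
print(query_project(entries, "raven"))
# [{'Project': 'raven db', 'Name': 'ayende'}]
```

One document fans out into three index entries, and the query runs against those entries rather than against the stored documents.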
Document DB family
• CouchDB: Apache project created by Damien Katz;
• RavenDB: project by Oren Eini and Hibernating Rhinos;
• MongoDB: 10gen project;
• SimpleDB: Amazon project, used as a web service in concert with Amazon Elastic Compute Cloud.
Comparison
• CouchDB: Erlang, REST API, JavaScript map-reduce querying (concurrent), accessed via .NET helpers;
• MongoDB: C++, dynamic queries (non-concurrent MapReduce), custom TCP/IP access, .NET drivers: 10gen, NoRM (LINQ);
• RavenDB: .NET, REST API, LINQ queries mapped to Lucene.NET, plus MapReduce;
• SimpleDB: Erlang, name/value store, basic queries, not RESTful, accessed via .NET helpers.
RavenDB
• Built on existing infrastructure (ESENT) that is known to scale to amazing sizes
• Can be transactional, i.e. ACID: supports System.Transactions and can take part in distributed transactions
• Indexes are defined via LINQ queries; implements IQueryable, which maps to Lucene
• Supports map/reduce operations
RavenDB
• Comes with a fully functional .NET client API with Unit of Work and change tracking
• REST-based, so you can access it directly via the JavaScript API
• Supports optimistic concurrency
• Can be extended with MEF
• Has trigger support
• Supports sharding and replication
http://ravendb.net
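Optimistic concurrency, mentioned in the list above, can be illustrated with a version check on save. This is a generic sketch in Python, not RavenDB's client API: `VersionedStore`, `ConcurrencyError`, and the integer version counter are hypothetical names chosen for the example.

```python
class ConcurrencyError(Exception):
    """Raised when a document changed since the caller last read it."""

class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, document)

    def load(self, doc_id):
        return self._docs[doc_id]

    def save(self, doc_id, document, expected_version):
        # No locks are held between load and save; the version is
        # compared only at write time (version 0 means "not stored yet").
        current_version = self._docs.get(doc_id, (0, None))[0]
        if current_version != expected_version:
            raise ConcurrencyError(
                f"{doc_id}: expected v{expected_version}, found v{current_version}")
        self._docs[doc_id] = (current_version + 1, document)
        return current_version + 1

store = VersionedStore()
store.save("users/1", {"name": "Ann"}, expected_version=0)

# Two clients read the same version of the document...
version_a, _ = store.load("users/1")
version_b, _ = store.load("users/1")

# ...the first save wins, and the second is rejected instead of
# silently overwriting the first client's change.
store.save("users/1", {"name": "Ann A."}, expected_version=version_a)
try:
    store.save("users/1", {"name": "Ann B."}, expected_version=version_b)
    conflict = False
except ConcurrencyError:
    conflict = True
print(conflict)  # True
```

The rejected client would normally reload the document, reapply its change, and retry.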
Raven Extensibility
• MEF (Managed Extensibility Framework)
• Triggers
  - PUT trigger
  - DELETE trigger
  - Read trigger
  - Index update triggers
• Request Responders
• Custom Serialization/Deserialization
Demo: RavenDB
• Setup, Server
• RavenDB Client API
• Denormalization, modeling documents
• CRUD
• Attachments
• Indexes
• MapReduce indexes
• Sharding
MapReduce
• MapReduce is a programming model and an associated implementation for processing and generating large data sets
• The Map function processes a key/value pair to generate a set of intermediate key/value pairs
• The Reduce function merges all intermediate values associated with the same intermediate key
Map
Sort
Reduce
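The three phases (Map, Sort, Reduce) can be sketched as a single-process word count in Python. The function names and sample inputs are illustrative assumptions; a real MapReduce system would distribute each phase across many machines.

```python
from itertools import groupby
from operator import itemgetter

def map_fn(doc_id, text):
    """Map: turn one input key/value pair into intermediate (word, 1) pairs."""
    for word in text.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)

def map_reduce(inputs):
    # Map phase: apply map_fn to every input pair.
    intermediate = [pair
                    for doc_id, text in inputs.items()
                    for pair in map_fn(doc_id, text)]
    # Sort phase: bring pairs with the same key next to each other.
    intermediate.sort(key=itemgetter(0))
    # Reduce phase: call reduce_fn once per distinct key.
    results = {}
    for key, group in groupby(intermediate, key=itemgetter(0)):
        word, total = reduce_fn(key, [value for _, value in group])
        results[word] = total
    return results

inputs = {
    "doc1": "raven db is a document db",
    "doc2": "document db",
}
counts = map_reduce(inputs)
print(counts["db"], counts["document"], counts["raven"])  # 3 2 1
```

The sort step matters because `groupby` only groups adjacent items; it plays the role of the shuffle that routes equal keys to the same reducer.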
Sharding
• Sharding refers to horizontal partitioning of data across multiple machines
• The idea is to split the load across many commodity machines, instead of buying huge expensive servers
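One simple way to decide which machine holds a document is to hash its id. The sketch below, in Python, assumes three in-memory dictionaries standing in for machines and a hypothetical `shard_for` helper; the slides do not say which sharding strategy RavenDB uses, so this is only one possible scheme.

```python
import hashlib

def shard_for(doc_id, shard_count):
    """Pick a shard from a stable hash of the document id."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Three dictionaries stand in for three commodity machines.
shards = [dict() for _ in range(3)]

def put(doc_id, document):
    shards[shard_for(doc_id, len(shards))][doc_id] = document

def get(doc_id):
    # The same hash routes the read to the shard that took the write.
    return shards[shard_for(doc_id, len(shards))].get(doc_id)

for i in range(100):
    put(f"users/{i}", {"number": i})

print([len(shard) for shard in shards])  # documents held by each shard
print(get("users/42"))                   # {'number': 42}
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash`, whose string hashing varies between processes. Note that plain modulo placement remaps most keys when the number of shards changes, which is why production systems often prefer range-based or consistent-hashing schemes.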
Thanks!
Questions or comments?