Darwino DB
The Ultimate JSON Document Store
Introduction to Darwino DB
• NoSQL, JSON-based store, including semi-structured data and attachments
• Designed for social and collaborative applications
• Designed for mobile development
  – Local or remote access to the data, including replication
• Designed for cloud, with built-in multi-tenancy capability
• Allows distributed computing with replication between instances
• Databases can be deployed on the cloud
• Leverages the existing relational database infrastructure
  – Supports very large amounts of data
  – Scalability is provided by the underlying RDBMS
Darwino DB Main Features
• Transactional
• Built-in multi-tenancy
• Document hierarchy
• Multi-level security
  – Down to document-level security
• Tailored for continuous delivery
  – Static table schema to avoid table changes
• Multi-point, two-way replication, including external data sources
• Same implementation everywhere
• Advanced query support
  – Extensible MongoDB-like query language
  – Data aggregation
  – Joins
  – Full text search
• Integrated social features
  – Tagging, liking, rating
• Out-of-the-box REST services
  – With Java & JavaScript bindings
  – OpenAPI compliant (e.g. Swagger)
The Value of RDBMS
• RDBMS is the most widely known and portable kind of data store
  – RDBMS are available and validated by many organizations
  – RDBMS are available from many cloud vendors: IBM BlueMix, MS Azure…
• Many RDBMS capabilities can enhance the document store experience
  – Transactions, queries, joins, performance, scalability, reliability…
  – Standard access APIs, reporting…
• New RDBMS borrow concepts from the pure NoSQL implementations
  – Store and index JSON documents, spatial indexes, in-memory DB
• Think of a basic JSON document store as a table with 2 columns: id & json
  – But other tables can add many capabilities on top of this
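The "table with 2 columns" idea can be sketched in plain Java. This is a minimal in-memory illustration, not Darwino's actual table layout: the class name, the method names and the choice of an e-mail field as the extracted value are all assumptions made for the example. One map plays the role of the id/json table, and a second map plays the role of an additional table layered on top of it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model; it does not reflect Darwino's real schema.
class TwoColumnStoreSketch {
    // The "document table": id column -> json column.
    private final Map<String, String> documents = new LinkedHashMap<>();
    // An extra "table" on top: extracted field value -> document id.
    private final TreeMap<String, String> byEmail = new TreeMap<>();

    // Inserts a document and records its extracted field (no update handling here).
    void put(String id, String json, String email) {
        documents.put(id, json);
        byEmail.put(email, id);
    }

    // Lookup through the primary key of the two-column table.
    String getById(String id) {
        return documents.get(id);
    }

    // Lookup through the additional table, then back into the document table.
    String getByEmail(String email) {
        String id = byEmail.get(email);
        return id == null ? null : documents.get(id);
    }

    public static void main(String[] args) {
        TwoColumnStoreSketch store = new TwoColumnStoreSketch();
        store.put("u1", "{\"name\":\"Ann\",\"email\":\"ann@example.com\"}", "ann@example.com");
        System.out.println(store.getById("u1"));
        System.out.println(store.getByEmail("ann@example.com"));
    }
}
```

Both lookups return the same JSON text; the second one only works because of the extra structure, which is the point the slide makes about additional tables.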
Darwino DB Typical Topology
[Topology diagram. Labels: Server, Relational DB, Store Native API, REST API, Native Local Web Application, Native Local Application, Replication Engine, Replication via HTTP, Other Data Server, Mobile Clients, Web Browsers, Direct Access to the Remote Data.]
Darwino DB Concepts
• A Server is the physical connection to the RDBMS or REST server
• A Session is the access point for a user
• A Database is the main container for JSON documents
• Databases are made of Stores, which are buckets of consistent documents
  – The documents in the same store share indexes, query fields, …
  – Some stores are predefined: _default, _local, _comments & _design
• Documents are regular valid JSON values: objects, arrays & literals
  – It is strongly advised to store objects, as some features depend on object fields
• Binary Attachments can be associated with any JSON document
• Indexes are like materialized views, or the mapping in a map/reduce
Database Meta-Model
• A JSON store has several options that drive the behavior of each database
  – List of stores, extracted fields, indexes, full text search options
• Options are provided as a JSON object during the deployment step
  – The deployment step checks the version number and, if auto-deploy is activated, acts on the data or table meta-data
• The database uses 3 version numbers
  – Database application version, as described by the definition
  – Customization version for advanced
  – Darwino relational table format, handled by the Darwino runtime
Simple Deployment
• The JEE ApplicationContextListener auto-deploys the database
  – The application is not activated until the deployment is completed
• The Mobile framework also auto deploys the database
• If RDBMS DDL access is not enabled, then the DDL can be generated as a text file and sent to a DB admin
Documents
• Documents store JSON values with attachments
• Documents are identified by:
  – A UNID, which is a string of 512 characters max. The UNID can be set by the developer or generated as a GUID.
    • It is unique per store and is preserved across replication
  – An Id, which is an integer generated by Darwino
    • It is unique per database, and is different on replicas
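The difference between the two identifiers can be illustrated with a small self-contained sketch. The `Doc` and `Replica` classes below are hypothetical and are not Darwino classes; they only model the rule stated above, assuming a single store per replica for brevity: the UNID travels with the document, while the integer id is assigned locally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical model of the two identifiers; names are illustrative only.
class DocIdentitySketch {
    static final class Doc {
        final String unid; // travels with the document across replicas
        final int id;      // assigned locally by each database
        final String json;
        Doc(String unid, int id, String json) { this.unid = unid; this.id = id; this.json = json; }
    }

    // One database replica holding a single store.
    static final class Replica {
        private int nextId = 1;
        private final Map<String, Doc> byUnid = new HashMap<>();

        // Creates a document; a null UNID means "generate a GUID".
        Doc create(String unid, String json) {
            String key = (unid != null) ? unid : UUID.randomUUID().toString();
            if (key.length() > 512) throw new IllegalArgumentException("UNID longer than 512 characters");
            if (byUnid.containsKey(key)) throw new IllegalStateException("UNID already used in this store: " + key);
            Doc doc = new Doc(key, nextId++, json);
            byUnid.put(key, doc);
            return doc;
        }

        // Replicating a document in keeps its UNID but assigns a fresh local id.
        Doc receive(Doc remote) {
            return create(remote.unid, remote.json);
        }
    }

    public static void main(String[] args) {
        Replica server = new Replica();
        Replica phone = new Replica();
        server.create(null, "{\"title\":\"first\"}");
        Doc second = server.create("order-42", "{\"title\":\"second\"}");
        Doc copy = phone.receive(second);
        System.out.println(second.unid.equals(copy.unid)); // true
        System.out.println(second.id + " vs " + copy.id);  // 2 vs 1
    }
}
```

A practical consequence is that cross-document references meant to survive replication should hold the UNID rather than the integer id.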
Document CRUD Operations
1. Reading a document
2. Creating a new document
3. Updating an existing document
4. Deleting a document
5. Using transactions
Running Queries - Cursor
1. Cursor objects
2. Selecting entries (unid, keys, …)
3. Full Text Search
4. Query language (based on MongoDB)
5. Data extraction language
6. Categorization – Aggregation
7. Hierarchical documents
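To give a feel for what a MongoDB-style query looks like, here is a minimal sketch of a matcher over in-memory documents. It is not Darwino's query engine and does not use its API: the class and method names are hypothetical, and only field equality plus the MongoDB operators `$gt` and `$lt` on numbers are handled. The slides do not list which operators Darwino actually supports. The example needs Java 9 or later for `Map.of` and `List.of`.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical matcher for a tiny subset of MongoDB-style queries.
class MongoLikeQuerySketch {
    // Returns true when every condition in the query holds for the document.
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> doc, Map<String, Object> query) {
        for (Map.Entry<String, Object> cond : query.entrySet()) {
            Object actual = doc.get(cond.getKey());
            Object expected = cond.getValue();
            if (expected instanceof Map) {
                // Operator form: { field: { "$gt": n, "$lt": m } }, numeric values assumed.
                for (Map.Entry<String, Object> op : ((Map<String, Object>) expected).entrySet()) {
                    if (!(actual instanceof Number)) return false;
                    double a = ((Number) actual).doubleValue();
                    double b = ((Number) op.getValue()).doubleValue();
                    switch (op.getKey()) {
                        case "$gt": if (!(a > b)) return false; break;
                        case "$lt": if (!(a < b)) return false; break;
                        default: throw new IllegalArgumentException("Unsupported operator: " + op.getKey());
                    }
                }
            } else if (!expected.equals(actual)) {
                // Plain form: { field: value } means equality.
                return false;
            }
        }
        return true;
    }

    // Filters a list of documents, in the way a cursor would select entries.
    static List<Map<String, Object>> find(List<Map<String, Object>> docs, Map<String, Object> query) {
        return docs.stream().filter(d -> matches(d, query)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
            Map.of("name", "Ann", "age", 41),
            Map.of("name", "Bob", "age", 27));
        Map<String, Object> query = Map.of("age", Map.of("$gt", 30));
        System.out.println(find(docs, query)); // only Ann's document matches
    }
}
```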
Document Binary Attachments
• A set of binary attachments can be assigned to each JSON document
  – Can contain any kind of data: multi-media, XML,… or even JSON!
  – MIME type, as well as the attachment length, is stored in the database
• Attachments get replicated with the owning JSON document
  – Optimized, as only the changed attachments are replicated
• Attachments are stored in the RDBMS by default
  – But an API allows the attachments to be stored in a different place
• Duplicates can be stored once to minimize the storage need (think DAOS)
• Way to support content from external stores (Dropbox…)
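Storing duplicates once can be sketched as a content-addressed store. The slides do not say how Darwino detects duplicates, so keying the binary content by its SHA-256 digest is an assumption made for this example, and the class and method names are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical attachment store that keeps identical content only once.
class DedupAttachmentStoreSketch {
    // digest -> binary content, stored a single time
    private final Map<String, byte[]> blobsByDigest = new HashMap<>();
    // "unid/attachmentName" -> digest of the content it points to
    private final Map<String, String> references = new HashMap<>();

    void attach(String unid, String name, byte[] content) {
        String digest = sha256Hex(content);
        blobsByDigest.putIfAbsent(digest, content.clone()); // second copy is not stored
        references.put(unid + "/" + name, digest);
    }

    byte[] read(String unid, String name) {
        String digest = references.get(unid + "/" + name);
        return digest == null ? null : blobsByDigest.get(digest).clone();
    }

    int storedBlobCount() {
        return blobsByDigest.size();
    }

    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DedupAttachmentStoreSketch store = new DedupAttachmentStoreSketch();
        byte[] logo = {1, 2, 3, 4};
        store.attach("doc1", "logo.png", logo);
        store.attach("doc2", "logo.png", logo);
        System.out.println(store.storedBlobCount()); // 1
    }
}
```

This sketch never deletes a blob; a real implementation would also need reference counting or garbage collection once no document points at a digest any more.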
Security
• Core security is implemented at different levels
  – Server ACL
  – Database ACL
  – Document reader/editor fields (Darwino mode and IBM Domino compatibility)
• REST Services security
  – Access to the service
  – Data filtering
Multi-tenancy
• Multi-tenancy can be expressed at different levels
  – Database connection [RDBMS, Database, Schema]
    • Can be implemented through a ‘bean’ delegating to the right connection
  – JSON store instance
    • An instance id is a string identifying a tenant
    • An instance can provide user specific attribute role
      – UserContextFactory
Document Hierarchy
• Documents can be organized in a hierarchy (documents/children)
  – Set by assigning a parent to a document (might not exist!)
  – A parent must be in the same DB, but could be in a different store
• Operations on documents can also affect the children
  – Store.DELETE_CHILDREN – For deleting the whole hierarchy
  – Store.DELETE_NOTOUCH – When actions on children affect parent documents
• Beyond the parent/child relationship, Darwino also supports sync groups
  – Documents can be grouped to form a coherent set of docs
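The effect of deleting a whole hierarchy can be sketched with a few lines of Java. This is a hypothetical in-memory model, not the Darwino API: a boolean flag stands in for the Store.DELETE_CHILDREN option, documents are reduced to their UNID, and the sketch assumes the parent links contain no cycles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of parent/child links between documents.
class HierarchySketch {
    // unid -> parent unid (null for a root). The parent is not required to exist.
    private final Map<String, String> parentOf = new HashMap<>();

    void save(String unid, String parentUnid) {
        parentOf.put(unid, parentUnid);
    }

    boolean exists(String unid) {
        return parentOf.containsKey(unid);
    }

    // Direct children only; returns a fresh list so callers may delete while iterating.
    List<String> childrenOf(String unid) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : parentOf.entrySet()) {
            if (unid.equals(e.getValue())) result.add(e.getKey());
        }
        return result;
    }

    // With deleteChildren set, the whole subtree goes; otherwise children are left in place.
    void delete(String unid, boolean deleteChildren) {
        if (deleteChildren) {
            for (String child : childrenOf(unid)) delete(child, true);
        }
        parentOf.remove(unid);
    }

    public static void main(String[] args) {
        HierarchySketch store = new HierarchySketch();
        store.save("post", null);
        store.save("comment", "post");
        store.save("reply", "comment");
        store.save("orphan", "missing-parent"); // allowed: the parent might not exist
        store.delete("post", true);
        System.out.println(store.exists("reply"));  // false
        System.out.println(store.exists("orphan")); // true
    }
}
```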
Social Data
• Sharing
  – Documents can be shared on a user basis
  – Efficient share count computation, per document
• Commenting
  – Comments are provided as response documents, so they support tailored data models (JSON, attachments, …)
• Tagging
  – Every document can be tagged with a series of tags
  – Documents can be searched by tags
  – Efficient tag retrieval, including weight, for building a tag cloud
• Rating
  – Documents can be rated with an integer (0-n) on a user basis
  – Efficient average rating computation, per document
• All the social data are fully supported by the replication engine, with proper conflict resolution
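As an illustration of the rating and tagging data described above, here is a small in-memory sketch. It is not Darwino's implementation and says nothing about how Darwino computes these values efficiently; the class and method names are hypothetical. It assumes one rating per user per document and defines a tag's weight as the number of documents carrying it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of per-user ratings and document tags.
class SocialDataSketch {
    // document unid -> (user -> rating)
    private final Map<String, Map<String, Integer>> ratings = new HashMap<>();
    // document unid -> set of tags
    private final Map<String, Set<String>> tagsByDoc = new HashMap<>();

    // A user's new rating replaces that user's previous one.
    void rate(String unid, String user, int rating) {
        if (rating < 0) throw new IllegalArgumentException("rating must be >= 0");
        ratings.computeIfAbsent(unid, k -> new HashMap<>()).put(user, rating);
    }

    // Average over all users who rated the document; 0.0 when nobody has.
    double averageRating(String unid) {
        Map<String, Integer> byUser = ratings.getOrDefault(unid, Collections.emptyMap());
        return byUser.values().stream().mapToInt(Integer::intValue).average().orElse(0.0);
    }

    void tag(String unid, String tag) {
        tagsByDoc.computeIfAbsent(unid, k -> new HashSet<>()).add(tag);
    }

    // Tag -> number of documents carrying it, sorted by tag name, e.g. for a tag cloud.
    Map<String, Integer> tagWeights() {
        Map<String, Integer> weights = new TreeMap<>();
        for (Set<String> tags : tagsByDoc.values()) {
            for (String t : tags) weights.merge(t, 1, Integer::sum);
        }
        return weights;
    }

    public static void main(String[] args) {
        SocialDataSketch social = new SocialDataSketch();
        social.rate("doc1", "ann", 4);
        social.rate("doc1", "bob", 2);
        social.tag("doc1", "java");
        social.tag("doc2", "java");
        social.tag("doc2", "json");
        System.out.println(social.averageRating("doc1")); // 3.0
        System.out.println(social.tagWeights());          // {java=2, json=1}
    }
}
```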
Built-in REST Services
• Ready-to-use REST services are available for all the operations
  – Used by the HTTP session implementation
  – Only difference: as it is stateless, it does not support transactions over multiple requests
• The REST services can be called from any language
  – OpenAPI description (e.g. Swagger) is provided
  – Exposed through the Playground
• Custom REST services can be added to the existing ones
Business Logic
• Event handlers are triggered when an event happens
  – Document create, update and delete
  – Binary store access (reading and writing file attachments)
  – Security checks
  – Conflict handlers for synchronization
• Event handlers are implemented in the native platform languages
  – Java is the preferred language, with portability on J2EE servers, Android devices and iOS devices through J2OBJC
  – More portability can be achieved with JavaScript using the GWT compiler (web browsers, Node.js, Titanium Appcelerator…)
Data Replication
• Darwino uses a universal replication engine
  – Can replicate between Darwino DBs and Enterprise systems

[Diagram: Darwino DB and Enterprise Data, each attached through a connector to the Universal JSON Replication Engine. Labels: selective/functional replication, 2-way data transformation.]
Running Replication
• Setting up a database to support replication
• Predefined replication settings
  – Web application, Mobile application
  – Highly customizable BackgroundReplicationTask
    • Schedule, databases, instances…
• Using the replication APIs
  – Core API
  – Using a scheduled task
RDBC Abstraction
• Similarly to JDBC, RDBC encapsulates the access to the RDBMS
• Way simpler than JDBC, and tailored for Darwino DB
  – Connection, Statement, ResultSet
• Supports any JDBC database, and non-JDBC ones (e.g. SQLite)
• Contains drivers and methods to abstract the database differences
  – Data types, SQL generation…
• Better transaction API
  – Can be nested
  – Explicit commit: an abort() cancels the whole transaction

    c.startTransaction();
    try {
        exec_code(c);
        …
        c.commitTransaction();
    } finally {
        c.endTransaction();
    }
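The start/commit/end pattern above can be modeled with a small sketch showing one way nested transactions with explicit commit could behave. This is not the RDBC implementation: the class is hypothetical, no database is involved, and the rule that leaving a level without committing rolls back the whole transaction is an assumption inferred from the try/finally pattern.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of nested transactions with explicit commit.
class NestedTransactionSketch {
    // One "was this level committed?" flag per open nesting level.
    private final Deque<Boolean> levels = new ArrayDeque<>();
    private boolean aborted = false;
    private String lastOutcome = "NONE";

    void startTransaction() {
        levels.push(Boolean.FALSE);
    }

    // Marks the current level as successful; nothing is final until the outermost end.
    void commitTransaction() {
        levels.pop();
        levels.push(Boolean.TRUE);
    }

    // Cancels the whole transaction, whatever the nesting depth.
    void abort() {
        aborted = true;
    }

    // Closes the current level; the outermost close decides COMMIT or ROLLBACK.
    void endTransaction() {
        boolean committed = levels.pop();
        if (!committed) aborted = true; // assumption: an uncommitted level poisons everything
        if (levels.isEmpty()) {
            lastOutcome = aborted ? "ROLLBACK" : "COMMIT";
            aborted = false;
        }
    }

    String lastOutcome() {
        return lastOutcome;
    }

    public static void main(String[] args) {
        NestedTransactionSketch c = new NestedTransactionSketch();
        c.startTransaction();
        try {
            c.startTransaction();
            try {
                // inner work fails here, so commitTransaction() is never reached
            } finally {
                c.endTransaction();
            }
            c.commitTransaction();
        } finally {
            c.endTransaction();
        }
        System.out.println(c.lastOutcome()); // ROLLBACK
    }
}
```

Under these assumptions, the `finally` block is what makes the pattern safe: an exception thrown before `commitTransaction()` still reaches `endTransaction()`, which then rolls back.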
A Dive Into the Relational Tables
• Darwino creates a set of tables per JSON Database
• All the data is accessible using standard SQL
• WARN: The Darwino security layer does not apply when directly accessing the RDBMS through SQL. But replication can!
Thank you for your attention!