Scalable XQuery Processing with Zorba on top of MongoDB

Scalable XQuery Processing: Zorba Meets MongoDB. William Candillon, 28msec

Upload: william-candillon

Post on 12-May-2015


DESCRIPTION

Over the past few years, the NoSQL movement has produced a variety of open-source document stores. Most of them focus on high availability and horizontal scalability, and are designed to run on commodity hardware. These products have gained considerable traction in the industry for storing large amounts of flexible data (mostly JSON). In the meantime, XQuery has evolved into a standardized, full-fledged programming language for XML with native support for complex queries, indexes, updates, full-text search, and scripting. Moreover, JSON has recently been added to the language as a first-class data type. As of today, we consider XQuery the most robust and productive technology for processing flexible data. The aim of this talk is to showcase the benefits of integrating the Zorba XQuery Processor with MongoDB. We will introduce the 28msec platform, which seamlessly stores, indexes, and manages flexible data entirely in XQuery. The data itself is stored in MongoDB. The platform leverages MongoDB’s indexes, sharding, and consistency guarantees to scale out horizontally. The talk will conclude by showing a benchmark of the platform and discussing perspectives of the outlined approach.

TRANSCRIPT

Page 1: Scalable XQuery Processing with Zorba on top of MongoDB

Scalable XQuery Processing

Zorba Meets MongoDB

William Candillon, 28msec

Page 2: Scalable XQuery Processing with Zorba on top of MongoDB

Two Drivers

Page 3: Scalable XQuery Processing with Zorba on top of MongoDB

Flexible Data

Page 4: Scalable XQuery Processing with Zorba on top of MongoDB

Scalability

Page 5: Scalable XQuery Processing with Zorba on top of MongoDB

Feature                       MongoDB, CouchBase   BaseX, eXist-db

Standardized Query Language   X                    ✔

Modern Query Processing       X                    ✔

Typing                        X                    ✔

High Availability             ✔                    X

Sharding                      ✔                    X

Available as a Service        ✔                    X

Flexible Data

Scalability

Page 6: Scalable XQuery Processing with Zorba on top of MongoDB

What can XML contribute to JSON Datastores?

Page 7: Scalable XQuery Processing with Zorba on top of MongoDB

A Standardized, Rock Solid Query Language

Page 8: Scalable XQuery Processing with Zorba on top of MongoDB

JSONiq - The SQL of NoSQL

Page 9: Scalable XQuery Processing with Zorba on top of MongoDB

JSONiq

• Open Specification: jsoniq.org

• Extension of the mature XQuery for JSON
  - Joins, Group-by, Filters, Search...

• Leverage the complete XQuery Family
  - Scripting, Updates, Full-Text

• Standardized Query Language
  - Run the same code across multiple JSON stores

Page 10: Scalable XQuery Processing with Zorba on top of MongoDB

JSONiq - MongoDB Connector

http://28.io/mongodb

Page 11: Scalable XQuery Processing with Zorba on top of MongoDB

What can JSON datastores contribute to XML?

Page 12: Scalable XQuery Processing with Zorba on top of MongoDB

A Distributed and Scalable Store

Page 13: Scalable XQuery Processing with Zorba on top of MongoDB

The Goal

[Chart: Scalability & Performance (vertical axis) versus Depth of functionality (horizontal axis), positioning memcached, key/value, MongoDB, RDBMS, and XML DB]

Page 14: Scalable XQuery Processing with Zorba on top of MongoDB

The Goal

[Chart: Scalability & Performance (vertical axis) versus Depth of functionality (horizontal axis), positioning memcached, key/value, MongoDB, RDBMS, XML DB, and 28msec]

28msec - XQuery on top of MongoDB

Page 15: Scalable XQuery Processing with Zorba on top of MongoDB

Meet Zorba

• Open Source XQuery Processor
  - Apache 2 License
  - Contributors: Oracle, 28msec, FLWOR Foundation

• The Complete Family
  - XQuery 3.0, Updates, Full-Text, Scripting, JSONiq
  - XQuery Data Definition Facility

• Pluggable Store API
  - Run Zorba on your own persistence layer

Page 16: Scalable XQuery Processing with Zorba on top of MongoDB

Zorba Architecture

Page 17: Scalable XQuery Processing with Zorba on top of MongoDB

Meet MongoDB

• Open Source JSON Document Store
  - License: AGPL 3.0

• Focus on scalability
  - Replication across multiple availability zones
  - Sharding
  - Atomic updates on documents

• Available as a service
  - MongoHQ, MongoLab

Page 18: Scalable XQuery Processing with Zorba on top of MongoDB

MongoDB Deployment Example

[Diagram: two App Servers, each with a MongoS router; three Config Servers (C1, C2, C3), each running MongoD; three shards (Shard1, Shard2, Shard3) made of MongoD processes, labeled "Replica set"]
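
The routing role that MongoS plays in this deployment can be illustrated with a minimal sketch. It assumes hash-based placement on a shard key; MongoDB also supports range-based sharding, and its real chunk-based balancing is more involved. The shard names come from the diagram, while `route` and `distribute` are hypothetical helpers and not MongoDB APIs.

```python
import hashlib

# Shard names taken from the deployment diagram.
SHARDS = ["Shard1", "Shard2", "Shard3"]


def route(shard_key: str, shards=SHARDS) -> str:
    """Pick a shard by hashing the shard key (illustrative only)."""
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


def distribute(keys, shards=SHARDS):
    """Group keys by the shard that route() assigns them to."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[route(key, shards)].append(key)
    return placement
```

Because the placement depends only on the key, any router instance sends the same document to the same shard without coordinating with the other routers.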

Page 19: Scalable XQuery Processing with Zorba on top of MongoDB

The Goal

Zorba: Runtime, Collections, XDM, Indexes

MongoDB: MongoS, Collections, BSON, Indexes

Page 20: Scalable XQuery Processing with Zorba on top of MongoDB

The Goal

• Seamless XQuery Integration into MongoDB

Zorba: Runtime, Collections, XDM, Indexes

MongoDB: MongoS, Collections, BSON, Indexes

Page 21: Scalable XQuery Processing with Zorba on top of MongoDB

Application Example

Page 22: Scalable XQuery Processing with Zorba on top of MongoDB

Application Example

• Fetching sports news from XMLTeam.com

• Stored and indexed on MongoDB

• 1 million documents and counting

• Entirely built in XQuery from backend to frontend

• 1k LOC, 1 developer, 1 week of work

Page 23: Scalable XQuery Processing with Zorba on top of MongoDB

Collection Declarations

declare collection sports:docs as document-node();

Page 24: Scalable XQuery Processing with Zorba on top of MongoDB

Collection Declarations

[Diagram: Zorba (Compiler, Runtime, Store API) on top of MongoDB]

1. Compile Query: declare collection ...

2. Store API call: createCollection(QName)

3. Create Collection in MongoDB
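
The createCollection(QName) step can be pictured with a small sketch of a pluggable store. This is an assumption-laden model, not Zorba's Store API: `QName`, `ToyStore`, and the `namespace#local` naming scheme are hypothetical, and a plain dict stands in for MongoDB.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QName:
    """A qualified name: namespace URI plus local name."""
    namespace: str
    local: str


@dataclass
class ToyStore:
    """Stand-in for a persistence layer behind a store API (hypothetical)."""
    collections: dict = field(default_factory=dict)

    def create_collection(self, qname: QName) -> str:
        # Hypothetical naming scheme: flatten the QName into one string.
        name = f"{qname.namespace}#{qname.local}"
        if name in self.collections:
            raise ValueError(f"collection already exists: {name}")
        self.collections[name] = []  # an empty list plays the collection
        return name
```

The point of the sketch is the division of labor: the query processor only hands over a QName, and the store decides how that name maps onto the backend.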

Page 25: Scalable XQuery Processing with Zorba on top of MongoDB

Index Declarations

declare %an:value-range index sports:by-datetime
  on nodes db:collection(xs:QName('sports:docs'))
  by ./sports-content/sports-metadata/@date-time;

Page 26: Scalable XQuery Processing with Zorba on top of MongoDB

Index Declarations

[Diagram: Zorba (Compiler, Runtime, Store API) on top of MongoDB]

1. Compile Query: declare index ...

2. Store API call: createIndex(qname, ordpath, keys)

3. Create Index in MongoDB
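
A value-range index such as sports:by-datetime supports range lookups over its keys. The sketch below shows one common way to get that behavior, a sorted list probed with binary search. It is only an illustration: `ValueRangeIndex` and its methods are hypothetical names, and neither Zorba nor MongoDB is claimed to implement indexes this way.

```python
import bisect


class ValueRangeIndex:
    """Keeps (key, doc_id) pairs sorted by key so range probes start with a binary search."""

    def __init__(self):
        self._entries = []  # sorted list of (key, doc_id) tuples

    def add(self, key, doc_id):
        # insort keeps the list ordered on every insertion.
        bisect.insort(self._entries, (key, doc_id))

    def probe_range(self, low, high):
        """Return doc ids whose key k satisfies low <= k <= high, in key order."""
        # The 1-tuple (low,) sorts before any (low, doc_id), so this finds
        # the first entry whose key is >= low.
        start = bisect.bisect_left(self._entries, (low,))
        result = []
        for key, doc_id in self._entries[start:]:
            if key > high:
                break
            result.append(doc_id)
        return result
```

With ISO 8601 date-time strings as keys, string order matches chronological order, so a probe between two timestamps returns the documents published in that window.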

Page 27: Scalable XQuery Processing with Zorba on top of MongoDB

Insert Nodes

let $uri := 'http://xmlteam.com/...'
let $doc := http:get($uri)
return db:insert-nodes($sports:docs, $doc)

Page 28: Scalable XQuery Processing with Zorba on top of MongoDB

Insert Nodes

[Diagram: Zorba (Compiler, Runtime, Store API) on top of MongoDB]

1. Process Query: db:insert-nodes(...)

2. Store API call: insertNode(qname, xdm)

3. Insert BSON into MongoDB

Page 29: Scalable XQuery Processing with Zorba on top of MongoDB

MongoDB Store Layer

• Direct XQuery to MongoDB mapping
  - Collections
  - Indexes

• Converts XDM to BSON

• Inherits MongoDB consistency model
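
The slides do not describe the actual XDM to BSON encoding, so the following is only a sketch of what such a conversion could look like. It turns an XML element into a nested dict that a BSON encoder could store. The field names (`name`, `attributes`, `text`, `children`) and both functions are assumptions, a Python list stands in for a MongoDB collection, and text that follows a child element (mixed content) is ignored for brevity.

```python
import xml.etree.ElementTree as ET


def element_to_document(elem):
    """Encode an XML element as a nested dict (hypothetical layout)."""
    doc = {"name": elem.tag}
    if elem.attrib:
        doc["attributes"] = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        doc["text"] = text
    # Recurse into child elements; tail text is not preserved in this sketch.
    children = [element_to_document(child) for child in elem]
    if children:
        doc["children"] = children
    return doc


def insert_node(collection, xml_text):
    """Parse XML and append its document form to a list standing in for a collection."""
    doc = element_to_document(ET.fromstring(xml_text))
    collection.append(doc)
    return doc
```

Keeping attributes under their own key makes a path such as sports-metadata/@date-time addressable as a nested field, which is the kind of property a backend index would need.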

Page 30: Scalable XQuery Processing with Zorba on top of MongoDB

Request Processing on 28msec

[Diagram: request flow in nine numbered steps across HTTP Client, ELB, Sausalito, Request Handler, Zorba (Processor, Store), and MongoDB (Data, Compiled Code), within Availability Zone 1]

Page 31: Scalable XQuery Processing with Zorba on top of MongoDB

Scaling Out

[Chart: Avg Response Time in ms (vertical axis, 0 to 1000) versus Number of concurrent requests (horizontal axis, 10 to 150), comparing 2 App Servers with 4 App Servers]

Page 32: Scalable XQuery Processing with Zorba on top of MongoDB

XQuery on Top of MongoDB

• Seamless Integration of XQuery with MongoDB
  - XDM to BSON
  - Collections and indexes mapping
  - Atomicity per document

• 28msec
  - XQuery Platform on top of MongoDB
  - Deploy your XQuery apps in 1-click
  - Scale up & down automatically

Page 33: Scalable XQuery Processing with Zorba on top of MongoDB

Take Away

• Two Drivers
  - Flexible Data
  - Scalability

• Two Champions
  - XQuery for Flexible Data
  - JSON Stores for Scalability

• Two Contributions
  - JSONiq: The SQL of NoSQL
  - XQuery Platform on top of MongoDB

Page 34: Scalable XQuery Processing with Zorba on top of MongoDB

Thank You! Questions?