Page 1: Query Languages for Document Stores

© 2013 triAGENS GmbH | 2013-06-18

Query Languages for Document Stores

2013-06-18

Jan Steemann

Page 2: Query Languages for Document Stores

me

I'm a software developer working at triAGENS GmbH, CGN, on ArangoDB, a document store

Page 3: Query Languages for Document Stores

Documents

Page 4: Query Languages for Document Stores

Documents

documents are self-contained, aggregate data structures...

...consisting of named and typed attributes, which can be nested / hierarchical

documents can be used to model complex business objects

Page 5: Query Languages for Document Stores

Example order document

{
  "id": "abc-100-22",
  "date": "2013-04-26",
  "customer": {
    "id": "c-199-023",
    "name": "acme corp."
  },
  "items": [
    {
      "id": "p-123",
      "quantity": 1,
      "price": 25.13
    }
  ]
}

Page 6: Query Languages for Document Stores

Document stores

document stores are databases specialised in handling documents

they've been around for a while

they got really popular with the NoSQL buzz (CouchDB, MongoDB, ...)

Page 7: Query Languages for Document Stores

Why use Document Stores?

Page 8: Query Languages for Document Stores

Saving programming language data

document stores allow saving a programming language object as a whole

your programming language object becomes a document in the database, without the need for much transformation

compare this to saving data in a relational database...
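
As a rough illustration of saving an object as a whole, here is a minimal sketch in plain JavaScript. The names `store`, `saveDocument` and `loadDocument` are hypothetical, and an in-memory `Map` stands in for a real document store; the order object is the example document from the earlier slide.

```javascript
// Hypothetical in-memory stand-in for a document store: id -> JSON text.
const store = new Map();

// Save the object as a whole; the only "mapping" step is JSON serialisation.
function saveDocument(doc) {
  store.set(doc.id, JSON.stringify(doc));
}

// Fetch a document by its id, or null if there is none.
function loadDocument(id) {
  const json = store.get(id);
  return json === undefined ? null : JSON.parse(json);
}

const order = {
  id: "abc-100-22",
  date: "2013-04-26",
  customer: { id: "c-199-023", name: "acme corp." },
  items: [{ id: "p-123", quantity: 1, price: 25.13 }],
};

saveDocument(order);
const loaded = loadDocument("abc-100-22");
console.log(loaded.customer.name, loaded.items.length);
```

The nested customer and the items array survive the round trip without being split into separate rows.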

Page 9: Query Languages for Document Stores

Persistence the relational way

orders

id  date        customer
1   2013-04-20  c1
2   2013-04-21  c2
3   2013-04-21  c1
4   2013-04-22  c3

customers

id  name
c1  acme corp.
c2  sample.com
c3  abc co.

orderitems

order  item  price  quantity
1      1     23.25  1
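
To make the cost of the relational layout concrete, the sketch below rebuilds one order object from the three row sets shown above. It is plain JavaScript over in-memory arrays, not real database access, and `assembleOrder` is a hypothetical helper name.

```javascript
// Rows copied from the three tables on the slide.
const orders = [
  { id: 1, date: "2013-04-20", customer: "c1" },
  { id: 2, date: "2013-04-21", customer: "c2" },
  { id: 3, date: "2013-04-21", customer: "c1" },
  { id: 4, date: "2013-04-22", customer: "c3" },
];
const customers = [
  { id: "c1", name: "acme corp." },
  { id: "c2", name: "sample.com" },
  { id: "c3", name: "abc co." },
];
const orderItems = [{ order: 1, item: 1, price: 23.25, quantity: 1 }];

// Reassemble one order object: one lookup per table, then manual nesting.
function assembleOrder(orderId) {
  const order = orders.find((row) => row.id === orderId);
  if (!order) return null;
  const customer = customers.find((row) => row.id === order.customer) || null;
  const items = orderItems
    .filter((row) => row.order === orderId)
    .map((row) => ({ id: row.item, price: row.price, quantity: row.quantity }));
  return { id: order.id, date: order.date, customer, items };
}

console.log(JSON.stringify(assembleOrder(1), null, 2));
```

Every read of a whole order has to repeat this reassembly, and every write has to take the object apart again.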

Page 10: Query Languages for Document Stores

Benefits of document stores

no impedance mismatch, no complex object-relational mapping, no normalisation requirements

querying documents is often easier and faster than querying highly normalised relational data

Page 11: Query Languages for Document Stores

Schema-less

in document stores, there is no "table"-schema as in the relational world

each document can have different attributes

there is no such thing as ALTER TABLE

that's why document stores are called schema-less or schema-free
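
A small sketch of what schema-free means in practice, using a hypothetical in-memory `users` collection in plain JavaScript. The documents and the `attributeNames` helper are made up for illustration.

```javascript
// Three documents in one hypothetical "users" collection, each with its own shape.
const users = [
  { id: 1, name: "alice", age: 41 },
  { id: 2, name: "bob", active: true },
  { id: 3, name: "carol", address: { city: "Cologne" } },
];

// Collect the union of top-level attribute names across all documents.
function attributeNames(docs) {
  const names = new Set();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) names.add(key);
  }
  return [...names].sort();
}

// Queries must tolerate missing attributes: only documents that have "age" can match.
const atLeast40 = users.filter((user) => user.age !== undefined && user.age >= 40);

console.log(attributeNames(users));
console.log(atLeast40.map((user) => user.name));
```

No document had to be migrated to add `active` or `address`, but every query now has to decide what a missing attribute means.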

Page 12: Query Languages for Document Stores

Querying Document Stores

Page 13: Query Languages for Document Stores

Querying by document id is easy

every document store allows querying a single document at a time

accessing documents by their unique ids is almost always dead-simple

Page 14: Query Languages for Document Stores

Complex queries?

what if you want to run complex queries (e.g. projections, filters, aggregations, transformations, joins, ...)?

let's check the available options in some of the popular document stores

Page 15: Query Languages for Document Stores

CouchDB: map-reduce

querying by anything other than the document key / id requires writing a view

views are JavaScript functions that are stored inside the database

views are populated by incremental map-reduce

Page 16: Query Languages for Document Stores

map-reduce

the map function is applied on each document (that changed)

map can filter out non-matching documents or emit modified or unmodified versions of them

emitted documents can optionally be passed into a reduce function

reduce is called with groups of similar documents and can thus perform aggregation
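
The steps above can be imitated in a few lines of plain JavaScript. This is only a sketch of the idea, not CouchDB's engine: `runView` is a hypothetical helper, `emit` is passed to the map function as a second argument instead of being a global, and the `keys` argument given to reduce is simplified to the bare keys.

```javascript
// Minimal in-memory imitation of a map-reduce view (not CouchDB's real engine).
function runView(docs, map, reduce) {
  const groups = new Map(); // serialised key -> { key, values }
  for (const doc of docs) {
    map(doc, (key, value) => {
      const id = JSON.stringify(key);
      if (!groups.has(id)) groups.set(id, { key, values: [] });
      groups.get(id).values.push(value);
    });
  }
  const rows = [];
  for (const { key, values } of groups.values()) {
    // Reduce two halves separately, then combine the partial results (rereduce).
    const half = Math.ceil(values.length / 2);
    const partials = [values.slice(0, half), values.slice(half)]
      .filter((chunk) => chunk.length > 0)
      .map((chunk) => reduce(chunk.map(() => key), chunk, false));
    rows.push({ key, value: reduce(null, partials, true) });
  }
  return rows;
}

const sum = (values) => values.reduce((a, b) => a + b, 0);

// map: emit one row per ordered item; documents without items emit nothing.
const map = (doc, emit) => {
  for (const item of doc.orderItems || []) emit(item.id, 1);
};

// reduce: count rows per key; on rereduce, add up the partial counts.
const reduce = (keys, values, rereduce) => (rereduce ? sum(values) : values.length);

const docs = [
  { orderItems: [{ id: "p-123" }, { id: "p-456" }] },
  { orderItems: [{ id: "p-123" }] },
  { note: "no items here" },
];

const rows = runView(docs, map, reduce);
console.log(rows);
```

The split into partial results is there to show why a reduce function needs a rereduce branch: counting rows and adding up counts are different operations.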

Page 17: Query Languages for Document Stores

CouchDB map-reduce example

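// map: called once per (changed) document, emits one row per order item
map = function (doc) {
  var i, n = doc.orderItems.length;
  for (i = 0; i < n; ++i) {
    emit(doc.orderItems[i], 1);
  }
};

// reduce: counts the emitted rows per key; on rereduce, adds up partial counts
reduce = function (keys, values, rereduce) {
  if (rereduce) {
    return sum(values);
  }
  return values.length;
};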

Page 18: Query Languages for Document Stores

map-reduce

map-reduce is generic and powerful

provides a programming language

need to create views for everything that is queried

access to a single "table" at a time (no cross-"table" views)

a bit clumsy for ad-hoc exploratory queries

Page 19: Query Languages for Document Stores

MongoDB: find()

ad-hoc queries in MongoDB are much easier

can directly apply filters on collections, which makes it easy to find specific documents:

mongo> db.orders.find({
  "customer": {
    "id": "c1",
    "name": "acme corp."
  }
});

Page 20: Query Languages for Document Stores

MongoDB: complex filters

can filter on any document attribute or sub-attribute

indexes will automatically be used if present

nesting filters allows complex queries

quite flexible and powerful, but tends to be hard to use and read for more complex queries

Page 21: Query Languages for Document Stores

MongoDB: complex filtering

mongo> db.users.find({
  "$or": [
    { "active": true },
    { "age": { "$gte": 40 } }
  ]
});
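
To show how such a nested filter document is evaluated, here is a toy matcher in plain JavaScript. It is a sketch under narrow assumptions, not MongoDB's implementation: the hypothetical `matches` function only understands top-level fields, scalar equality, `$gte` and `$or`, and the sample users are made up.

```javascript
// Toy evaluator for a small subset of filter documents (scalar equality, $gte, $or).
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    if (field === "$or") {
      // $or holds a list of sub-filters; one matching sub-filter is enough.
      return condition.some((sub) => matches(doc, sub));
    }
    const value = doc[field];
    if (condition !== null && typeof condition === "object" && "$gte" in condition) {
      return value !== undefined && value >= condition.$gte;
    }
    return value === condition;
  });
}

const users = [
  { name: "a", active: true, age: 20 },
  { name: "b", active: false, age: 45 },
  { name: "c", active: false, age: 30 },
];

// Same shape as the filter on the slide: active users, or users aged 40 and up.
const filter = { $or: [{ active: true }, { age: { $gte: 40 } }] };

const found = users.filter((user) => matches(user, filter));
console.log(found.map((user) => user.name));
```

Even this small subset needs recursion, which hints at why deeply nested filter documents get hard to read.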

Page 22: Query Languages for Document Stores

MongoDB: more options

can also use JavaScript functions for filtering, or JavaScript map-reduce

several aggregation functions are also provided

neither option allows running cross-"table" queries

Page 23: Query Languages for Document Stores

Why not use a Query Language?

Page 24: Query Languages for Document Stores

Query languages

a good query language should

allow writing both simple and complex queries, without having to switch the methodology

provide the required features for filtering, aggregation, joining etc.

hide the database internals

Page 25: Query Languages for Document Stores

SQL

in the relational world, there is one accepted general-purpose query language: SQL

it is quite well-known and mature:

35+ years of experience

many developers and established tools around it

standardised (but mind the "dialects"!)

Page 26: Query Languages for Document Stores

SQL in document stores?

SQL is good at handling relational data

not good at handling multi-valued or hierarchical attributes, which are common in documents

(too) powerful: SQL provides features many document stores intentionally lack (e.g. joins, transactions)

SQL has not been adopted by document stores yet

Page 27: Query Languages for Document Stores

Query Languages for Document Stores

Page 28: Query Languages for Document Stores

XQuery?

XQuery is a query and programming language targeted mainly at processing XML data

can process hierarchical data

very powerful and extensible

W3C recommendation

Page 29: Query Languages for Document Stores

XQuery

XQuery has found most adoption in the area of XML processing

today people want to use JSON, not XML

XQuery not available in popular document stores

Page 30: Query Languages for Document Stores

ArangoDB Query Language (AQL)

ArangoDB provides AQL, a query language made for JSON document processing

it allows running complex queries on documents, including joins and aggregation

language syntax was inspired by XQuery and provides similar concepts such as FOR, LET, RETURN, ...

the language integrates JSON "naturally"

Page 31: Query Languages for Document Stores

AQL example

FOR order IN orders
  FILTER order.status == "processed"
  LET itemsValue = SUM((
    FOR item IN order.items
      FILTER item.status == "confirmed"
      RETURN item.price * item.quantity
  ))
  FILTER itemsValue >= 500
  RETURN {
    "items"      : order.items,
    "itemsValue" : itemsValue,
    "itemsCount" : LENGTH(order.items)
  }
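
For readers who have not seen AQL before, the query above computes roughly what the following plain JavaScript does over an in-memory array. This is only a reading aid: the function name `processedOrdersWorthAtLeast` and the sample orders are hypothetical, and it says nothing about how ArangoDB executes the query.

```javascript
// Plain-JavaScript reading of the query: filter, compute a per-order value, filter again, project.
function processedOrdersWorthAtLeast(orders, minValue) {
  const result = [];
  for (const order of orders) {
    if (order.status !== "processed") continue; // FILTER order.status == "processed"
    const itemsValue = order.items // LET itemsValue = SUM(...)
      .filter((item) => item.status === "confirmed")
      .reduce((total, item) => total + item.price * item.quantity, 0);
    if (itemsValue < minValue) continue; // FILTER itemsValue >= 500
    result.push({ items: order.items, itemsValue, itemsCount: order.items.length });
  }
  return result;
}

const orders = [
  {
    status: "processed",
    items: [
      { status: "confirmed", price: 300, quantity: 1 },
      { status: "confirmed", price: 100, quantity: 2 },
      { status: "cancelled", price: 50, quantity: 1 },
    ],
  },
  { status: "processed", items: [{ status: "confirmed", price: 100, quantity: 4 }] },
  { status: "open", items: [{ status: "confirmed", price: 1000, quantity: 1 }] },
];

const bigOrders = processedOrdersWorthAtLeast(orders, 500);
console.log(bigOrders.map((o) => [o.itemsValue, o.itemsCount]));
```

Note that `itemsCount` counts all items of the order, while `itemsValue` only adds up the confirmed ones, exactly as in the query.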

Page 32: Query Languages for Document Stores

AQL: some features

queries can combine data from multiple "tables"

this allows joins using any document attributes or sub-attributes

indexes will be used if present

Page 33: Query Languages for Document Stores

AQL: join example

FOR user IN users
  FILTER user.id == 1234
  RETURN {
    "user"  : user,
    "posts" : (
      FOR post IN blogPosts
        FILTER post.userId == user.id &&
               post.date >= '2013-06-13'
        RETURN post
    )
  }
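
Read as plain JavaScript, the join above nests the matching posts into each selected user. The sketch below works on in-memory arrays with made-up data; `userWithRecentPosts` is a hypothetical name, and dates are compared as ISO strings, which sort correctly as text.

```javascript
// For each matching user, attach the posts written by that user since a given date.
function userWithRecentPosts(users, blogPosts, userId, since) {
  return users
    .filter((user) => user.id === userId)
    .map((user) => ({
      user,
      posts: blogPosts.filter((post) => post.userId === user.id && post.date >= since),
    }));
}

const users = [
  { id: 1234, name: "jan" },
  { id: 5678, name: "other" },
];
const blogPosts = [
  { id: "a", userId: 1234, date: "2013-06-01" },
  { id: "b", userId: 1234, date: "2013-06-13" },
  { id: "c", userId: 1234, date: "2013-06-18" },
  { id: "d", userId: 5678, date: "2013-06-15" },
];

const joined = userWithRecentPosts(users, blogPosts, 1234, "2013-06-13");
console.log(JSON.stringify(joined, null, 2));
```

The result is a single document per user with the posts embedded, rather than one flat row per user/post pair as a SQL join would return.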

Page 34: Query Languages for Document Stores

AQL: additional features

AQL provides basic functionality to query graphs, too

the language can be extended with user-defined JavaScript functions

Page 35: Query Languages for Document Stores

JSONiq

JSONiq is a data processing and query language for handling JSON data

it is based on XQuery, thus provides the same FLWOR expressions: FOR, LET, WHERE, ORDER, ...

JSON is integrated "naturally"

most of the XML handling is removed

Page 36: Query Languages for Document Stores

JSONiq: example

for $order in collection("orders")
  where $order.customer.id eq "abc-123"
  return {
    customer : $order.customer,
    items    : $order.items
  }

Page 37: Query Languages for Document Stores

JSONiq: join example

for $post in collection("posts")
  let $postId := $post.id
  for $comment in collection("comments")
    where $comment.postId eq $postId
    group by $postId
    order by count($comment) descending
    return {
      id       : $postId,
      comments : count($comment)
    }
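
As a reading aid, the grouping join above can be approximated in plain JavaScript as follows. This is a sketch over in-memory arrays with made-up data, and `commentCounts` is a hypothetical name. As with an inner join, posts without any comments do not show up in the result.

```javascript
// Count comments per post id and return the counts in descending order.
function commentCounts(posts, comments) {
  const counts = new Map(); // post id -> number of comments
  for (const post of posts) {
    for (const comment of comments) {
      if (comment.postId !== post.id) continue;
      counts.set(post.id, (counts.get(post.id) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([id, count]) => ({ id, comments: count }))
    .sort((a, b) => b.comments - a.comments);
}

const posts = [{ id: 1 }, { id: 2 }, { id: 3 }];
const comments = [
  { postId: 1, text: "first" },
  { postId: 2, text: "nice" },
  { postId: 2, text: "agreed" },
  { postId: 2, text: "thanks" },
];

const ranking = commentCounts(posts, comments);
console.log(ranking);
```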

Page 38: Query Languages for Document Stores

JSONiq

JSONiq is a generic, database-agnostic language

it can be extended with user-defined XQuery functions

JSONiq is currently not implemented inside any document database...

Page 39: Query Languages for Document Stores

JSONiq

...but it can be used via a service (at 28.io)

the service provides the JSONiq query language and implements functionality not provided by a specific database

such features are implemented client-side, e.g. joins for MongoDB

Page 40: Query Languages for Document Stores

Summary

Page 41: Query Languages for Document Stores

Summary

today's document stores provide different, proprietary mechanisms for querying data

there is currently no standard query mechanism for document stores as there is in the relational world (SQL)

Page 42: Query Languages for Document Stores

Summary

you CAN use query languages in document stores today, e.g. AQL and JSONiq

if you like the idea, give them a try, provide feedback and contribute!