Benefits of Using MongoDB Over RDBMS (An Evening with MongoDB Minneapolis, 3/5/15)
TRANSCRIPT
An Evening with MongoDB Minneapolis
March 5, 2015
#MongoDB
Agenda
• Quick MongoDB Overview
• Benefits Using MongoDB Over RDBMSs
• MongoDB 3.0 Update
• MongoDB Community Update
• More Q&A with MongoDB Experts
Benefits Using MongoDB Over RDBMSs
Matt Kalan
Sr. Solution Architect, MongoDB Inc.
@matthewkalan
Agenda
• Quick MongoDB Overview
• Benefits Using MongoDB Over RDBMSs
• What's New in v3.0
Why MongoDB
The World Has Changed
• Data: Volume, Velocity, Variety
• Time: Iterative, Agile, Short Cycles
• Risk: Always On, Scale, Global
• Cost: Open-Source, Cloud, Commodity
Nexus Architecture: Relational + NoSQL
From Relational:
• Expressive Query Language
• Strong Consistency
• Secondary Indexes
From NoSQL:
• Flexibility
• Scalability
• Performance
Future of Operational Databases
• 1990: RDBMS (operational database)
• 2000: RDBMS (operational database) + OLAP/DW (data warehousing)
• 2014: RDBMS, Document DB (NoSQL), and Key-Value/Column Store (NoSQL) for operational databases; OLAP/DW and Hadoop for data warehousing
Match the Data in Your Application for Better Performance & Agility

MongoDB:
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    { number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    { number : "1-212-777-1213",
      type : "cell"
    }
  ]
}

Relational:
Customer ID | First Name | Last Name | City
----------- | ---------- | --------- | -------------
0           | John       | Doe       | New York
1           | Mark       | Smith     | San Francisco
2           | Jay        | Black     | Newark
3           | Meagan     | White     | London
4           | Edward     | Daniels   | Boston

Phone Number   | Type | DNC    | Customer ID
-------------- | ---- | ------ | -----------
1-212-555-1212 | home | T      | 0
1-212-555-1213 | home | T      | 0
1-212-555-1214 | cell | F      | 0
1-212-777-1212 | home | T      | 1
1-212-777-1213 | cell | (null) | 1
1-212-888-1212 | home | F      | 2
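The contrast above can be sketched in plain Python, with no database involved (the rows, the document, and the customer_with_phones helper are hypothetical stand-ins): the relational shape needs an application-side join to reassemble a customer, while the document shape is fetched whole.

```python
# Relational shape: two flat "tables" (lists of rows), linked by customer_id.
customers = [
    {"customer_id": 1, "first_name": "Mark", "last_name": "Smith",
     "city": "San Francisco"},
]
phones = [
    {"number": "1-212-777-1212", "type": "home", "dnc": True, "customer_id": 1},
    {"number": "1-212-777-1213", "type": "cell", "dnc": None, "customer_id": 1},
]

def customer_with_phones(cid):
    """Reassemble a customer from the two tables (an application-side join)."""
    cust = next(c for c in customers if c["customer_id"] == cid)
    result = dict(cust)
    result["phones"] = [p for p in phones if p["customer_id"] == cid]
    return result

# Document shape: the nested structure is stored as-is; no join needed.
document = {
    "customer_id": 1,
    "first_name": "Mark",
    "last_name": "Smith",
    "city": "San Francisco",
    "phones": [
        {"number": "1-212-777-1212", "dnc": True, "type": "home"},
        {"number": "1-212-777-1213", "type": "cell"},
    ],
}

print(len(customer_with_phones(1)["phones"]))  # 2 phones reassembled via join
print(document["phones"][0]["type"])           # home -- direct access, no join
```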
MongoDB Technical Capabilities

Application → Driver → Mongos → Shard 1 … Shard N
(each shard is a replica set: Primary, Secondary, Secondary)

db.customer.insert({…})
db.customer.find({ name: "John Smith" })

1. Dynamic document schema
   { name: "John Smith",
     date: "2013-08-01",
     address: "10 3rd St.",
     phone: {
       home: 1234567890,
       mobile: 1234568138 }
   }
2. Native language drivers
3. High availability: replica sets
4. High performance: data locality, indexes, RAM
5. Horizontal scalability: sharding
Comparing Development in SQL to MongoDB

What Are Developers Doing All Day?
• Adding and testing business features
OR
• Integrating with other components, tools, and systems
  – Database(s)
  – ETL and other data transfer operations
  – Messaging
  – Services (web & other)
  – Other open-source frameworks, incl. ORMs
Why Can't We Just Save and Fetch Data?
Because the way we think about data at the business use case level…
…is different than the way it is implemented at the application/code level…
…which traditionally is VERY different than the way it is implemented at the database level
This Problem Isn't New…
…but for the past 40 years, innovation at the business & application layers has outpaced innovation at the database layer.

Business Data Goals
  1974: Capture my company's transactions daily at 5:30 PM EST, add them up on a nightly basis, and print a big stack of paper
  2014: Capture my company's global transactions in real-time, plus everything that is happening in the world (customers, competitors, business/regulatory/weather), producing any number of computed results, and passing this all in real-time to predictive analytics with model feedback; results in real-time to 10,000s of mobile devices, multiple GUIs, and B2B and B2C channels

Release Schedule
  1974: Semi-annually
  2014: Yesterday

Application/Code
  1974: COBOL, Fortran, Algol, PL/1, assembler, proprietary tools
  2014: C, C++, VB, C#, Java, JavaScript, Groovy, Ruby, Perl, Python, Obj-C, Smalltalk, Clojure, ActionScript, Flex, DSLs, Spring, AOP, CORBA, ORM, third-party software ecosystem, the whole open-source movement, … and COBOL and Fortran

Database
  1974: I/VSAM, early RDBMS
  2014: Mature RDBMS, legacy I/VSAM, column & key/value stores, and… MongoDB
Exactly How Does MongoDB Change Things?
• MongoDB is designed from the ground up to address rich structure (maps of maps of lists of…), not just tables
• Standard RDBMS interfaces (e.g., JDBC) do not exploit features of contemporary languages
• Object-oriented languages and scripting in Java, C#, JavaScript, Python, Node.js, etc. are impedance-matched to MongoDB
• In MongoDB, the data is the schema
• Shapes of data go in the same way they come out
Rectangles are 1974. Maps and Lists are 2014
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    { type : "work",
      number : "1-800-555-1212"
    },
    { type : "home",
      number : "1-800-555-1313",
      DNC : true
    },
    { type : "home",
      number : "1-800-555-1414",
      DNC : true
    }
  ]
}
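The "shapes go in the same way they come out" idea can be sketched without a server: below, a plain Python dict stands in for a MongoDB collection (the in-memory store and its save/fetch helpers are hypothetical illustrations, not the driver API), and the rich nested shape round-trips unchanged.

```python
import copy

# Hypothetical in-memory stand-in for a collection keyed by "id".
_store = {}

def save(doc):
    # Store a deep copy so later mutations of the caller's map don't leak in.
    _store[doc["id"]] = copy.deepcopy(doc)

def fetch(id):
    # Return a deep copy of the stored shape, or None if absent.
    return copy.deepcopy(_store.get(id))

contact = {
    "id": "K1",
    "name": "matt",
    "phones": [
        {"type": "work", "number": "1-800-555-1212"},
        {"type": "home", "number": "1-800-555-1313", "DNC": True},
    ],
}
save(contact)
assert fetch("K1") == contact  # the rich shape comes out exactly as it went in
```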
An Actual Code Example (Finally!)
Let's compare and contrast RDBMS/SQL to MongoDB development using Java over the course of a few weeks.
Some ground rules:
1. Observe rules of Software Engineering 101: assume separation of application, data access layer (DAL), and persistor implementation
2. DAL must be able to
   a. Expose simple, functional, data-only interfaces to the application
      • No ORM, frameworks, compile-time bindings, special tools
   b. Exploit high-performance features of the persistor
3. Focus on core data-handling code and avoid distractions that require the same amount of work in both technologies
   a. No exception or error handling
   b. Leave out DB connection and other setup resources
4. Day counts are a proxy for progress, not actual time to complete the indicated task
The Task: Saving and Fetching Contact Data

Start with this simple, flat shape in the Data Access Layer:

Map m = new HashMap();
m.put("name", "matt");
m.put("id", "K1");

And assume we save it in this way:
save(Map m)

And assume we fetch one by primary key in this way:
Map m = fetch(String id)

Brace yourself…
Day 1: Initial efforts for both technologies

SQL
DDL: create table contact ( … )

init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name ) values ( ?,? )");
  fetchStmt = connection.prepareStatement
    ("select id, name from contact where id = ?");
}

save(Map m)
{
  contactInsertStmt.setString(1, (String) m.get("id"));
  contactInsertStmt.setString(2, (String) m.get("name"));
  contactInsertStmt.execute();
}

Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  if(rs.next()) {
    m = new HashMap();
    m.put("id", rs.getString(1));
    m.put("name", rs.getString(2));
  }
  return m;
}

MongoDB
DDL: none

save(Map m)
{
  collection.insert(new BasicDBObject(m));
}

Map fetch(String id)
{
  Map m = null;
  DBObject dbo = new BasicDBObject();
  dbo.put("id", id);
  c = collection.find(dbo);
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}
Day 2: Add simple fields

m.put("name", "matt");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));

• Capturing title and hireDate is part of adding a new business feature
• It was pretty easy to add two fields to the structure
• …but now we have to change our persistence code

Brace yourself (again)…
SQL Day 2 (changes in bold)

DDL: alter table contact add title varchar(8);
     alter table contact add hireDate date;

init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
  fetchStmt = connection.prepareStatement
    ("select id, name, title, hiredate from contact where id = ?");
}

save(Map m)
{
  contactInsertStmt.setString(1, (String) m.get("id"));
  contactInsertStmt.setString(2, (String) m.get("name"));
  contactInsertStmt.setString(3, (String) m.get("title"));
  contactInsertStmt.setDate(4, (Date) m.get("hireDate"));
  contactInsertStmt.execute();
}

Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  if(rs.next()) {
    m = new HashMap();
    m.put("id", rs.getString(1));
    m.put("name", rs.getString(2));
    m.put("title", rs.getString(3));
    m.put("hireDate", rs.getDate(4));
  }
  return m;
}

Consequences:
1. Code release schedule linked to database upgrade (new code cannot run on old schema)
2. Issues with case sensitivity starting to creep in (many RDBMSs are case-insensitive for column names, but code is case-sensitive)
3. Changes require careful mods in 4 places
4. Beginning of technical debt
MongoDB Day 2

save(Map m)
{
  collection.insert(new BasicDBObject(m));
}

Map fetch(String id)
{
  Map m = null;
  DBObject dbo = new BasicDBObject();
  dbo.put("id", id);
  c = collection.find(dbo);
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}

✔ NO CHANGE

Advantages:
1. Zero time and money spent on overhead code
2. Code and database not physically linked
3. New material with more fields can be added into existing collections; backfill is optional
4. Names of fields in database precisely match key names in code layer and match directly on name, not indirectly via positional offset
5. No technical debt is created
Day 3: Add list of phone numbers

m.put("name", "matt");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));

List list = new ArrayList();
Map n1 = new HashMap();
n1.put("type", "work");
n1.put("number", "1-800-555-1212");
list.add(n1);
Map n2 = new HashMap();
n2.put("type", "home");
n2.put("number", "1-866-444-3131");
list.add(n2);
m.put("phones", list);

• It was still pretty easy to add this data to the structure
• …but meanwhile, in the persistence code…

REALLY brace yourself…
SQL Day 3 changes, Option 1: Assume just 1 work and 1 home phone number

DDL: alter table contact add work_phone varchar(16);
     alter table contact add home_phone varchar(16);

init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate, work_phone, home_phone ) values ( ?,?,?,?,?,? )");
  fetchStmt = connection.prepareStatement
    ("select id, name, title, hiredate, work_phone, home_phone from contact where id = ?");
}

save(Map m)
{
  contactInsertStmt.setString(1, (String) m.get("id"));
  contactInsertStmt.setString(2, (String) m.get("name"));
  contactInsertStmt.setString(3, (String) m.get("title"));
  contactInsertStmt.setDate(4, (Date) m.get("hireDate"));
  for(Map onePhone : (List<Map>) m.get("phones")) {
    String t = (String) onePhone.get("type");
    String n = (String) onePhone.get("number");
    if(t.equals("work")) {
      contactInsertStmt.setString(5, n);
    } else if(t.equals("home")) {
      contactInsertStmt.setString(6, n);
    }
  }
  contactInsertStmt.execute();
}

Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  if(rs.next()) {
    m = new HashMap();
    m.put("id", rs.getString(1));
    m.put("name", rs.getString(2));
    m.put("title", rs.getString(3));
    m.put("hireDate", rs.getDate(4));
    List list = new ArrayList();
    Map onePhone;
    onePhone = new HashMap();
    onePhone.put("type", "work");
    onePhone.put("number", rs.getString(5));
    list.add(onePhone);
    onePhone = new HashMap();
    onePhone.put("type", "home");
    onePhone.put("number", rs.getString(6));
    list.add(onePhone);
    m.put("phones", list);
  }
  return m;
}

This is just plain bad…
SQL Day 3 changes, Option 2: Proper approach with multiple phone numbers

DDL: create table phones ( … )

init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
  c2stmt = connection.prepareStatement
    ("insert into phones (id, type, number) values (?, ?, ?)");
  fetchStmt = connection.prepareStatement
    ("select id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?");
}

save(Map m)
{
  startTrans();
  contactInsertStmt.setString(1, (String) m.get("id"));
  contactInsertStmt.setString(2, (String) m.get("name"));
  contactInsertStmt.setString(3, (String) m.get("title"));
  contactInsertStmt.setDate(4, (Date) m.get("hireDate"));
  for(Map onePhone : (List<Map>) m.get("phones")) {
    c2stmt.setString(1, (String) m.get("id"));
    c2stmt.setString(2, (String) onePhone.get("type"));
    c2stmt.setString(3, (String) onePhone.get("number"));
    c2stmt.execute();
  }
  contactInsertStmt.execute();
  endTrans();
}

Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  int i = 0;
  List list = new ArrayList();
  while (rs.next()) {
    if(i == 0) {
      m = new HashMap();
      m.put("id", rs.getString(1));
      m.put("name", rs.getString(2));
      m.put("title", rs.getString(3));
      m.put("hireDate", rs.getDate(4));
      m.put("phones", list);
    }
    Map onePhone = new HashMap();
    onePhone.put("type", rs.getString(5));
    onePhone.put("number", rs.getString(6));
    list.add(onePhone);
    i++;
  }
  return m;
}

This took time and money
SQL Day 5: Zombies! (zero or more between entities)

init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
  c2stmt = connection.prepareStatement
    ("insert into phones (id, type, number) values (?, ?, ?)");
  fetchStmt = connection.prepareStatement
    ("select A.id, A.name, A.title, A.hiredate, B.type, B.number from contact A left outer join phones B on (A.id = B.id) where A.id = ?");
}

Whoops! And it's also wrong!
We did not design the query accounting for contacts that have no phone number. Thus, we have to change the join to an outer join.

But this ALSO means we have to change the unwind logic:

while (rs.next()) {
  if(i == 0) {
    // …
  }
  String s = rs.getString(5);
  if(s != null) {
    Map onePhone = new HashMap();
    onePhone.put("type", s);
    onePhone.put("number", rs.getString(6));
    list.add(onePhone);
  }
}

This took more time and money!

…but at least we have a DAL… right?
MongoDB Day 3

save(Map m)
{
  collection.insert(new BasicDBObject(m));
}

Map fetch(String id)
{
  Map m = null;
  DBObject dbo = new BasicDBObject();
  dbo.put("id", id);
  c = collection.find(dbo);
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}

✔ NO CHANGE

Advantages:
1. Zero time and money spent on overhead code
2. No need to fear fields that are "naturally occurring" lists containing data specific to the parent structure and thus do not benefit from normalization and referential integrity
3. Safe from zombies and other undead distractions from productivity
By Day 14, our structure looks like this:

n4.put("geo", "US-EAST");
n4.put("startupApps", new String[] { "app1", "app2", "app3" } );
list2.add(n4);

n4 = new HashMap();
n4.put("geo", "EMEA");
n4.put("startupApps", new String[] { "app6" } );
n4.put("useLocalNumberFormats", false);
list2.add(n4);
m.put("preferences", list2);

n6.put("optOut", true);
n6.put("assertDate", someDate);
seclist.add(n6);
m.put("attestations", seclist);

m.put("security", mapOfDataCreatedByExternalSource);

• It was still pretty easy to add this data to the structure
• Want to guess what the SQL persistence code looks like?
• How about the MongoDB persistence code?
SQL Day 14

Error: Could not fit all the code into this space.
…actually, I didn't want to spend 2 hours putting the code together…

But very likely, among other things:
• n4.put("startupApps", new String[]{"app1", "app2", "app3"});
  was implemented as a single semicolon-delimited string
• m.put("security", anotherMapOfData);
  was implemented by flattening it out and storing a subset of fields
MongoDB Day 14 – and every other day

save(Map m)
{
  collection.insert(new BasicDBObject(m));
}

Map fetch(String id)
{
  Map m = null;
  DBObject dbo = new BasicDBObject();
  dbo.put("id", id);
  c = collection.find(dbo);
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}

✔ NO CHANGE

Advantages:
1. Zero time and money spent on overhead code
2. Persistence is so easy, flexible, and backward compatible that the persistor does not upward-influence the shapes we want to persist, i.e. the tail does not wag the dog
But what if we must do a join?
Both RDBMS and MongoDB will have a PhoneTransactions table/collection.

Contact:
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    { type : "work",
      number : "1-800-555-1212"
    },
    { type : "home",
      number : "1-800-555-1313",
      DNC : true
    },
    { type : "home",
      number : "1-800-555-1414",
      DNC : true
    }
  ]
}

PhoneTransactions:
{ number : "1-800-555-1212",
  target : "1-999-238-3423",
  duration : 20
}
{ number : "1-800-555-1212",
  target : "1-444-785-6611",
  duration : 243
}
{ number : "1-800-555-1414",
  target : "1-645-331-4345",
  duration : 132
}
{ number : "1-800-555-1414",
  target : "1-990-875-2134",
  duration : 71
}
SQL Join Attempt #1
select A.id, A.lname, B.type, B.number, C.target, C.duration
from contact A, phones B, phonestx C
Where A.id = B.id and B.number = C.number
id | lname | type | number | target | duration
-----+--------------+------+----------------+----------------+----------
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7071 | 7
g9 | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7070 | 23
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7072 | 9
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
How to turn this into a list of names –
each with a list of numbers, each of those with a list of target
numbers?
SQL Unwind Attempt #1

Map idmap = new HashMap();
ResultSet rs = fetchStmt.executeQuery();
while (rs.next()) {
  String id = rs.getString("id");
  String nmbr = rs.getString("number");
  List tnum;
  Map snum;
  if((snum = (Map) idmap.get(id)) == null) {
    snum = new HashMap();
    idmap.put(id, snum);
  }
  if((tnum = (List) snum.get(nmbr)) == null) {
    tnum = new ArrayList();
    snum.put(nmbr, tnum);
  }
  Map info = new HashMap();
  info.put("target", rs.getString("target"));
  info.put("duration", rs.getInt("duration"));
  tnum.add(info);
}
// idmap["g9"]["1-900-555-1212"] = [{target:1-222-707-7070,duration:23…},…]
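The same two-level unwind can be sketched in a few lines of plain Python over hypothetical joined rows (id, lname, type, number, target, duration), with no database involved; dict.setdefault does the "create the bucket on first sight" work:

```python
# Hypothetical rows as they would come back from the three-way join.
rows = [
    ("g9",  "Moschetti", "home", "1-900-555-1212", "1-222-707-7070", 23),
    ("g10", "Kalan",     "work", "1-999-444-9999", "1-222-907-7071", 7),
    ("g9",  "Moschetti", "home", "1-900-555-1212", "1-222-707-7071", 7),
]

idmap = {}
for (id, lname, ptype, number, target, duration) in rows:
    snum = idmap.setdefault(id, {})     # per-contact map of numbers
    tnum = snum.setdefault(number, [])  # per-number list of transactions
    tnum.append({"target": target, "duration": duration})

# idmap["g9"]["1-900-555-1212"] -> list of target/duration maps
print(len(idmap["g9"]["1-900-555-1212"]))  # 2
```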
SQL Join Attempt #2

select A.id, A.lname, B.type, B.number, C.target, C.duration
from contact A, phones B, phonestx C
where A.id = B.id and B.number = C.number order by A.id, B.number
id | lname | type | number | target | duration
-----+--------------+------+----------------+----------------+----------
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7072 | 9
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7070 | 23
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7071 | 7
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
g9 | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
“Early bail out” from cursor is now possible –
but logic to construct list of source and target numbers is similar
SQL is about Disassembly

Design a Big Query, including business logic, to grab all the data up front:

String s = "select A, B, C, D, E, F from T1,T2,T3 where T1.col = T2.col and T2.col2 = T3.col2 and X = Y and X2 != Y2 and G > 10 and G < 100 and TO_DATE(' …";

Throw it at the engine:

ResultSet rs = execute(s);

Disassemble the Big Rectangle into usable objects, with logic implicit in changes in column values:

while(rs.next()) {
  if(new column1 value from T1) {
    set up new Object;
  }
  if(new column2 value from T2) {
    set up new Object2
  }
  if(new column3 value from T3) {
    set up new Object3
  }
  populate maps, lists and scalars
}
MongoDB is about Assembly

Assemble usable objects incrementally with explicit logic:

Cursor c = coll1.find({"X": "Y"});
while(c.hasNext()) {
  populate maps, lists and scalars;
  Cursor c2 = coll2.find(logic+key from c);
  while(c2.hasNext()) {
    populate maps, lists and scalars;
    Cursor c3 = coll3.find(logic+key from c2);
    while(c3.hasNext()) {
      populate maps, lists and scalars;
    }
  }
}
MongoDB "Join"

Map idmap = new HashMap();
DBCursor c = contacts.find();
while(c.hasNext()) {
  DBObject item = c.next();
  String id = (String) item.get("id");
  Map nummap = new HashMap();
  for(Map phone : (List<Map>) item.get("phones")) {
    String pnum = (String) phone.get("number");
    DBObject q = new BasicDBObject("number", pnum);
    DBCursor c2 = phonestx.find(q);
    List txs = new ArrayList();
    while(c2.hasNext()) {
      txs.add((Map) c2.next());
    }
    nummap.put(pnum, txs);
  }
  idmap.put(id, nummap);
}
// idmap["g9"]["1-900-555-1212"] = [{target:1-222-707-7070,duration:23…},…]
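The assembly pattern can be simulated in pure Python: two in-memory "collections" (lists of dicts) and a tiny find() helper stand in for MongoDB collections and cursors. All names and data here are hypothetical illustrations, not driver calls:

```python
contacts = [
    {"id": "g9", "phones": [{"type": "home", "number": "1-900-555-1212"}]},
]
phonestx = [
    {"number": "1-900-555-1212", "target": "1-222-707-7070", "duration": 23},
    {"number": "1-900-555-1212", "target": "1-222-707-7071", "duration": 7},
]

def find(coll, query):
    """Return documents whose fields match every key/value in query."""
    return [d for d in coll if all(d.get(k) == v for k, v in query.items())]

idmap = {}
for item in contacts:                    # outer "cursor" over contacts
    nummap = {}
    for phone in item["phones"]:
        pnum = phone["number"]
        # inner "cursor": look up transactions for this number explicitly
        nummap[pnum] = find(phonestx, {"number": pnum})
    idmap[item["id"]] = nummap

print(len(idmap["g9"]["1-900-555-1212"]))  # 2 matching transactions
```

Note the logic is explicit at each level; nothing has to be inferred from changes in column values.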
But what about "real" queries?
• The MongoDB query language is a physical map-of-maps-based structure, not a String
  – Operators (e.g. AND, OR, GT, EQ, etc.) and arguments are keys and values in a cascade of Maps
  – No grammar to parse, no templates to fill in, no whitespace, no escaping quotes, no parentheses, no punctuation
• The same paradigm used to manipulate data is used to manipulate query expressions
• …which is also, by the way, the same paradigm for working with MongoDB metadata and explain()
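Because a query expression is just a map of maps, it can be built and modified with ordinary data manipulation. A small sketch in plain Python (the field names come from the running contact example; the date is shown as a string only for brevity):

```python
# Each condition is an ordinary dict.
work_phone = {"phones.type": "work"}
hired_recently = {"hiredate": {"$gt": "2014-02-02"}}

# Combine conditions programmatically -- no string concatenation,
# no escaping, no SQL grammar to get right.
expr = {"$or": [work_phone, hired_recently]}

# Extend the expression later by appending another clause to the list:
expr["$or"].append({"title": "Mr."})

print(len(expr["$or"]))  # 3
```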
MongoDB Query Examples
Find all contacts with at least one work phone

SQL CLI:
select * from contact A, phones B where A.did = B.did and B.type = 'work';

MongoDB CLI:
db.contact.find({"phones.type": "work"});

SQL in Java:
String s = "select * from contact A, phones B where A.did = B.did and B.type = 'work'";
ResultSet rs = execute(s);

MongoDB via Java driver:
DBObject expr = new BasicDBObject();
expr.put("phones.type", "work");
Cursor c = contact.find(expr);
MongoDB Query Examples
Find all contacts with at least one work phone or hired after 2014-02-02

SQL:
select A.did, A.lname, A.hiredate, B.type, B.number
from contact A left outer join phones B on (B.did = A.did)
where B.type = 'work' or A.hiredate > '2014-02-02'::date

MongoDB CLI:
db.contacts.find({"$or": [
  {"phones.type": "work"},
  {"hiredate": {"$gt": new ISODate("2014-02-02")}}
]});
MongoDB Query Examples
Find all contacts with at least one work phone or hired after 2014-02-02

MongoDB via Java driver:
List arr = new ArrayList();
Map phones = new HashMap();
phones.put("phones.type", "work");
arr.add(phones);
Map hdate = new HashMap();
java.util.Date d = dateFromStr("2014-02-02");
hdate.put("hiredate", new BasicDBObject("$gt", d));
arr.add(hdate);
Map m1 = new HashMap();
m1.put("$or", arr);
contact.find(new BasicDBObject(m1));
…and before you ask…
Yes, MongoDB query expressions support:
1. Sorting
2. Cursor size limit
3. Projection (asking for only parts of the rich shape to be returned)
4. Aggregation ("GROUP BY") functions
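These, too, are expressed as plain map/list structures rather than strings. A sketch in pure Python (dicts only, no live database; the collection and field names come from the running contact example):

```python
# Sort specification: newest hires first.
sort_spec = {"hiredate": -1}

# Projection: return only parts of the rich shape.
projection = {"name": 1, "phones": 1, "_id": 0}

# A "GROUP BY"-style aggregation pipeline: count phones by type.
pipeline = [
    {"$unwind": "$phones"},
    {"$group": {"_id": "$phones.type", "count": {"$sum": 1}}},
]

# Being data, these specs can be inspected and manipulated like any value:
assert pipeline[1]["$group"]["_id"] == "$phones.type"
```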
Day 30: RAD on MongoDB with Python

import pymongo

def save(data):
    coll.insert(data)

def fetch(id):
    return coll.find_one({"id": id})

myData = {
    "name": "jane",
    "id": "K2",
    # no title? No problem
    "hireDate": datetime.date(2011, 11, 1),
    "phones": [
        { "type": "work",
          "number": "1-800-555-1212"
        },
        { "type": "home",
          "number": "1-866-444-3131"
        }
    ]
}
save(myData)
print fetch("K2")

expr = {"$or": [ {"phones.type": "work"}, {"hiredate": {"$gt": datetime.date(2014,2,2)}} ]}
for c in coll.find(expr):
    print [ k.upper() for k in sorted(c.keys()) ]

Advantages:
1. Far easier and faster to create scripts due to "fidelity-parity" of MongoDB map data and Python (and Perl, Ruby, and JavaScript) structures
2. Data types and structure in scripts are exactly the same as those read and written in Java and C++
Day 30: Polymorphic RAD on MongoDB with Python

import pymongo

item = fetch("K8")
# item is:
{
    "name": "bob",
    "id": "K8",
    "personalData": {
        "preferedAirports": [ "LGA", "JFK" ],
        "travelTimeThreshold": { "value": 3, "units": "HRS" }
    }
}

item = fetch("K9")
# item is:
{
    "name": "steve",
    "id": "K9",
    "personalData": {
        "lastAccountVisited": {
            "name": "mongoDB",
            "when": datetime.date(2013,11,4)
        },
        "favoriteNumber": 3.14159
    }
}

Advantages:
1. Scripting languages easily digest shapes with common fields and dissimilar fields
2. Easy to create an information architecture where placeholder fields like personalData are "known" in the software logic to be dynamic
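A minimal sketch of that polymorphic handling, using plain Python dicts standing in for fetched documents (the documents and summary logic are hypothetical): generic code walks whatever keys happen to be present under personalData, with no schema change required.

```python
docs = [
    {"name": "bob",   "id": "K8",
     "personalData": {"preferedAirports": ["LGA", "JFK"]}},
    {"name": "steve", "id": "K9",
     "personalData": {"favoriteNumber": 3.14159}},
]

# The software logic "knows" personalData is dynamic; it can enumerate
# whichever fields each document carries.
summary = {d["id"]: sorted(d["personalData"].keys()) for d in docs}

print(summary["K8"])  # ['preferedAirports']
print(summary["K9"])  # ['favoriteNumber']
```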
Day 30: (Not) RAD on top of SQL with Python

init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
  c2stmt = connection.prepareStatement
    ("insert into phones (id, type, number) values (?, ?, ?)");
  fetchStmt = connection.prepareStatement
    ("select id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?");
}

save(Map m)
{
  startTrans();
  contactInsertStmt.setString(1, (String) m.get("id"));
  contactInsertStmt.setString(2, (String) m.get("name"));
  contactInsertStmt.setString(3, (String) m.get("title"));
  contactInsertStmt.setDate(4, (Date) m.get("hireDate"));
  for(Map onePhone : (List<Map>) m.get("phones")) {
    c2stmt.setString(1, (String) m.get("id"));
    c2stmt.setString(2, (String) onePhone.get("type"));
    c2stmt.setString(3, (String) onePhone.get("number"));
    c2stmt.execute();
  }
  contactInsertStmt.execute();
  endTrans();
}

Consequences:
1. All logic coded in the Java interface layer (unwinding contact, phones, preferences, etc.) needs to be rewritten in Python (unless Jython is used) … and/or Perl, C++, Scala, etc.
2. No robust way to handle polymorphic data other than BLOBing it
3. …and that will take real time and money!
The Fundamental Change with MongoDB

RDBMSs were designed in an era when:
• CPU and disk were slow & expensive
• Memory was VERY expensive
• Network? What network?
• Languages had limited means to dynamically reflect on their types
• Languages had poor support for richly structured types

Thus, the database had to:
• Act as combiner-coordinator of simpler types
• Define a rigid schema
• (Together with the code) optimize at compile-time, not run-time

In MongoDB, the data is the schema!
Lastly: A CLI with teeth

Try a query and show the diagnostics:

> db.contact.find({"SeqNum": {"$gt": 10000}}).explain();
{
  "cursor" : "BasicCursor",
  "n" : 200000,
  // ...
  "millis" : 223
}

Run it 3 times with smaller and smaller chunks and create a vector of timing result pairs (size, time):

> for(v=[], i=0; i<3; i++) {
…   n = i*50000;
…   expr = {"SeqNum": {"$gt": n}};
…   v.push([n, db.contact.find(expr).explain().millis]); }

Let's see that vector:

> v
[ [ 0, 225 ], [ 50000, 222 ], [ 100000, 220 ] ]

Use any other JavaScript you want inside the shell:

> load("jStat.js")
> jStat.stdev(v.map(function(p){ return p[1]; }))
2.0548046676563256

Party trick: save the explain() output back into a collection!

> for(i=0; i<3; i++) {
…   expr = {"SeqNum": {"$gt": i*1000}};
…   db.foo.insert(db.contact.find(expr).explain()); }
What Does All This Add Up To?
• MongoDB is easier than RDBMS/SQL for real problems
• Quicker to change
• Much better harmonized with modern languages
• Comprehensive indexing (arbitrary non-/unique secondaries, compound keys, geospatial, text search, TTL, etc.)
• Horizontally scalable to petabytes
• Isomorphic HA and DR

A Modern Database for Modern Solutions
What's New in MongoDB 3.0

MongoDB 3.0 Headlines
• WiredTiger Storage Engine and Flexible Storage Architecture
• Ops Manager
• Enhanced Query Language and Tools
• Advanced Security and Auditing
• Low-Latency Experience Across the Globe
Flexible Storage Architecture
Pluggable Storage API; New Storage Engine: WiredTiger

● Vision: Many storage engines optimized for many different use cases
● One data model, one API, one set of operational concerns – but under the hood, many options for every use case under the sun

(Diagram, example future state: use cases such as a content repo, IoT sensor backend, ad service, customer analytics, and archive all sit on the MongoDB Query Language (MQL) + native drivers and the MongoDB document data model, with management and security spanning the stack. Underneath: MMAPv1 and WiredTiger, supported in MongoDB 3.0; In-Memory, experimental in MongoDB 3.0; and possible future storage engines such as HDFS and proprietary storage.)
WiredTiger Storage Engine
Same great database…
• Same data model, same query language, same ops
• Write performance gains driven by document-level concurrency control
• Storage savings driven by native compression
• 100% backwards compatible
• Non-disruptive upgrade
(Chart: performance, MongoDB 2.6 vs. MongoDB 3.0)
                               | MongoDB WiredTiger          | MongoDB MMAPv1
Write Performance              | Excellent (document-level   | Good (collection-level
                               | concurrency control)        | concurrency control)
Read Performance               | Excellent                   | Excellent
Compression Support            | Yes                         | No
MongoDB Query Language Support | Yes                         | Yes
Secondary Index Support        | Yes                         | Yes
Replication Support            | Yes                         | Yes
Sharding Support               | Yes                         | Yes
Ops Manager & MMS Support      | Yes                         | Yes
Security Controls              | Yes                         | Yes
Platform Availability          | Linux, Windows, Mac OS X    | Linux, Windows, Mac OS X,
                               |                             | Solaris (x86)
*GridFS supports larger file sizes
7x-10x Higher Performance
• Document-level concurrency control
• Improved vertical scalability and performance predictability
• Especially good for write-intensive apps, e.g., Internet of Things (IoT), messaging apps, log data, tick data
50%-80% Less Storage via Compression
• Better storage utilization
• Higher I/O scalability
• Multiple compression options
– Snappy
– zlib
– None
• Data and journal compressed on disk
• Indexes compressed on disk and in memory
MongoDB Ops Manager
The Best Way to Manage MongoDB in Your Data Center
• Single-click provisioning, scaling & upgrades, admin tasks
• Monitoring, with charts, dashboards and alerts on 100+ metrics
• Backup and restore, with point-in-time recovery, support for sharded clusters
• Up to 95% reduction in operational overhead
• Integrates with existing infrastructure

How Ops Manager Helps You
• Scale easily
• Meet SLAs
• Best practices, automated
• Cut management overhead
Security and Tools Enhancements

Enhanced Query Language and Tools
• Faster Loading and Export
• Easier Query Optimization
• Faster Debugging
• Richer Geospatial Apps
• Better Time-Series Analytics

Enterprise-Grade Security
• Authentication: LDAP, Kerberos, x.509, SCRAM
• Authorization: fine-grained role-based access control; field-level redaction
• Encryption: in motion via SSL, at rest via partner solution (e.g., Vormetric)
Native Auditing for Any Operation
• Essential for many compliance standards (e.g., PCI DSS, HIPAA, NIST 800-53, European Union Data Protection Directive)
• MongoDB Native Auditing
  – Construct and filter audit trails for any operation against the database, whether DML, DCL or DDL
  – Can filter by user or action
  – Audit log can be written to multiple destinations
Low-Latency Experience Across the Globe
• Amazon – every 1/10-second delay resulted in a 1% loss of sales
• Google – a half-second delay caused a 20% drop in traffic
• Aberdeen Group – a 1-second delay in page-load time means:
  – 11% fewer page views
  – 16% decrease in customer satisfaction
  – 7% loss in conversions

Network Latencies Between Cities (ms)
        | NYC | SF  | London | Sydney
NYC     | --  | 84  | 69     | 231
SF      | 84  | --  | 168    | 158
London  | 69  | 168 | --     | 315
Sydney  | 231 | 158 | 315    | --

Low latency via large replica sets
MongoDB 3.0 Supports Core Proposition

Reduce Risk for Mission-Critical Deployments
• Ops Manager: automated best practices, zero-downtime ops
• Auditing: in compliance
• Flexible Storage Architecture: future-proof
• 7x-10x performance: meet SLAs

Lower TCO
• Vertical scalability: server utilization
• Compression: 80% storage utilization
• Ops Manager: lower cost to manage

Accelerate Time-to-Value
• Enhanced Query Language and Tools: less coding required
• Ops Manager: up and running quickly, decrease ops effort by 95%
• 7x-10x performance: easier to scale

Leverage Data + Tech for Competitive Advantage
• 7x-10x performance + Ops Manager + Flexible Storage Architecture:
  MongoDB suitable for more use cases
We Are Here To Help
• MongoDB Enterprise Advanced – the best way to run MongoDB in your data center
• MongoDB Management Service (MMS) – the easiest way to run MongoDB in the cloud
• Production Support – in production and under control
• Development Support – let's get you running
• Consulting – we solve problems
• Training – get your teams up to speed
MongoDB Community Update
Q&A