tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why and When


Upload: david-peyruc

Post on 19-Jan-2015


DESCRIPTION

tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When Massimo Brignoli, MongoDB Inc The presentation will illustrate what MongoDB is, the advantages of the document based approach and some of the use cases where MongoDB is a perfect fit.

TRANSCRIPT

Page 1: tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When

What, When and Why of MongoDB

Solution Architect, MongoDB Inc.

Massimo Brignoli

@mongodb

Page 2

Agenda

About MongoDB Inc.

Data and Query Model

Scalability

Availability

Deployment Architectures

Schema Design Challenges

Use Cases

Page 3

About MongoDB

Page 4

MongoDB Inc. Overview

• 300+ employees
• 600+ customers
• Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney
• Over $231 million in funding

Page 5

Global Community

• 6,000,000+ MongoDB Downloads
• 100,000+ Online Education Registrants
• 20,000+ MongoDB User Group Members
• 20,000+ MongoDB Days Attendees
• 15,000+ MongoDB Management Service (MMS) Users

Page 6

MongoDB Inc. Products and Services

• Training: Online and In-Person for Developers and Administrators
• MongoDB Monitoring Service: Cloud-Based Service for Monitoring, Alerts, Backup and Restore
• Subscriptions: MongoDB Enterprise, On-Prem Monitoring, Professional Support and Commercial License
• Consulting: Expert Resources for All Phases of MongoDB Implementations

Page 7

Data & Query Model

Page 8

Operational Database Landscape

Page 9

Document Data Model

Relational MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Page 10

Document Model Benefits

• Agility and flexibility
  – Data models can evolve easily
  – Companies can adapt to changes quickly

• Intuitive, natural data representation
  – Developers are more productive
  – Many types of applications are a good fit

• Reduces the need for joins, disk seeks
  – Programming is simpler
  – Performance can be delivered at scale

Page 11

Developers are more productive

Page 12

Developers are more productive

Page 13

Developers are more productive

Page 14

MongoDB is full featured

Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul’s car collection

Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)

MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
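
The rich queries and the aggregation listed on this slide can be sketched in plain Python over an in-memory list that stands in for a collection. This is only an illustration of the document-shaped logic, not MongoDB's query engine; the second owner and all function names are hypothetical.

```python
# In-memory stand-in for a collection of owner documents.
# The second owner is a hypothetical record added for contrast.
owners = [
    {"first_name": "Paul", "surname": "Miller", "city": "London",
     "cars": [{"model": "Bentley", "year": 1973, "value": 100000},
              {"model": "Rolls Royce", "year": 1965, "value": 330000}]},
    {"first_name": "Anna", "surname": "Smith", "city": "Paris",
     "cars": [{"model": "Mini", "year": 1975, "value": 9000}]},
]

def cars_of(first_name):
    """Find an owner's cars: collect the embedded cars of each matching owner."""
    return [car for o in owners if o["first_name"] == first_name
            for car in o["cars"]]

def owners_in_city_with_car_between(city, first_year, last_year):
    """Find everybody in a city with at least one car built in the given years."""
    return [o for o in owners
            if o["city"] == city
            and any(first_year <= c["year"] <= last_year for c in o["cars"])]

def average_car_value(first_name):
    """Aggregation: average value of one owner's car collection."""
    values = [c["value"] for c in cars_of(first_name)]
    return sum(values) / len(values)

print([c["model"] for c in cars_of("Paul")])
print([o["surname"] for o in owners_in_city_with_car_between("London", 1970, 1980)])
print(average_car_value("Paul"))
```

Because the cars are embedded in the owner document, none of these lookups needs a join. In MongoDB itself, the second query would typically combine a filter on `city` with the `$elemMatch` operator on the `cars` array.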

Page 15

Shell and Drivers

Shell
Command-line shell for interacting directly with database

> db.collection.insert({company: "10gen", product: "MongoDB"})
>
> db.collection.findOne()
{
  "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
  "company" : "10gen",
  "product" : "MongoDB"
}

Drivers
Drivers for most popular programming languages and frameworks
• Java
• Python
• Perl
• Ruby
• Haskell
• JavaScript

Page 16

Scalability

Page 17

Automatic Sharding

• Three types of sharding: hash-based, range-based, tag-aware

• Increase or decrease capacity as you go

• Automatic balancing
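
A minimal sketch of how hash-based and range-based placement differ, in plain Python. The shard names and split points are assumptions made for illustration; real MongoDB splits data into chunks and balances them automatically, which this sketch does not model, and tag-aware sharding is left out.

```python
import bisect
import hashlib

SHARDS = ["shard-A", "shard-B", "shard-C"]  # hypothetical shard names

def hash_based_shard(key):
    """Hash-based: hash the shard key so that keys spread evenly over shards."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Range-based: each shard owns a contiguous key range.
# Assumed split points: key < "h" -> shard-A, "h" <= key < "q" -> shard-B, rest -> shard-C.
RANGE_SPLIT_POINTS = ["h", "q"]

def range_based_shard(key):
    """Range-based: pick the shard whose range contains the key."""
    return SHARDS[bisect.bisect_right(RANGE_SPLIT_POINTS, key)]

for city in ("cupertino", "london", "sydney"):
    print(city, hash_based_shard(city), range_based_shard(city))
```

Range-based placement keeps neighbouring keys on the same shard, which suits range scans; hash-based placement trades that locality for an even spread of writes.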

Page 18

Query Routing

• Multiple query optimization models

• Each sharding option appropriate for different apps

Page 19

Availability

Page 20

Availability Considerations

• High Availability – Ensure application availability during many types of failures
• Disaster Recovery – Address the RTO and RPO goals for business continuity
• Maintenance – Perform upgrades and other maintenance operations with no application downtime

Page 21

Replica Sets

• Replica Set – two or more copies

• “Self-healing” shard

• Addresses many concerns:

- High Availability

- Disaster Recovery

- Maintenance

Page 22

Replica Set Benefits

Business Needs | Replica Set Benefits
High Availability | Automated failover
Disaster Recovery | Hot backups offsite
Maintenance | Rolling upgrades
Low Latency | Locate data near users
Workload Isolation | Read from non-primary replicas
Data Privacy | Restrict data to physical location
Data Consistency | Tunable consistency

Page 23

Deployment Architectures

Page 24

Single Data Center

• Automated failover
• Tolerates server failures
• Tolerates rack failures
• Number of replicas defines failure tolerance

[Diagram: Primary – A, Primary – B, Primary – C, each with two secondaries (Secondary – A ×2, Secondary – B ×2, Secondary – C ×2)]
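
The point that the number of replicas defines failure tolerance can be worked through with a little arithmetic. The sketch below assumes every member of a replica set votes and that a primary can only be elected while a strict majority of members is reachable; the function names are hypothetical.

```python
def majority(members):
    """Smallest number of voting members that forms a strict majority."""
    return members // 2 + 1

def fault_tolerance(members):
    """How many members can fail while a majority can still elect a primary."""
    return members - majority(members)

for n in (2, 3, 4, 5, 7):
    print(f"{n} members: majority {majority(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Under these assumptions a three-member set, as in the diagram on this slide, survives the loss of one member, and going from three to four members adds no extra tolerance.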

Page 25

Active/Standby Data Center

• Tolerates server and rack failure
• Standby data center

[Diagram]
Data Center - West: Primary – A, Primary – B, Primary – C, Secondary – A, Secondary – B, Secondary – C
Data Center - East: Secondary – A, Secondary – B, Secondary – C

Page 26

Active/Active Data Center

• Tolerates server, rack, data center failures, network partitions

[Diagram]
Data Center - West: Primary – A, Primary – B, Primary – C, Secondary – A, Secondary – B, Secondary – C
Data Center - East: Secondary – A, Secondary – B, Secondary – C, Secondary – B, Secondary – C, Secondary – A
Data Center - Central: Arbiter – A, Arbiter – B, Arbiter – C

Page 27

Global Data Distribution

[Diagram: one Primary and seven Secondaries, with each link labeled "Real-time"]

Page 28

Read Global/Write Local

[Diagram: Primary:NYC, Primary:LON and Primary:SYD, with Secondary:NYC ×2, Secondary:LON ×2 and Secondary:SYD ×2]

Page 29

Schema Design Challenges

Page 30

First a story:

Once upon a time there was a medical records company…

Page 36

Schema Design Challenge

• Flexibility
  – Easily adapt to new requirements

• Agility
  – Rapid application development

• Scalability
  – Support large data and query volumes

Page 37

Schema Design:

MongoDB vs. Relational

Page 38

MongoDB versus Relational

MongoDB | Relational
Collections | Tables
Documents | Rows
Data Use | Data Storage
What questions do I have? | What answers do I have?

Page 39

Attribute | MongoDB | Relational
Storage | N-dimensional | Two-dimensional
Field Values | 0, 1, many, or embed | Single value
Query | Any field or level | Any field
Schema | Flexible | Very structured
Updates | In line | In place

Page 40

With relational, this is hard

Long development times

Inflexible

Doesn’t scale

Page 41

Document model is much easier

Shorter development times

Flexible

Scalable

{
  "patient_id": "1177099",
  "first_name": "John",
  "last_name": "Doe",
  "middle_initial": "A",
  "dob": "2000-01-25",
  "gender": "Male",
  "blood_type": "B+",
  "address": "123 Elm St., Chicago, IL 59923",
  "height": "66",
  "weight": "110",
  "allergies": ["Nuts", "Penicillin", "Pet Dander"],
  "current_medications": [
    {"name": "Zoloft", "dosage": "2mg", "frequency": "daily", "route": "orally"}
  ],
  "complaint": [
    {"entered": "2000-11-03", "onset": "2000-11-03", "prob_desc": "", "icd": 250.00, "status": "Active"},
    {"entered": "2000-02-04", "onset": "2000-02-04", "prob_desc": "in spite of regular exercise, ...", "icd": 401.9, "status": "Active"}
  ],
  "diagnosis": [
    {"visit": "2005-07-22", "narrative": "Fractured femur", "icd": "9999", "priority": "Primary"},
    {"visit": "2005-07-22", "narrative": "Type II Diabetes", "icd": "250.00", "priority": "Secondary"}
  ]
}
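
As a rough illustration of why a single patient document is convenient to work with, the sketch below reads and updates a trimmed copy of the record in plain Python. The helper names and the added allergy are hypothetical, and a real application would go through a MongoDB driver instead of a local dict.

```python
# Trimmed copy of the patient document from the slide.
patient = {
    "patient_id": "1177099",
    "allergies": ["Nuts", "Penicillin", "Pet Dander"],
    "diagnosis": [
        {"visit": "2005-07-22", "narrative": "Fractured femur",
         "icd": "9999", "priority": "Primary"},
        {"visit": "2005-07-22", "narrative": "Type II Diabetes",
         "icd": "250.00", "priority": "Secondary"},
    ],
}

def diagnoses_with_priority(doc, priority):
    """Read the embedded diagnosis array directly; no join is involved."""
    return [d["narrative"] for d in doc["diagnosis"] if d["priority"] == priority]

def add_allergy(doc, allergy):
    """Append to the embedded allergies array unless the value is already there."""
    if allergy not in doc["allergies"]:
        doc["allergies"].append(allergy)

print(diagnoses_with_priority(patient, "Primary"))
add_allergy(patient, "Latex")  # hypothetical new allergy
print(patient["allergies"])
```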

Page 42

Let’s model something together

How about a business card?

Page 43

Business Card

Page 44

Address Book Entity-Relationship

[Entity-relationship diagram; relationships between entities are marked 1 and N]
• Contacts: name, company, title
• Addresses: type, street, city, state, zip_code
• Phones: type, number
• Emails: type, address
• Thumbnails: mime_type, data
• Portraits: mime_type, data
• Groups: name
• Twitters: name, location, web, bio

Page 45

Referencing

• Contact: name, company, title, phone
• Address: street, city, state, zip_code

Use two collections with a reference

Similar to relational

Page 46

Embedding

Document Schema

• Contact: name, company, address (street, city, state, zip), title, phone
• address: street, city, state, zip_code

Page 47

Referencing

Contacts

{
  "_id": 2,
  "name": "Steven Jobs",
  "title": "VP, New Product Development",
  "company": "Apple Computer",
  "phone": "408-996-1010",
  "address_id": 1
}

Addresses

{
  "_id": 1,
  "street": "10260 Bandley Dr",
  "city": "Cupertino",
  "state": "CA",
  "zip_code": "95014",
  "country": "USA"
}

Page 48

Embedding

Contacts

{
  "_id": 2,
  "name": "Steven Jobs",
  "title": "VP, New Product Development",
  "company": "Apple Computer",
  "address": {
    "street": "10260 Bandley Dr",
    "city": "Cupertino",
    "state": "CA",
    "zip_code": "95014",
    "country": "USA"
  },
  "phone": "408-996-1010"
}

Page 49

How are they different? Why?

• Contact: name, company, title, phone
• Address: street, city, state, zip_code

• Contact: name, company, address (street, city, state, zip), title, phone
• address: street, city, state, zip_code
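
One way to see the difference is to count lookups. The sketch below models the two designs with plain Python dicts keyed by `_id`, using the Steven Jobs contact from the earlier slides; the dict-based "collections" and function names are assumptions made for illustration, not a driver API.

```python
ADDRESS = {"street": "10260 Bandley Dr", "city": "Cupertino",
           "state": "CA", "zip_code": "95014", "country": "USA"}

# Referencing: two collections, the contact stores only the address _id.
addresses = {1: {"_id": 1, **ADDRESS}}
contacts_referenced = {2: {"_id": 2, "name": "Steven Jobs", "address_id": 1}}

# Embedding: one collection, the address lives inside the contact document.
contacts_embedded = {2: {"_id": 2, "name": "Steven Jobs", "address": dict(ADDRESS)}}

def city_by_reference(contact_id):
    """Two lookups: fetch the contact, then follow address_id to the address."""
    contact = contacts_referenced[contact_id]
    address = addresses[contact["address_id"]]
    return address["city"]

def city_by_embedding(contact_id):
    """One lookup: the address arrives together with the contact."""
    return contacts_embedded[contact_id]["address"]["city"]

print(city_by_reference(2), city_by_embedding(2))
```

Embedding returns everything in a single read, while referencing lets several contacts share one address document at the cost of a second read.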

Page 50

Schema Flexibility

{
  "name": "Steven Jobs",
  "title": "VP, New Product Development",
  "company": "Apple Computer",
  "address": {
    "street": "10260 Bandley Dr",
    "city": "Cupertino",
    "state": "CA",
    "zip_code": "95014"
  },
  "phone": "408-996-1010"
}

{
  "name": "Larry Page",
  "url": "http://google.com",
  "title": "CEO",
  "company": "Google!",
  "address": {
    "street": "555 Bryant, #106",
    "city": "Palo Alto",
    "state": "CA",
    "zip_code": "94301"
  },
  "phone": "650-330-0100",
  "fax": "650-330-1499"
}
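
The two documents above carry different fields yet can sit in the same collection. A minimal Python sketch of what that means for application code, with a list standing in for the collection and hypothetical helper names:

```python
# Two contacts with different sets of fields (addresses omitted for brevity).
contacts = [
    {"name": "Steven Jobs", "title": "VP, New Product Development",
     "company": "Apple Computer", "phone": "408-996-1010"},
    {"name": "Larry Page", "url": "http://google.com", "title": "CEO",
     "company": "Google!", "phone": "650-330-0100", "fax": "650-330-1499"},
]

def names_with_field(docs, field):
    """Names of the documents that actually carry the given field."""
    return [d["name"] for d in docs if field in d]

def field_or_default(doc, field, default=None):
    """Read an optional field without assuming every document has it."""
    return doc.get(field, default)

print(names_with_field(contacts, "fax"))
print([field_or_default(d, "url", "(no url)") for d in contacts])
```

No schema change was needed to give the second contact a `url` and a `fax`; the cost is that readers must treat such fields as optional.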

Page 51

One-to-many embedding vs. referencing

{
  "name": "Larry Page",
  "url": "http://google.com/",
  "title": "CEO",
  "company": "Google!",
  "email": "[email protected]",
  "address": [
    { "street": "555 Bryant, #106", "city": "Palo Alto", "state": "CA", "zip_code": "94301" }
  ],
  "phones": [
    { "type": "Office", "number": "650-618-1499" },
    { "type": "fax", "number": "650-330-0100" }
  ]
}

{
  "name": "Larry Page",
  "url": "http://google.com/",
  "title": "CEO",
  "company": "Google!",
  "email": "[email protected]",
  "address": ["addr99"],
  "phones": ["ph23", "ph49"]
}

{ "_id": "addr99", "street": "555 Bryant, #106", "city": "Palo Alto", "state": "CA", "zip_code": "94301" }

{ "_id": "ph23", "type": "Office", "number": "650-618-1499" },
{ "_id": "ph49", "type": "fax", "number": "650-330-0100" }

Page 52

Many to Many

Traditional Relational Association: join table
• Contacts: name, company, title, phone
• Groups: name
• GroupContacts (join table, marked with an X): group_id, contact_id

Use arrays instead
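
A minimal sketch of the "use arrays instead" idea: each contact keeps an array of group ids, so the GroupContacts join table disappears. The group names, ids and the `group_ids` field name are hypothetical, and plain Python structures stand in for the two collections.

```python
# Groups collection, keyed by _id (hypothetical groups).
groups = {
    "g1": {"_id": "g1", "name": "Friends"},
    "g2": {"_id": "g2", "name": "Work"},
}

# Contacts collection: membership is stored as an array on each contact.
contacts = [
    {"_id": 1, "name": "Steven Jobs", "group_ids": ["g1", "g2"]},
    {"_id": 2, "name": "Larry Page", "group_ids": ["g2"]},
]

def groups_of(contact):
    """Groups a contact belongs to, resolved from its array of ids."""
    return [groups[gid]["name"] for gid in contact["group_ids"]]

def members_of(group_id):
    """Contacts whose array contains the group id (the other direction)."""
    return [c["name"] for c in contacts if group_id in c["group_ids"]]

print(groups_of(contacts[0]))
print(members_of("g2"))
```

Both directions of the many-to-many relationship are answered from the same array field, which is why no separate association collection is needed.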

Page 53

Address Book Entity-Relationship

[Entity-relationship diagram; relationships between entities are marked 1 and N]
• Contacts: name, company, title
• Addresses: type, street, city, state, zip_code
• Phones: type, number
• Emails: type, address
• Thumbnails: mime_type, data
• Portraits: mime_type, data
• Groups: name
• Twitters: name, location, web, bio

Page 54

Document model - holistic and efficient representation

[Diagram; remaining relationships are marked 1 and N]
• Contacts: name, company, title
  – addresses: type, street, city, state, zip_code
  – phones: type, number
  – emails: type, address
  – thumbnail: mime_type, data
  – twitter: name, location, web, bio
• Portraits: mime_type, data
• Groups: name

Page 55

Contact document example

{
  "name" : "Gary J. Murakami, Ph.D.",
  "company" : "MongoDB, Inc",
  "title" : "Lead Engineer and Ruby Evangelist",
  "twitter" : {
    "name" : "GaryMurakami",
    "location" : "New Providence, NJ",
    "web" : "http://www.nobell.org"
  },
  "portrait_id" : 1,
  "addresses" : [
    { "type" : "work", "street" : "229 W 43rd St.", "city" : "New York", "zip_code" : "10036" }
  ],
  "phones" : [
    { "type" : "work", "number" : "1-866-237-8815 x8015" }
  ],
  "emails" : [
    { "type" : "work", "address" : "[email protected]" },
    { "type" : "home", "address" : "[email protected]" }
  ]
}

Page 56

Health Care Use Cases

Page 57

360-Degree Patient View

• Healthcare provider networks have massive amounts of patient data
  – Both structured and unstructured
  – Basic patient information
  – Lab results
  – MRI images

• Centralization of data needed
  – Aggregation of all the data in one repository

• Analytics

Page 58

Population Management for At-Risk Demographics

• Certain populations are known to be prone to certain diseases.

• By analyzing data, insurers can help people take preventative measures
  – for example, by reminding them to get regularly scheduled colonoscopies

• This helps insurers to reduce costs and to expand margins.

Page 59

Lab Data Management and Analytics

• Strain on traditional technological systems:
  – Rise in the number of tests conducted
  – Rise in the variety of data collected
  – Lack of flexibility

• With MongoDB’s flexible data model, providers of lab testing, genomics and clinical pathology can:
  – Ingest, store and analyze a variety of data types
  – Coming from numerous sources, all in a single data store

• This enables these companies to generate new insights and revenue streams

Page 60

Other use cases for MongoDB in healthcare include:

• Fraud Detection

• Remote Monitoring and Body Area Networks

• Mobile Apps for Doctors and Nurses

• Pandemic Detection with Real-Time Geospatial Analytics

• Electronic Healthcare Records (EHR)

• Advanced Auditing Systems for Compliance

• Hospital Equipment Management and Optimization

Page 61

Thank You

Solutions Architect, MongoDB

Massimo Brignoli

[email protected]

@massimobrignoli

#MongoDB
