Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel

Advanced MongoDB for Development, Deployment and Operation - The Sequel Daniel Coupal Technical Services Engineer, Palo Alto, CA #MongoDB Silicon Valley Code Camp 2015

Page 1: Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel

Advanced MongoDB for Development, Deployment and Operation - The Sequel Daniel Coupal Technical Services Engineer, Palo Alto, CA

#MongoDB

Silicon Valley Code Camp 2015

Page 2

• Making you successful in developing, deploying and operating an application with MongoDB

•  I do expect you to know the basics of MongoDB.

• …even better if you already have an application that is deployed or about to be deployed

This presentation is about …

Page 3

I hope you walk out of this presentation and make at least one change to your application, deployment, configuration, etc. that will prevent one issue from happening.

My Goal

Page 4

1.  The Story of MongoDB

2.  The Story of your Application

– Chapter 1: Prototype and Development

– Chapter 2: Deployment

– Chapter 3: Operation

3.  Wrapping up

4.  Q&A

Agenda

Page 5

1. The Story of MongoDB

Page 6

The Sun was shining on the land of the Oracle …

Once upon a time

Page 7

Then came the Web

Page 8

We’re gonna need a bigger database

Page 9

•  Originally, 10gen

– Founded in 2007

– Released MongoDB 1.0 in 2009

•  MongoDB Inc

– Since 2013

– Acquired WiredTiger in 2014

•  MongoDB

– Open source software

– Contributions on tools, drivers, …

– Most popular NoSQL database

MongoDB - Timeline

Page 10

MongoDB - Company Overview

450+ employees

1,000+ customers

Over $300 million in funding

30+ offices around the world

Page 11

Positions open in Palo Alto, Austin and NYC

• http://www.mongodb.com/careers/positions

Technical service engineers in Palo Alto

• MongoDB

• MongoDB Tools

• Proactive support

MongoDB - We hire!

Page 12

2. The Story of your Application

Page 13

1.  Schema, schema, schema!

2.  Incorporate testability in your application

3.  Think about data sizing and growth

4.  What happens when a failure is returned by the database?

5.  Index correctly

6.  Performance Tuning

Chapter 1 - Prototype and Development

Page 14

•  Relational world

1.  Model your data

2.  Write the application against your data

•  NoSQL world

1.  Define what you want to do with the data

=> What are your queries?

2.  Model your data

Schema, schema, schema
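
As a minimal sketch of the query-first approach above, assume a hypothetical blog application whose main query is "show one post together with its comments". Embedding the comments in the post document lets a single lookup serve that query. The field names and the `render_post` helper are illustrative assumptions, not something taken from the talk.

```python
# Hypothetical document shaped around the query "show a post with its comments".
post = {
    "_id": "post-1",
    "title": "Schema design",
    "author": "alice",
    "comments": [  # embedded, so one lookup returns everything the page needs
        {"user": "bob", "text": "Nice post"},
        {"user": "carol", "text": "Thanks"},
    ],
}


def render_post(doc):
    """Build the page text from a single document, with no second query."""
    lines = [f"{doc['title']} by {doc['author']}"]
    lines += [f"  {c['user']}: {c['text']}" for c in doc["comments"]]
    return "\n".join(lines)


print(render_post(post))
```

If the dominant query were instead "all comments written by one user", a separate comments collection would probably be the better fit; the queries drive the model.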

Page 15

•  Test Driven Development

•  Ask yourself: how can I test that this piece is working?

•  TIP: MongoDB does not need a schema and it creates databases and collections (tables) on the fly

– Incorporate username, hostname, timestamps in database names for unit tests

Incorporate testability in the application
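
The tip above could be sketched as follows. The helper name `unit_test_db_name`, its parameters, and the `test` prefix are assumptions made for illustration. The sanitizing step reflects the fact that MongoDB database names cannot contain dots and must be shorter than 64 characters.

```python
import os
import re
import socket
import time


def unit_test_db_name(prefix="test", user=None, host=None, now=None):
    """Return a database name unique per user, host and start time."""
    user = user or os.environ.get("USER") or os.environ.get("USERNAME") or "unknown"
    host = host or socket.gethostname()
    stamp = str(int(time.time()) if now is None else int(now))
    # Keep only letters, digits and underscores (hostnames often contain
    # dots and dashes; dots are not allowed in database names).
    head = re.sub(r"[^A-Za-z0-9_]", "_", f"{prefix}_{user}_{host}")
    # Database names must be shorter than 64 characters, so trim the head
    # rather than the timestamp that makes the name unique.
    max_head = 63 - len(stamp) - 1
    return f"{head[:max_head]}_{stamp}"


print(unit_test_db_name())
```

Because each test run gets its own database, parallel runs by different developers or CI hosts do not collide, and a run can drop its database when it finishes.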

Page 16

•  How much data will you have initially?

•  How will your data set grow over time?

•  How big is your working set?

•  Will you be loading huge bulk inserts, or have a constant stream of writes?

•  How many reads and writes will you need to service per second?

•  What is the peak load you need to provision for?

Think about data sizing and growth
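
A back-of-the-envelope estimate is often enough to start answering the first two questions. The sketch below assumes linear growth and a fixed average document size. All numbers are hypothetical, and it ignores indexes, padding and compression.

```python
def projected_data_size_gb(initial_docs, docs_per_day, avg_doc_bytes, days):
    """Estimate raw data size in GB after `days`, assuming linear growth."""
    total_docs = initial_docs + docs_per_day * days
    return total_docs * avg_doc_bytes / 1024**3


# Hypothetical workload: 10 million documents today, 200,000 new documents
# per day, 2 KB per document on average.
for days in (0, 365, 730):
    size = projected_data_size_gb(10_000_000, 200_000, 2048, days)
    print(f"day {days:>3}: about {size:,.1f} GB before indexes and compression")
```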

Page 17

•  Have a good model and understanding of latencies and write concerns

•  Catch exceptions

•  Retries

•  …

What happens when a failure is returned by the database?
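
One possible shape for the "catch exceptions" and "retries" points is a small wrapper with exponential backoff. This is a sketch, not a driver feature: `TransientError` is a hypothetical stand-in for whatever retryable exception your driver raises (with PyMongo that would typically be `pymongo.errors.AutoReconnect`), and the attempt count and delays are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a driver error that is worth retrying (hypothetical)."""


def with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying on TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            # Wait base_delay, then 2x, then 4x, ... plus some jitter so that
            # many clients do not retry at the same instant.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))


# Usage with a fake operation that fails twice before succeeding.
calls = {"n": 0}


def flaky_insert():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("primary stepped down")
    return "ok"


print(with_retries(flaky_insert, sleep=lambda seconds: None))
```

Retrying is only safe for idempotent operations: a write that returned an error may still have been applied, so a blind retry of something like an increment can apply it twice.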

Page 18

•  Indexing is behind more than 50% of customer issues

•  Collection Scan

– Very bad if you have a large collection

– One of the main performance issues seen in our customers’ applications

– Can be identified in the logs with the ‘nscannedObjects’ attribute on slow queries

•  Watch out for updates to the Application

Index correctly
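
As a rough sketch of the log check mentioned above, the function below flags log lines where many documents were examined for few returned. It assumes the slow-query lines contain `nscannedObjects:<n>` and `nreturned:<n>` tokens. Log formats differ between MongoDB versions, so treat the patterns and the thresholds as assumptions to adapt, not as a description of a specific release.

```python
import re

# Assumed token formats; actual log lines differ between MongoDB versions.
SCANNED = re.compile(r"\bnscannedObjects:(\d+)")
RETURNED = re.compile(r"\bnreturned:(\d+)")


def suspicious_scans(lines, min_scanned=10_000, max_ratio=100):
    """Yield (scanned, returned, line) for likely collection scans."""
    for line in lines:
        scanned = SCANNED.search(line)
        returned = RETURNED.search(line)
        if not scanned or not returned:
            continue
        n_scanned = int(scanned.group(1))
        n_returned = int(returned.group(1))
        # Many documents examined for few returned suggests a missing index.
        if n_scanned >= min_scanned and n_scanned > max_ratio * max(n_returned, 1):
            yield n_scanned, n_returned, line.rstrip()


# Typical use (the path is hypothetical):
#     with open("mongod.log") as log:
#         for scanned, returned, line in suspicious_scans(log):
#             print(scanned, returned, line)
```

Running a check like this after every application release helps with the last bullet: a new query shipped without its index shows up as soon as it hits the logs.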

Page 19

1.  Assess the problem and establish acceptable behavior

2.  Measure the current performance

3.  Find the bottleneck*

4.  Remove the bottleneck

5.  Re-test to confirm

6.  Repeat

* - (This is often the hard part)

(Adapted from http://en.wikipedia.org/wiki/Performance_tuning )

Performance Tuning
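
Steps 2 and 5 both need a repeatable measurement. A minimal sketch, assuming the operation under test can be wrapped in a Python callable; the helper name and the choice of statistics are illustrative only.

```python
import statistics
import time


def measure_latency_ms(operation, runs=100):
    """Time `operation` repeatedly and summarize the latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {
        "median": statistics.median(samples),
        # Approximate 95th percentile taken from the sorted samples.
        "p95": samples[int(0.95 * (len(samples) - 1))],
        "max": samples[-1],
    }


# Stand-in workload; replace it with the query or request being tuned.
baseline = measure_latency_ms(lambda: sum(range(10_000)))
print(baseline)
```

Keeping the baseline numbers from step 2 makes step 5 a simple comparison instead of a judgment call.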

Page 20

1.  Deployment topology

2.  Have a test/staging environment

– Track slow queries and collection scans

3.  MongoDB production notes

–  http://docs.mongodb.org/manual/administration/production-notes

4.  Storage considerations

5.  Host considerations

Chapter 2 - Deploy

Page 21

•  Sharding or not?

•  3 data nodes per replica set or 2 data nodes + arbiter?

•  Many Data Centers or availability zones

•  What is important for you?

– Durability of writes

– Performance

=> can be chosen per operation

Deployment topology
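
The durability/performance trade-off is expressed through the write concern, which can differ from one operation to the next. The sketch below only builds the write concern documents using the standard `w`, `j` and `wtimeout` options. The operation kinds and the 5000 ms timeout are hypothetical, and passing the result to a driver (for example through PyMongo's `WriteConcern` class) is left out.

```python
# Write concern documents using the standard `w`, `j` and `wtimeout` options.
FAST = {"w": 1}  # acknowledged by the primary only
DURABLE = {"w": "majority", "j": True, "wtimeout": 5000}  # majority + journal


def write_concern_for(operation_kind):
    """Pick a write concern per operation (the kinds are hypothetical)."""
    critical = {"payment", "order"}
    return DURABLE if operation_kind in critical else FAST


for kind in ("page_view", "payment"):
    print(kind, write_concern_for(kind))
```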

Page 22

•  Best if the capacity matches the production deployment

– Otherwise, if prod is 20 shards x 3 nodes, you can have 2 shards x 3 nodes, or 20 shards x 1 node

•  Data size should be representative

–  Start with simulated data

–  Use backup of production data

•  Disable table/collection scans or scan the logs for them

Have a test/staging environment

Page 23

•  The most important piece of MongoDB documentation

–  http://docs.mongodb.org/v3.0/administration/production-notes/

•  Security checklist

–  Authentication, limit network exposure, … audit system activity

•  Allocate Sufficient RAM and CPU

•  MongoDB and NUMA Hardware

•  Platform Specific Considerations

–  Turn off atime for the storage volume containing the database files.

–  Set the file descriptor limit (ulimit -n) and the user process limit (ulimit -u) above 20,000

–  MongoDB on Virtual Environments

MongoDB Production notes
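
The ulimit recommendation can be checked from a script. This sketch uses Python's standard `resource` module (Unix only) and compares the soft limits against the 20,000 threshold from the slide. It reports the limits of the process that runs it, so it is only meaningful when run as the user, and in the environment, that starts `mongod`.

```python
import resource

RECOMMENDED = 20_000  # threshold taken from the slide above


def check_ulimits(minimum=RECOMMENDED):
    """Return {name: (soft_limit, ok)} for open files and user processes."""
    limits = {
        "open files (-n)": resource.RLIMIT_NOFILE,
        "user processes (-u)": resource.RLIMIT_NPROC,
    }
    report = {}
    for name, which in limits.items():
        soft, _hard = resource.getrlimit(which)
        unlimited = soft == resource.RLIM_INFINITY
        report[name] = (soft, unlimited or soft > minimum)
    return report


for name, (soft, ok) in check_ulimits().items():
    print(f"{name}: soft limit {soft} -> {'ok' if ok else 'too low'}")
```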

Page 24

•  RAID

=> 0+1 or None

•  HDD or SSD

=> SSD, if budget permits

•  NAS, SAN or Direct Attached?

=> Direct Attached; good news, those are the cheapest!

•  File System type

MMAPv1 => ext4

WiredTiger => xfs

•  Settings

=> ReadAhead

Storage considerations

Page 25

• CPU power?

– MMAPv1 => no

– WiredTiger => yes, needed for compression

• RAM?

– Yes! RAM is always orders of magnitude faster than disk

Host considerations

Page 26

1.  Monitor

2.  Upgrade

3.  Backup

4.  Troubleshoot

Chapter 3 - Operation

Page 27

“Shit will happen!”

• Are you prepared?

• Have backups?

• Have a good picture of your “normal state”

Disaster will strike
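
Having a picture of the "normal state" can be as simple as keeping a history of a metric and flagging values that stray far from it. A minimal sketch, assuming you already collect such samples; the metric, the sample values and the three-sigma threshold are all hypothetical choices.

```python
import statistics


def is_abnormal(history, current, sigmas=3.0):
    """Flag `current` if it is more than `sigmas` std devs from the baseline mean."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return abs(current - mean) > sigmas * spread


# Hypothetical samples of operations per second during normal operation.
normal_ops_per_sec = [980, 1010, 1005, 995, 1020, 990, 1000]
print(is_abnormal(normal_ops_per_sec, 1015))
print(is_abnormal(normal_ops_per_sec, 2500))
```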

Page 28

•  iostat, top, vmstat, sar

• mongostat, mongotop

• CloudManager/OpsManager Monitoring – plus Munin extensions

Monitor

Page 29

• Within a major version, each new minor version keeps the same binary format, same protocol, etc.

• Major versions have upgrade and downgrade paths

• CloudManager and OpsManager Automation handles automatic upgrades

Upgrade

Page 30

Criterion                              Mongodump  File system  CloudManager Backup  OpsManager Backup
Initial complexity                     Medium     High         Low                  High
System overhead                        High       Low          Low                  Medium
Point in time recovery of replica set  No *       No *         Yes                  Yes
Consistent snapshot of sharded system  No *       No *         Yes                  Yes
Scalable                               No         Yes          Yes                  Yes
Restore time                           Slow       Fast         Medium               Medium

Comparing MongoDB backup approaches

* Possible, but you need to write the tools and go through a lot of testing

Page 31

• mtools – https://github.com/rueckstiess/mtools/wiki

(just Google: github mtools)

Troubleshoot

Page 32

namespace                  pattern                                      count  min (ms)  max (ms)  mean (ms)  sum (ms)
serverside.scrum_master    {"datetime_used": {"$ne": 1}}                   20     15753     17083      16434    328692
serverside.django_session  {"_id": 1}                                     562       101      1512        317    178168
serverside.user            {"_types": 1, "emails.email": 1}               804       101      1262        201    162311
local.slaves               {"_id": 1, "host": 1, "ns": 1}                 131       101      1048        310     40738
serverside.email_alerts    {"_types": 1, "email": 1, "pp_user_id": 1}      13       153     11639       2465     32053
serverside.sign_up         {"_id": 1}                                      77       103       843        269     20761
serverside.user_credits    {"_id": 1}                                       6       204       900        369      2218
serverside.counters        {"_id": 1, "_types": 1}                          8       121       500        263      2111
serverside.auth_sessions   {"session_key": 1}                               7       111       684        277      1940
serverside.credit_card     {"_id": 1}                                       5       145       764        368      1840
serverside.email_alerts    {"_types": 1, "request_code": 1}                 6       143       459        277      1663
serverside.user            {"_id": 1, "_types": 1}                          5       153       427        320      1601
serverside.user            {"emails.email": 1}                              2       218       422        320       640
serverside.user            {"_id": 1}                                       2       139       278        208       417
serverside.auth_sessions   {"session_endtime": 1, "session_userid": 1}      1       244       244        244       244
serverside.game_level      {"_id": 1}                                       1       104       104        104       104

Troubleshoot – Slow Queries
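
To make the columns of the table above concrete, here is a sketch of how such a summary could be derived once slow queries have been extracted as (namespace, pattern, duration) tuples. It is an illustration of the grouping only, not mtools' actual implementation, and the input format is an assumption.

```python
from collections import defaultdict


def summarize_slow_queries(entries):
    """Group (namespace, pattern, duration_ms) tuples and sort by total time."""
    groups = defaultdict(list)
    for namespace, pattern, duration_ms in entries:
        groups[(namespace, pattern)].append(duration_ms)
    rows = [
        {
            "namespace": ns,
            "pattern": pattern,
            "count": len(durations),
            "min": min(durations),
            "max": max(durations),
            "mean": sum(durations) // len(durations),
            "sum": sum(durations),
        }
        for (ns, pattern), durations in groups.items()
    ]
    # Largest total time first: these patterns are where tuning pays off most.
    return sorted(rows, key=lambda row: row["sum"], reverse=True)
```

Sorting by the sum rather than the maximum surfaces both rare, very slow queries and moderately slow queries that run hundreds of times, which is why `serverside.django_session` ranks second in the table despite a mean of about 0.3 seconds.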

Page 33

•  Interactive graph!

Troubleshoot – Slow Queries Plot

Page 34

3. Wrapping up

Page 35

1.  Missing indexes

2.  Not testing before deploying application changes

3.  OS settings

4.  Appropriate schema

5.  Hardware

6.  Not seeking help early enough

Common Mistakes

Page 36

•  MongoDB on-line presentations: mongodb.com/presentations

•  Free MongoDB classes: university.mongodb.com

•  mtools to analyze MongoDB logs: github.com/rueckstiess/mtools

•  CloudManager: cloud.mongodb.com

Free On-Line Resources

Page 37

•  MongoDB Support

– 24x7 support

– the sun never sets on the MongoDB Customer Support Team

•  MongoDB Consulting Days

•  MongoDB World (@NYC on June 28-29, 2016)

•  MongoDB Days (@SanJose on Dec 3, 2015)

Resources

Page 38

• Use available resources

• Testing – Plan for it, plan resources for it, and do it in a Test or Staging environment before deploying

Summary

Page 39

I hope you walk out of this presentation and make at least one change to your application, deployment, configuration, etc. that will prevent one issue from happening.

Take away

Page 40

4. Q&A

Page 41

Positions open in Palo Alto, Austin and NYC

•  http://www.mongodb.com/careers

MongoDB for Giant Ideas