JUNE 2014
Performance Tuning on the Fly at CMP.LY
Michael De Lorenzo
CTO, CMP.LY Inc.
@mikedelorenzo
Agenda
• CMP.LY and CommandPost
• What is MongoDB Management Service?
• Performance Tuning
• MongoDB Issues we’ve faced
• Slow response times and delayed writes
• Unindexed queries
• Increased Replication Lag and Plummeting oplog Window
• Keep your deployment healthy with MMS
• Using MMS Alerts
• Using MMS Backups
CMP.LY: A venture-funded NYC startup that offers proprietary social media monitoring, measurement, insight and compliance solutions for the Fortune 100
CommandPost: A Monitoring, Measurement & Insights (MMI) tool for managed social communications.
Use CommandPost to:
• Track and measure cross-platform in real-time
• Identify and attribute high-value engagement
• Analyze and segment engaged audience
• Optimize content and engagement strategies
• Address compliance needs
What is MongoDB Management Service?
MongoDB Management Service
• Free MongoDB Monitoring
• MongoDB Backup in the Cloud
• Free Cloud service or available to run On-Prem for Standard or Enterprise Subscriptions
• Automation coming soon—FTW!
Ops
Makes MongoDB easier to use and manage
Who Is MMS for?
• Developers
• Ops Team
• MongoDB Technical Service Team
Performance Tuning
How To Do Performance Tuning?
• Assess the problem and establish acceptable behavior.
• Measure the performance before modification.
• Identify the bottleneck.
• Remove the bottleneck.
• Measure performance after modification to confirm.
• Keep it or revert it and repeat.
Adapted from [http://en.wikipedia.org/wiki/Performance_tuning]
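The measure, change, measure, keep-or-revert loop above can be sketched in code. This is a minimal illustration and not a tool from the talk: `tuneOnce`, `measure`, `applyChange` and `revertChange` are hypothetical names, and `measure` is assumed to return a latency in milliseconds where lower is better.

```javascript
// Hypothetical helper: apply one candidate change and keep it only if the
// measured latency improves; otherwise revert it.
function tuneOnce({ measure, applyChange, revertChange }) {
  const before = measure();   // measure performance before modification
  applyChange();              // attempt to remove the bottleneck
  const after = measure();    // measure again to confirm
  const kept = after < before;
  if (!kept) revertChange();  // revert when there is no improvement
  return { before, after, kept };
}

// Usage with a fake system whose latency depends on a single setting.
const system = { indexed: false };
const tuningResult = tuneOnce({
  measure: () => (system.indexed ? 40 : 900), // made-up latencies in ms
  applyChange: () => { system.indexed = true; },
  revertChange: () => { system.indexed = false; },
});
console.log(tuningResult); // { before: 900, after: 40, kept: true }
```

In practice `measure` would be an average over many requests rather than a single reading, since one sample is rarely enough to tell a real improvement from noise.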
What We’ve Faced
Issues We’ve Faced
• Concurrency Issues
• Slow response times and delayed writes
• Querying without indexes
• Slow reads, timeouts
• Increasing Replication Lag + Plummeting oplog Window
Concurrency
Slow responses and delayed writes
Concurrency
• What is it?
• How did it affect us?
• How did MMS help identify it?
• How did we diagnose the issue in our app and fix it?
• Today
Concurrency in MongoDB
• MongoDB uses a readers-writer lock
• Many read operations can use a read lock
• If a write lock exists, a single write operation holds the lock exclusively
• No other read or write operations can share the lock
• Locks are “writer-greedy”: pending writes take precedence over reads
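To make the locking rules above concrete, here is a toy model of a writer-greedy readers-writer lock. It is only a sketch for illustration: the class `WriterGreedyLock` and its methods are hypothetical and say nothing about how MongoDB implements its locks internally.

```javascript
// Toy model of a writer-greedy readers-writer lock (not MongoDB's code).
class WriterGreedyLock {
  constructor() {
    this.activeReaders = 0;
    this.writerActive = false;
    this.waitingWriters = [];
    this.waitingReaders = [];
  }

  // Readers share the lock, but queue up if a writer holds it or is waiting.
  acquireRead(onGranted) {
    if (!this.writerActive && this.waitingWriters.length === 0) {
      this.activeReaders += 1;
      onGranted();
    } else {
      this.waitingReaders.push(onGranted);
    }
  }

  // A writer needs the lock exclusively: no readers and no other writer.
  acquireWrite(onGranted) {
    if (!this.writerActive && this.activeReaders === 0) {
      this.writerActive = true;
      onGranted();
    } else {
      this.waitingWriters.push(onGranted);
    }
  }

  releaseRead() {
    this.activeReaders -= 1;
    this.dispatch();
  }

  releaseWrite() {
    this.writerActive = false;
    this.dispatch();
  }

  // Waiting writers are served before waiting readers.
  dispatch() {
    if (this.writerActive) return;
    if (this.waitingWriters.length > 0) {
      if (this.activeReaders === 0) {
        this.writerActive = true;
        this.waitingWriters.shift()();
      }
      return;
    }
    while (this.waitingReaders.length > 0) {
      this.activeReaders += 1;
      this.waitingReaders.shift()();
    }
  }
}

const lock = new WriterGreedyLock();
const grantOrder = [];
lock.acquireRead(() => grantOrder.push("read-1"));   // granted immediately
lock.acquireWrite(() => grantOrder.push("write-1")); // waits for read-1
lock.acquireRead(() => grantOrder.push("read-2"));   // waits behind the writer
lock.releaseRead();  // read-1 finishes, write-1 is granted
lock.releaseWrite(); // write-1 finishes, read-2 is granted
console.log(grantOrder); // [ 'read-1', 'write-1', 'read-2' ]
```

Note that `read-2` waits even though only a reader held the lock when it arrived. That is the "greedy" part: a queued writer blocks new readers, so a steady stream of writes can starve reads.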
How Did This Affect Us?
• Slow API response times due to slow database operations
• Delayed writes
• Backed up queues
MMS: Identify Concurrency Issues
Lock % Greater than 100%?!?!?
• Lock % is the time spent in the write lock state: the sum of the global lock and the hottest database's lock at that time, which can push the value above 100%
• Global lock percentage is a derived metric:
% of time in global lock (small number)
+ % of time locked by hottest (“most locked”) database
• Because the data is sampled and combined, it is possible to see values over 100%.
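The derived metric described above is a simple sum, which a short sketch makes visible. The function name `derivedLockPercent` and all input numbers are invented for illustration; they are not MMS output or figures from the talk.

```javascript
// Sketch of the derived metric: global lock % plus the lock % of the
// hottest ("most locked") database. Inputs are made-up sample values.
function derivedLockPercent(globalLockPct, dbLockPcts) {
  const hottest = Math.max(...Object.values(dbLockPcts));
  return globalLockPct + hottest;
}

const lockSample = derivedLockPercent(4, { app: 62, events: 98, logs: 11 });
console.log(lockSample); // 102
```

Because the two terms are sampled separately and then added, a nearly saturated database plus a small global component is enough to report more than 100%.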
Diagnosis
• Identified the write-heavy collections in our applications
• Used application logs to identify slow API responses
• Analyzed MongoDB logs to identify slow database queries
Our Remedies
• Schema changes
• Message queues
• Multiple databases
• Sharding
Schema Changes
• Denormalized our schema
• Allowed for atomic updates
• Customized documents’ _id attribute
• Leveraged existing index on _id attribute
Modeling for Atomic Operations
Document
{
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly",
  available: 3,
  checkout: [ { by: "joe", date: ISODate("2012-10-15") } ]
}
Update Operation
db.books.update(
  { _id: 123456789, available: { $gt: 0 } },
  {
    $inc: { available: -1 },
    $push: { checkout: { by: "abc", date: new Date() } }
  }
)
Result
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
Message Queues
• Controlled writes to specific collections using Pub/Sub
• We chose Amazon SQS
• Other options include Redis, Beanstalkd, IronMQ or any other message queue
• Created consistent flow of writes versus bursts
• Reduced length and frequency of write locks by controlling flow/speed of writes
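The effect of putting a queue in front of the database can be shown with a small simulation. This is a sketch under stated assumptions: `smoothWrites`, `arrivalsPerTick` and `maxWritesPerTick` are hypothetical names, and a real consumer would read from SQS or another queue rather than from an array.

```javascript
// Simulate a burst of writes being queued and drained at a fixed rate.
// arrivalsPerTick: how many writes arrive in each time slot (made up).
// maxWritesPerTick: how many writes the consumer sends to the database.
function smoothWrites(arrivalsPerTick, maxWritesPerTick) {
  if (maxWritesPerTick <= 0) throw new Error("maxWritesPerTick must be positive");
  const writesPerTick = [];
  let queued = 0;
  let tick = 0;
  // Keep ticking until all arrivals are consumed and the queue is empty.
  while (tick < arrivalsPerTick.length || queued > 0) {
    queued += arrivalsPerTick[tick] || 0;
    const written = Math.min(queued, maxWritesPerTick);
    writesPerTick.push(written);
    queued -= written;
    tick += 1;
  }
  return writesPerTick;
}

const smoothed = smoothWrites([10, 0, 0, 0], 3);
console.log(smoothed); // [ 3, 3, 3, 1 ]
```

The burst of ten writes still reaches the database, but as a steady trickle. The trade-off is latency: the last write lands three ticks later than it would have without the queue.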
Using Multiple Databases
• As of version 2.2, MongoDB implements locks at a per-database granularity for most read and write operations
• Planned to be at the document level in version 2.8
• Moved write-heavy collections to new (separate) databases
Using Sharding
• Improves concurrency by distributing databases across multiple mongod instances
• Locks are per-mongod instance
Lock %: Today
Queries without Indexes
Slow responses and timeouts
Indexing
• What is it?
• How did it affect us?
• How did MMS help identify it?
• How did we diagnose the issue in our app and fix it?
• Today
Indexing with MongoDB
• Support for efficient execution of queries
• Without indexes, MongoDB must scan every document
• Example
Wed Jul 17 13:40:14 [conn28600] query x.y [snip] ntoreturn:16 ntoskip:0 nscanned:16779 scanAndOrder:1 keyUpdates:0 numYields: 906 locks(micros) r:46877422 nreturned:16 reslen:6948 38172ms
38 seconds! Scanned 17k documents, returned 16
• Create indexes to support all queries, especially common and user-facing ones
• Collection scans can push entire working set out of RAM
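The example log line above already contains the numbers needed to spot an unindexed query. The parser below is a minimal sketch that pulls them out; `parseSlowQuery` is a hypothetical helper written for this illustration and is not part of mtools or MMS.

```javascript
// Pull a few counters out of a MongoDB 2.x-style slow query log line.
// The field names come from the example line; the parser is an assumption.
function parseSlowQuery(line) {
  const field = (name) => {
    const match = line.match(new RegExp(`\\b${name}:\\s*(\\d+)`));
    return match ? Number(match[1]) : null;
  };
  const duration = line.match(/(\d+)ms\s*$/);
  return {
    nscanned: field("nscanned"),
    nreturned: field("nreturned"),
    millis: duration ? Number(duration[1]) : null,
  };
}

const logLine =
  "Wed Jul 17 13:40:14 [conn28600] query x.y [snip] ntoreturn:16 ntoskip:0 " +
  "nscanned:16779 scanAndOrder:1 keyUpdates:0 numYields: 906 " +
  "locks(micros) r:46877422 nreturned:16 reslen:6948 38172ms";

const queryStats = parseSlowQuery(logLine);
console.log(queryStats); // { nscanned: 16779, nreturned: 16, millis: 38172 }
console.log(Math.round(queryStats.nscanned / queryStats.nreturned)); // 1049
```

A scanned-to-returned ratio near 1 suggests the query is well served by an index. Here roughly 1,049 documents were examined for each one returned, which together with `scanAndOrder:1` points to a missing index.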
How Did this Affect Us?
• Our web apps became slow
• Queries began to time out
• Longer operations mean longer lock times
MMS: Identifying Indexing Issues
Page Faults
• The number of times that MongoDB requires data not located in physical memory, and must read from virtual memory.
Diagnosis
• Log Analysis
• Use mtools to analyze MongoDB logs
• mlogfilter: filter logs for slow queries, collection scans, etc.
• mplotqueries: graph query response times and volumes
• https://github.com/rueckstiess/mtools
Diagnosis
• Monitoring application logs
• Enabling ‘notablescan’ option in development and testing versions of apps
• MongoDB profiling
The MongoDB Profiler
• Collects fine-grained data about MongoDB write operations, cursors, and database commands on a running mongod instance.
• The default slowOpThreshold value is 100 ms; it can be changed from the mongo shell
Our Remedies
• Add indexes!
• Make sure queries are covered
• Utilize the projection specification to limit fields (data) returned
Adding Indexes
• Improved performance for common queries
• Alleviates the need to go to disk for many operations
Projection Specification
Controls the amount of data that needs to be (de-)serialized for use in your app
• We used it to limit data returned in embedded documents and arrays
db.inventory.find( { type: 'food' }, { item: 1, qty: 1 } )
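To show what the projection in the query above saves, the sketch below imitates an inclusion projection on a plain object. `project` is a hypothetical helper that only handles top-level fields, and the sample document is invented; the real work happens inside MongoDB, not in application code like this.

```javascript
// Imitates an inclusion projection on one document. Like MongoDB, it
// keeps _id unless the spec explicitly sets _id to 0.
function project(doc, spec) {
  const out = {};
  if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const key of Object.keys(spec)) {
    if (key !== "_id" && spec[key] === 1 && key in doc) out[key] = doc[key];
  }
  return out;
}

const inventoryItem = {
  _id: 1,
  type: "food",
  item: "apple",
  qty: 25,
  // A large embedded array, the kind of field a projection leaves behind.
  reviews: new Array(200).fill({ stars: 5, text: "made-up review text" }),
};

const slim = project(inventoryItem, { item: 1, qty: 1 });
console.log(slim); // { _id: 1, item: 'apple', qty: 25 }
console.log(JSON.stringify(inventoryItem).length > JSON.stringify(slim).length); // true
```

The projected result carries three small fields instead of the full document, so there is far less to send over the wire and deserialize in the app.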
Page Faults: Today
Increasing Replication Lag + Plummeting oplog Window
Replication
• What is it?
• How did it affect us?
• How did MMS help identify it?
• How did we diagnose the issue in our app?
• How did we fix it?
• Today
What is Replication?
• A replica set is a group of mongod processes that maintain the same data set.
• Replica sets provide redundancy and high availability, and are the basis for all production deployments
What Is the Oplog?
• A special capped collection that keeps a rolling record of all operations that modify the data stored in your databases.
• Operations are first applied on the primary and then recorded to its oplog.
• Secondary members then copy and apply these operations in an asynchronous process.
What is Replication Lag?
• A delay between an operation on the primary and the application of that operation from the oplog to the secondary.
• Effects of excessive lag
• “Lagged” members ineligible to quickly become primary
• Increases the possibility that distributed read operations will be inconsistent.
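As a rough illustration of the definition above, lag can be expressed as the gap between the newest operation applied on the primary and on a secondary. The function name and timestamps below are made up; in a real deployment these values come from replica set status or from MMS.

```javascript
// Replication lag as the gap between the latest operation time on the
// primary and on a secondary. Timestamps are invented examples.
function replicationLagSeconds(primaryOptime, secondaryOptime) {
  return Math.max(0, (primaryOptime - secondaryOptime) / 1000);
}

const primaryOptime = new Date("2014-06-01T12:00:30Z");
const secondaryOptime = new Date("2014-06-01T12:00:05Z");
const lagSeconds = replicationLagSeconds(primaryOptime, secondaryOptime);
console.log(lagSeconds); // 25
```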
How did this affect us?
• Degraded overall health of our production deployment.
• Distributed reads are no longer eventually consistent.
• Unable to bring new secondary members online.
• Caused MMS Backups to do full re-syncs.
Identifying Replication Lag Issues with MMS
The Replication Lag chart displays the lag for your deployment
Diagnosis
• Possible causes of replication lag include network latency, disk throughput, concurrency and/or write concern settings
• Size of operations to be replicated
• Confirmed Non-Issues for us
• Network latency
• Disk throughput
• Possible Issues for us
• Concurrency/write concern
• Size of op is an issue because entire document is written to oplog
Concurrency/Write Concern
• Our applications apply many updates very quickly
• All operations need to be replicated to secondary members
• We use the default write concern: Acknowledged
• The mongod confirms receipt of the write operation
• Allows clients to catch network, duplicate key and other errors
Concurrency Wasn’t the Issue
Lock Percentage
Operation Size Was the Issue
Collection A (most active)
Total Updates: 3,373
Total Size of updates: 6.5 GB
Activity accounted for nearly 87% of total traffic
Collection B (next most active)
Total Updates: 85,423
Total Size of updates: 740 MB
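Dividing the totals above by the update counts shows why Collection A dominated. The calculation below uses only the figures on the slide and assumes 1 GB = 1024 MB and 1 MB = 1024 KB; the variable names are illustrative.

```javascript
// Average size per update, computed from the totals on the slide.
// Unit assumption: 1 GB = 1024 MB, 1 MB = 1024 KB.
const collectionA = { updates: 3373, totalMB: 6.5 * 1024 };
const collectionB = { updates: 85423, totalMB: 740 };

const avgUpdateKB = ({ updates, totalMB }) => (totalMB * 1024) / updates;

console.log(avgUpdateKB(collectionA).toFixed(0)); // about 2021 KB per update
console.log(avgUpdateKB(collectionB).toFixed(1)); // about 8.9 KB per update
```

Collection A received about 25 times fewer updates than Collection B, yet each one averaged roughly 2 MB against roughly 9 KB, so its few updates produced almost nine times as much update data.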
Fast Growing oplog causes issues
Replication oplog Window – approximate hours available in the primary’s oplog
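Since the oplog is a fixed-size capped collection, its window shrinks as the write volume grows. The back-of-the-envelope sketch below shows the relationship; `oplogWindowHours` and its inputs are hypothetical, and an actual window should be read from monitoring (or `rs.printReplicationInfo()` in the mongo shell) rather than estimated this way.

```javascript
// Estimate the oplog window from the oplog size and the rate at which
// operations fill it. All numbers are invented for illustration.
function oplogWindowHours(oplogSizeMB, oplogMBPerHour) {
  if (oplogMBPerHour <= 0) return Infinity; // nothing written, nothing rolls off
  return oplogSizeMB / oplogMBPerHour;
}

const quietWindow = oplogWindowHours(10240, 100);  // small updates
const busyWindow = oplogWindowHours(10240, 2000);  // large updates
console.log(quietWindow, busyWindow);
```

With the same 10 GB oplog, a twentyfold increase in bytes written per hour cuts the window from about 102 hours to about 5. A secondary that falls further behind than the window can no longer catch up from the oplog and needs a full resync.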
How We Fixed It
• Changed our schema
• Changed the types of updates that were made to documents
• Both allowed us to utilize atomic operations
• Led to smaller updates
• Smaller updates == less oplog space used
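The difference between rewriting a whole document and sending a targeted operator update can be approximated by comparing payload sizes. This is a rough sketch: the document is invented, and JSON length is only a stand-in for the BSON size that would be recorded in the oplog.

```javascript
// Compare the data carried by a whole-document rewrite with a targeted
// operator update. JSON length is a rough proxy for BSON size.
const post = {
  _id: "post-1", // hypothetical document
  likes: 41,
  comments: new Array(500).fill({ by: "someone", text: "placeholder" }),
};

// Option 1: read the document, change one field, write it all back.
const fullRewrite = { ...post, likes: post.likes + 1 };
// Option 2: ask the server to change just the one field.
const targetedUpdate = { $inc: { likes: 1 } };

const sizeOf = (value) => JSON.stringify(value).length;
console.log(sizeOf(fullRewrite) > 100 * sizeOf(targetedUpdate)); // true
```

Both options leave the document in the same state, but the targeted update describes only the change, which is why it takes far less oplog space than the full rewrite.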
Replication Lag: Today
oplog Window: Today
Keeping Your Deployment Healthy
MMS Alerts
Watch for Warnings
• Be warned if you
• Are running outdated versions
• Have startup warnings
• Have a mongod that is publicly visible
• Pay attention to these warnings
MMS Backups
• Engineered by MongoDB
• Continuous backup with point-in-time recovery
• Fully managed backups
Using MMS Backups
• Seeding new secondaries
• Repairing replica set members
• Development and testing databases
• Restores are free!
Summary
• Know what’s expected and “normal” in your systems
• Know when and what changes in your systems
• Utilize MMS alerts, visualizations and warnings to keep things running smoothly
Questions?
Michael De Lorenzo
CTO, CMP.LY Inc.
@mikedelorenzo