How We Fixed Our MongoDB Problems

Post on 26-Jan-2015

Page 1: How We Fixed Our MongoDB Problems

Eric Lubow

@elubow

[email protected]

#MongoDBDays

How We Fixed Our MongoDB Problems

Page 2: How We Fixed Our MongoDB Problems

Overview

The Secret

SimpleReach

Usage Patterns

Tools

Architecture Implementation

Questions

Page 3: How We Fixed Our MongoDB Problems

Page 4: How We Fixed Our MongoDB Problems

The 2 Truths

Page 5: How We Fixed Our MongoDB Problems

The Real Truth

Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data.

Page 6: How We Fixed Our MongoDB Problems

Page 7: How We Fixed Our MongoDB Problems

Page 8: How We Fixed Our MongoDB Problems

SimpleReach

Millions of URLs per day

Over 1.25 billion page views per month

500m events per day (~6k events/second)

Auto-scale 125-160 machines depending on traffic

Built a predictive measurement algorithm for the social web

Page 9: How We Fixed Our MongoDB Problems

And It Goes Like This...

C*

Vertica

Page 10: How We Fixed Our MongoDB Problems

Page 11: How We Fixed Our MongoDB Problems

Why Mongo?

Fast and easy prototyping

Low barrier to entry

B-Tree indexes and range queries

Aggregation

Everything is JSON

TTLs

Mongoid

Page 12: How We Fixed Our MongoDB Problems

Goals

Highly available

Speed

Repeatability

Data accuracy (across storage engines)

Clients should have minimal architecture knowledge

Controlled Data Flow Patterns

Control data set size

Restore capabilities for non-ephemeral data

Page 13: How We Fixed Our MongoDB Problems

Availability and Speed

Internal service architecture

Mongos on every server that talks to Mongo

Server distribution across data centers

Latest version isn’t always the greatest version

Understand how usage patterns affect Mongo

Page 14: How We Fixed Our MongoDB Problems

Repeatability - Sharded Replica Set

[Diagram: two shard member hosts built from layered images. Host labels: SHARD0000A, SHARD0000B. Role labels: PRIMARY, SECONDARY. Layer labels: BASE AMI, BASE IMAGE, ORGANIZATIONAL BASE, LAYOUT, APPLICATION GROUP, APPLICATION. Component labels: AMAZON LINUX, MONITORING, USERS, MONGOS, MONGOD, MONGOD-ARBITER.]

Page 15: How We Fixed Our MongoDB Problems

Availability - Architecture Distribution

US-EAST-1a

MONGO-SHARD-0001-B

MONGO-SHARD-0000-A

CASSANDRA-0001

CASSANDRA-0010

REDIS-0001A

VERTICA-0001

iAPI-0001

US-EAST-1b

MONGO-SHARD-0002-B

MONGO-SHARD-0001-A

CASSANDRA-0002

CASSANDRA-0011

REDIS-0001B

iAPI-0002

US-EAST-1e

MONGO-SHARD-0002-A

MONGO-SHARD-0000-B

CASSANDRA-0003

CASSANDRA-0012

VERTICA-0003

iAPI-0003

VERTICA-0002
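The placement above puts the A and B members of each Mongo shard in different availability zones. A minimal sketch of a check for that property follows; the dictionary layout and function names are illustrative assumptions, and only the host names and zones come from the slide.

```python
from collections import defaultdict

# Mongo host placement copied from the slide. Other hosts (Cassandra, Redis,
# Vertica, iAPI) are left out because only shard members are checked here.
PLACEMENT = {
    "us-east-1a": ["MONGO-SHARD-0001-B", "MONGO-SHARD-0000-A"],
    "us-east-1b": ["MONGO-SHARD-0002-B", "MONGO-SHARD-0001-A"],
    "us-east-1e": ["MONGO-SHARD-0002-A", "MONGO-SHARD-0000-B"],
}


def zones_by_shard(placement):
    """Map each shard number to the zones that hold one of its members."""
    shards = defaultdict(list)
    for zone, hosts in placement.items():
        for host in hosts:
            # "MONGO-SHARD-0001-B" -> shard "0001" (naming as shown on the slide)
            shard = host.split("-")[2]
            shards[shard].append(zone)
    return dict(shards)


def colocated_shards(placement):
    """Return shards that have two or more members in the same zone."""
    return sorted(
        shard
        for shard, zones in zones_by_shard(placement).items()
        if len(zones) != len(set(zones))
    )
```

An empty result from `colocated_shards(PLACEMENT)` means that losing a single zone cannot take out both members of any one shard.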

Page 16: How We Fixed Our MongoDB Problems

The Schrute of the Problem

Page 17: How We Fixed Our MongoDB Problems

Releases

Reasons why I update software:

Because I want the latest version

To get rid of the reminder

Page 18: How We Fixed Our MongoDB Problems

Usage Patterns

Mongos uses TCP-based flow control

Separate DBs to deal with DB level locking

Consistent access patterns

Schema design

Proper indexing

Avoid scatter/gather queries and aim for targeted queries
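To illustrate the last point: a query is targeted when mongos can route it to one shard from the shard key, and scatter/gather when it has to ask every shard. The sketch below is a deliberately conservative, simplified check written for illustration. The function name and rule are assumptions, not part of any MongoDB driver.

```python
def is_targeted(query, shard_key_fields):
    """Return True when the filter pins every shard key field to one value.

    Simplified rule: operator expressions such as {"$gt": ...} or
    {"$in": [...]} count as not targeted, although a real router can
    sometimes narrow those to a subset of shards.
    """
    for field in shard_key_fields:
        if field not in query:
            return False  # shard key field missing: every shard must be asked
        value = query[field]
        if isinstance(value, dict) and any(key.startswith("$") for key in value):
            return False  # operator expression, not a plain equality match
    return True
```

A check like this could run in tests or code review tooling to flag query filters that omit the shard key before they reach production.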

Page 19: How We Fixed Our MongoDB Problems

Consistent Access Patterns

realtime_score('score', 'realtime')

score.realtime

srt
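One plausible reading of this slide is that the same field shows up under three spellings (an accessor call, a dotted path, and a short stored key), and that all access should go through one mapping. The sketch below assumes that reading. The lookup table and helper names are hypothetical, and only the `('score', 'realtime')` to `srt` pairing comes from the slide.

```python
# Hypothetical lookup table: (group, name) -> short key stored in the document.
# Only the ('score', 'realtime') -> 'srt' entry is taken from the slide.
SHORT_KEYS = {
    ("score", "realtime"): "srt",
}


def dotted_path(group, name):
    """Readable dotted form of a field, e.g. 'score.realtime'."""
    return f"{group}.{name}"


def stored_key(group, name):
    """Translate a readable (group, name) pair into the short stored key."""
    key = SHORT_KEYS.get((group, name))
    if key is None:
        raise KeyError(f"no short key registered for {dotted_path(group, name)}")
    return key


def read_field(document, group, name, default=None):
    """Read a field from a document through the single shared mapping."""
    return document.get(stored_key(group, name), default)
```

Routing every read and write through one table keeps all callers agreeing on the stored key, so an index built on that key stays useful.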

Page 20: How We Fixed Our MongoDB Problems

Schema Design

Randomly pre-populate consistent document structures

Use $setOnInsert to pre-populate

Shard keys

Separate DBs to deal with DB level locking (volume based)

TTL

Hashed shard keys

Use $inc when possible; $set is expensive
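The pre-population and $inc points can be combined in one upsert. The sketch below only builds the update document as a plain dictionary. The counter names are taken from the hourly stats slide, while the function name and the zero-fill strategy are assumptions made for illustration.

```python
# Counter names as they appear in the hourly stats document.
COUNTERS = (
    "facebook_likes", "facebook_shares", "facebook_referrals", "pageviews",
    "twitter_tweets", "twitter_referrals", "social_actions", "social_referrals",
)


def build_hourly_update(increments):
    """Build an upsert update document for one hourly stats document.

    `increments` maps counter names to the amount to add. Counters that are
    not incremented are zero-filled on insert so every document has the same
    shape. MongoDB rejects an update that names the same field under both
    $inc and $setOnInsert, so the two sets are kept disjoint.
    """
    if not increments:
        raise ValueError("at least one counter must be incremented")
    unknown = set(increments) - set(COUNTERS)
    if unknown:
        raise ValueError(f"unknown counters: {sorted(unknown)}")
    update = {
        "$inc": {f"totals.{name}": amount for name, amount in increments.items()}
    }
    zero_fill = {f"totals.{name}": 0 for name in COUNTERS if name not in increments}
    if zero_fill:
        update["$setOnInsert"] = zero_fill
    return update
```

With PyMongo the result would be passed as the update argument of `collection.update_one(filter, update, upsert=True)`.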

Page 21: How We Fixed Our MongoDB Problems

Hourly Stats Documents

{
  "_id": BinData(5, "OWQ5NzQ0ZjgxZGUwYTdmMzM3Y2U0NDkzZGFlMGY0NTc="),
  "account_id": ObjectId("5165905f4240cf9182000069"),
  "hour": ISODate("2013-06-02T23:00:00Z"),
  "content_id": "56250f88530ecc21233be5d2384679b2",
  "totals": {
    "facebook_likes": 0,
    "facebook_shares": 1,
    "facebook_referrals": 0,
    "pageviews": 10134,
    "twitter_tweets": 16,
    "twitter_referrals": 3045,
    "social_actions": 17,
    "social_referrals": 3045
  }
}
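The `_id` in this document is a BinData value of subtype 5 (the MD5 subtype in BSON), and its base64 payload decodes to a 32-character hex string. The sketch below shows the decoding and one way such a deterministic key could be derived. Which fields are hashed, and in what format, is an assumption, since the slide does not say.

```python
import base64
import hashlib
from datetime import datetime, timezone

# Base64 payload of the _id shown on the slide.
SLIDE_ID_B64 = "OWQ5NzQ0ZjgxZGUwYTdmMzM3Y2U0NDkzZGFlMGY0NTc="


def decode_slide_id(b64_value=SLIDE_ID_B64):
    """Decode the BinData payload into the hex string it carries."""
    return base64.b64decode(b64_value).decode("ascii")


def hourly_doc_id(account_id, content_id, when):
    """Derive a repeatable key for one (account, content, hour) bucket.

    Assumption: the key is an MD5 hex digest over these three values.
    `when` must be a timezone-aware datetime; it is truncated to the hour
    in UTC so every event in the same hour maps to the same document.
    """
    hour = when.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    raw = f"{account_id}:{content_id}:{hour.strftime('%Y-%m-%dT%H')}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

A key computed from the document's own coordinates lets a writer upsert without first reading the document to find its `_id`.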

Page 22: How We Fixed Our MongoDB Problems

Daily Stats Documents

{
  "_id": BinData(5, "OWQ5NzQ0ZjgxZGUwYTdmMzM3Y2U0NDkzZGFlMGY0NTc="),
  "account_id": ObjectId("5165905f4240cf9182000069"),
  "day": ISODate("2013-06-02T00:00:00Z"),
  "content_id": "56250f88530ecc21233be5d2384679b2",
  "totals": {
    "pageviews": 10134,
    "twitter_tweets": 16,
    "social_actions": 17
  },
  "00": {
    "pageviews": 283,
    "twitter_tweets": 10,
    "social_actions": 10
  },
  "01": {
    "pageviews": 9851,
    "twitter_tweets": 6,
    "social_actions": 6
  }
}
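A daily document of this shape can be maintained with a single $inc that touches both the day totals and the two-digit hour bucket. The sketch below builds that update as a plain dictionary. The function name is hypothetical, and treating the hour keys as UTC hours is an assumption based on the `Z` timestamps in the document.

```python
from datetime import datetime, timezone


def build_daily_update(when, increments):
    """Return (day, update) for one batch of counter increments.

    `when` must be a timezone-aware datetime. `day` is midnight UTC of that
    date, matching the "day" field; the update adds each amount to both the
    "totals" sub-document and the hour bucket ("00" to "23").
    """
    utc = when.astimezone(timezone.utc)
    day = utc.replace(hour=0, minute=0, second=0, microsecond=0)
    bucket = f"{utc.hour:02d}"
    inc = {}
    for name, amount in increments.items():
        inc[f"totals.{name}"] = amount
        inc[f"{bucket}.{name}"] = amount
    return day, {"$inc": inc}
```

Because both paths are incremented in one update, the totals and the hourly buckets cannot drift apart through a partially applied write.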

Page 23: How We Fixed Our MongoDB Problems

Path of a Packet

[Diagram labels: INTERNET, EC, SC, API, FIREHOSE, Queue, Consumers, Internal API, Solr, C*, Mongo, Redis, Vertica]

Page 24: How We Fixed Our MongoDB Problems

NSQ by Bit.ly

Distributed and de-centralized topology

At least once delivery guaranteed

Multicast style message routing

Runtime discovery for consumers to find producers

Allow for maintenance windows with no downtime

Ephemeral channels for testing
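At-least-once delivery means a consumer can receive the same message more than once, which matters when the handler increments counters. A minimal sketch of a duplicate-skipping wrapper follows. The class and its bounded in-memory id cache are illustrative assumptions and do not use any NSQ client API.

```python
from collections import OrderedDict


class DedupingHandler:
    """Wrap a handler so a recently seen message id is processed only once."""

    def __init__(self, handler, max_remembered=10000):
        self._handler = handler
        self._seen = OrderedDict()  # message id -> True, oldest first
        self._max = max_remembered

    def __call__(self, message_id, body):
        """Process `body` unless `message_id` was handled recently."""
        if message_id in self._seen:
            return False  # duplicate delivery: skip the side effects
        self._handler(body)
        # Record the id only after the handler succeeds, so a failed
        # message is still processed when it is redelivered.
        self._seen[message_id] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # forget the oldest id
        return True
```

The cache is per process and bounded, so it only catches duplicates that arrive close together on the same consumer. Writes that must be exact would still need an idempotent design in the data store.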

Page 25: How We Fixed Our MongoDB Problems

Controlled Data Flow

[Diagram labels: Social Event, Collector, Social Data, Calculate Score, Write, Batch & Write (Processed Data), Batch & Write (Raw Data). Connector labels: NSQ Multicast, NSQ, NSQ.]
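The "Batch & Write" stages in this flow suggest that consumers buffer messages and write them to the store in groups instead of one at a time. A minimal sketch of such a buffer follows. The class name, the size-based flush rule, and the `write_batch` callback are assumptions made for illustration.

```python
class BatchWriter:
    """Collect items and hand them to `write_batch` in groups."""

    def __init__(self, write_batch, max_items=500):
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._write_batch = write_batch  # callable that takes a list of items
        self._max_items = max_items
        self._pending = []

    def add(self, item):
        """Buffer one item and flush once the batch is full."""
        self._pending.append(item)
        if len(self._pending) >= self._max_items:
            self.flush()

    def flush(self):
        """Write whatever is buffered and return how many items were written."""
        if not self._pending:
            return 0
        batch, self._pending = self._pending, []
        # If write_batch raises, the batch is not re-queued here; a real
        # consumer would need its own retry or requeue policy.
        self._write_batch(batch)
        return len(batch)
```

A consumer would call `add` for each message and `flush` on a timer and at shutdown, so a partly filled batch does not wait indefinitely.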

Page 26: How We Fixed Our MongoDB Problems

Problems?

Page 27: How We Fixed Our MongoDB Problems

Service Architecture

Internal API

Solr

Real-time

C*

C*

Vertica

Page 28: How We Fixed Our MongoDB Problems

Anatomy of an Endpoint

[Diagram: two endpoints, HOURLY CONTENT and TEN MINUTE CONTENT, sit between the QUERYING MACHINES and the data stores. Store labels shown for each endpoint: MONGO, MONGO, VERTICA, C*, C*. Client library labels: HELENUS, PYMONGO, PYVERTICA.]

Page 29: How We Fixed Our MongoDB Problems

Endpoint Breakout Advantages

Availability

Consistent Access Patterns

Minimal downtime changes

Smaller code deploys

Non-monolithic code base

No async necessary

Page 30: How We Fixed Our MongoDB Problems

DevOps

Monitor: Nagios, StatsD, and CloudWatch

Manage: Chef, OpsWorks, csshX, Vagrant

Know failure cases

Turn off the balancer during backups

Restart EVERYTHING on upgrade

Extensive use of AWS

Page 31: How We Fixed Our MongoDB Problems

Cloud Specifics

blockdev --setra 256

Use ephemeral storage, not EBS volumes

Use MMS

CloudWatch metrics are important and easily scriptable

Don’t use spot instances, but always expect instance loss

Kernel tuning

Page 32: How We Fixed Our MongoDB Problems

Summary

Understand your usage patterns

Know the common failure cases

Architecture distribution

Homogeneous Distribution

Monitoring & Automation

Page 33: How We Fixed Our MongoDB Problems

We’re Hiring

(Ask about Food Coma Fridays)

Page 34: How We Fixed Our MongoDB Problems

Questions are guaranteed in life. Answers aren’t.

Eric Lubow

@elubow

[email protected]

#Cassandra13

Thank you.