How We Fixed Our MongoDB Problems

Post on 26-Jan-2015


Eric Lubow

@elubow

elubow@simplereach.com

#MongoDBDays

How We Fixed Our MongoDB Problems

How We Fixed Our MongoDB

Problems

Eric Lubow @elubow #MongoDBDays

Overview

The Secret

SimpleReach

Usage Patterns

Tools

Architecture Implementation

Questions


The 2 Truths


Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data.

The Real Truth


SimpleReach

Millions of URLs per day

Over 1.25 billion page views per month

500m events per day (~6k events/second)

Auto-scale 125-160 machines depending on traffic

Built a predictive measurement algorithm for the social web


And It Goes Like This...

C*

Vertica


Why Mongo?

Fast and easy prototyping

Low barrier to entry

B-Tree indexes and range queries

Aggregation

Everything is JSON

TTLs

Mongoid


Goals

Highly available

Speed

Repeatability

Data accuracy (across storage engines)

Clients should have minimal architecture knowledge

Controlled Data Flow Patterns

Control data set size

Restore capabilities for non-ephemeral data


Availability and Speed

Internal service architecture

Mongos on every server that talks to Mongo

Server distribution across data centers

Latest version isn’t always the greatest version

Understand how usage patterns affect Mongo


Repeatability - Sharded Replica Set

[Diagram: layered server images for the members of one shard's replica set (PRIMARY, SECONDARY)]

Layers: BASE AMI, ORGANIZATIONAL BASE, BASE IMAGE, LAYOUT, APPLICATION GROUP, APPLICATION

SHARD0000A: MONGOS, AMAZON LINUX, MONITORING, USERS, MONGOD, MONGOD-ARBITER

SHARD0000B: MONGOS, AMAZON LINUX, MONITORING, USERS, MONGOD


Availability - Architecture Distribution

US-EAST-1a: MONGO-SHARD-0001-B, MONGO-SHARD-0000-A, CASSANDRA-0001, CASSANDRA-0010, REDIS-0001A, VERTICA-0001, iAPI-0001

US-EAST-1b: MONGO-SHARD-0002-B, MONGO-SHARD-0001-A, CASSANDRA-0002, CASSANDRA-0011, REDIS-0001B, VERTICA-0002, iAPI-0002

US-EAST-1e: MONGO-SHARD-0002-A, MONGO-SHARD-0000-B, CASSANDRA-0003, CASSANDRA-0012, VERTICA-0003, iAPI-0003
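The Mongo placement on this slide puts the A and B members of every shard in different availability zones, so losing one zone never takes out both members of a shard. A minimal sketch of a check for that property follows. The host names and zones are copied from the slide; the helper names `zones_by_shard` and `shards_sharing_a_zone` are hypothetical and not part of any tool mentioned in the talk.

```python
from collections import defaultdict

# Mongo placement as listed on the slide (zone -> hosts).
PLACEMENT = {
    "us-east-1a": ["MONGO-SHARD-0001-B", "MONGO-SHARD-0000-A"],
    "us-east-1b": ["MONGO-SHARD-0002-B", "MONGO-SHARD-0001-A"],
    "us-east-1e": ["MONGO-SHARD-0002-A", "MONGO-SHARD-0000-B"],
}


def zones_by_shard(placement):
    """Map each shard name (e.g. MONGO-SHARD-0001) to the zones holding its members."""
    zones = defaultdict(set)
    for zone, hosts in placement.items():
        for host in hosts:
            shard, _member = host.rsplit("-", 1)  # drop the trailing -A / -B
            zones[shard].add(zone)
    return dict(zones)


def shards_sharing_a_zone(placement, members_per_shard=2):
    """Return the shards whose members do not all sit in distinct zones."""
    return sorted(
        shard
        for shard, zones in zones_by_shard(placement).items()
        if len(zones) < members_per_shard
    )


if __name__ == "__main__":
    # An empty list means every shard survives the loss of any single zone.
    print(shards_sharing_a_zone(PLACEMENT))
```

A check like this could run as part of provisioning, before a new shard member is launched, so that a bad placement is caught before it matters.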


The Schrute of the Problem


Releases

Reasons why I update software:

Because I want the latest version

To get rid of the reminder


Usage Patterns

Mongos uses TCP-based flow control

Separate DBs to deal with DB level locking

Consistent access patterns

Schema design

Proper indexing

Avoid scatter/gather and aim for targeted
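To make the last bullet concrete: mongos can route a query to specific shards only when the query constrains the shard key (or a leading prefix of a compound shard key); otherwise it broadcasts the query to every shard and merges the results. The sketch below is a rough pre-flight check for that rule. The shard key `["account_id", "content_id"]` is an assumption chosen for illustration (the deck does not state the actual shard key), and both function names are hypothetical. It only inspects top-level fields, so it ignores operators such as `$or`, and with a hashed shard key only equality matches are routed this way.

```python
def constrained_prefix(query, shard_key):
    """Return the leading shard key fields that the query constrains."""
    prefix = []
    for field in shard_key:
        if field not in query:
            break
        prefix.append(field)
    return prefix


def is_targeted(query, shard_key):
    """True if the query constrains at least the leading shard key field."""
    return len(constrained_prefix(query, shard_key)) > 0


if __name__ == "__main__":
    SHARD_KEY = ["account_id", "content_id"]  # hypothetical shard key
    print(is_targeted({"account_id": 42, "hour": "2013-06-02T23"}, SHARD_KEY))
    print(is_targeted({"hour": "2013-06-02T23"}, SHARD_KEY))
```

Running such a check against the queries an application issues is one cheap way to find the ones that fan out to every shard.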


Consistent Access Patterns

realtime_score('score', 'realtime')

score.realtime

srt


Schema Design

Randomly pre-populate consistent document structures

Use $setOnInsert to pre-populate

Shard keys

Separate DBs to deal with DB level locking (volume based)

TTL

Hashed shard keys

$inc when possible, $set is expensive


Hourly Stats Documents

{
  "_id": BinData(5, "OWQ5NzQ0ZjgxZGUwYTdmMzM3Y2U0NDkzZGFlMGY0NTc="),
  "account_id": ObjectId("5165905f4240cf9182000069"),
  "hour": ISODate("2013-06-02T23:00:00Z"),
  "content_id": "56250f88530ecc21233be5d2384679b2",
  "totals": {
    "facebook_likes": 0,
    "facebook_shares": 1,
    "facebook_referrals": 0,
    "pageviews": 10134,
    "twitter_tweets": 16,
    "twitter_referrals": 3045,
    "social_actions": 17,
    "social_referrals": 3045
  }
}
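The Schema Design bullets ($inc when possible, pre-populate with $setOnInsert) can be combined in a single upsert against a document shaped like the one above. The following is a minimal sketch under assumptions, not the code used in production: the counter names come from the slide, while `hourly_upsert` and the zero-fill strategy are hypothetical. MongoDB rejects an update that names the same path in two operators, so the sketch zero-fills only the counters that are not being incremented.

```python
COUNTERS = (
    "facebook_likes", "facebook_shares", "facebook_referrals", "pageviews",
    "twitter_tweets", "twitter_referrals", "social_actions", "social_referrals",
)


def hourly_upsert(increments, counters=COUNTERS):
    """Build an update document that increments some counters and, on insert
    only, creates every other counter with a value of 0."""
    if not increments:
        raise ValueError("at least one counter must be incremented")
    unknown = set(increments) - set(counters)
    if unknown:
        raise ValueError(f"unknown counters: {sorted(unknown)}")

    inc = {f"totals.{name}": amount for name, amount in increments.items()}
    # A path may appear under only one operator, so skip incremented counters.
    zero_fill = {f"totals.{name}": 0 for name in counters if name not in increments}

    update = {"$inc": inc}
    if zero_fill:
        update["$setOnInsert"] = zero_fill
    return update


if __name__ == "__main__":
    print(hourly_upsert({"pageviews": 3, "twitter_tweets": 1}))
```

With current PyMongo, the result would be passed as the update argument of `collection.update_one(filter, update, upsert=True)`. Every document then has the same set of fields from the moment it is created, whichever event arrives first.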


Daily Stats Documents

{
  "_id": BinData(5, "OWQ5NzQ0ZjgxZGUwYTdmMzM3Y2U0NDkzZGFlMGY0NTc="),
  "account_id": ObjectId("5165905f4240cf9182000069"),
  "day": ISODate("2013-06-02T00:00:00Z"),
  "content_id": "56250f88530ecc21233be5d2384679b2",
  "totals": {
    "pageviews": 10134,
    "twitter_tweets": 16,
    "social_actions": 17
  },
  "00": {
    "pageviews": 283,
    "twitter_tweets": 10,
    "social_actions": 10
  },
  "01": {
    "pageviews": 9851,
    "twitter_tweets": 6,
    "social_actions": 6
  }
}
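A daily document of this shape can be maintained with one $inc per event that touches both the day totals and the bucket for the event's hour. The sketch below shows one way to derive the `day` value and the two-digit hour key from an event timestamp. It assumes timestamps are handled in UTC, as the `Z` suffix on the slide suggests; `daily_update` is a hypothetical name and not taken from the talk.

```python
from datetime import datetime, timezone


def daily_update(event_time, increments):
    """Return (day, update) for one event against a daily stats document.

    day is midnight UTC of the event's date; update increments each counter
    under "totals" and under the event's two-digit hour key.
    """
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    utc = event_time.astimezone(timezone.utc)
    day = utc.replace(hour=0, minute=0, second=0, microsecond=0)
    hour_key = f"{utc.hour:02d}"  # "00" .. "23", matching the slide

    inc = {}
    for name, amount in increments.items():
        inc[f"totals.{name}"] = amount
        inc[f"{hour_key}.{name}"] = amount
    return day, {"$inc": inc}


if __name__ == "__main__":
    when = datetime(2013, 6, 2, 1, 15, tzinfo=timezone.utc)
    print(daily_update(when, {"pageviews": 1, "twitter_tweets": 1}))
```

The returned `day` belongs in the query filter next to `account_id` and `content_id`, so that all events for one piece of content on one day land in the same document.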


Path of a Packet

[Diagram labels: INTERNET, EC, API, SC, FIREHOSE, Queue, Consumers, Internal API, Solr, C*, Mongo, Redis, Vertica]


NSQ by Bit.ly

Distributed and decentralized topology

At-least-once delivery guaranteed

Multicast style message routing

Runtime discovery for consumers to find producers

Allow for maintenance windows with no downtime

Ephemeral channels for testing
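Two of these properties shape how consumers are written. In NSQ, every channel on a topic receives its own copy of each message (the multicast-style routing), and a message stays in flight until a consumer finishes it, so a failed or timed-out message is delivered again (at-least-once). The toy in-memory model below illustrates both behaviours. It is not the NSQ client API: the `Topic` and `Channel` classes and their methods are hypothetical stand-ins.

```python
from collections import deque
from itertools import count


class Channel:
    """One named stream of a topic; its consumers compete for messages."""

    def __init__(self):
        self.pending = deque()
        self.in_flight = {}

    def receive(self):
        """Hand the next message to a consumer and track it until finished."""
        msg_id, body = self.pending.popleft()
        self.in_flight[msg_id] = body
        return msg_id, body

    def finish(self, msg_id):
        """The consumer succeeded: forget the message."""
        del self.in_flight[msg_id]

    def requeue(self, msg_id):
        """The consumer failed or timed out: deliver the message again later."""
        self.pending.append((msg_id, self.in_flight.pop(msg_id)))


class Topic:
    """Copies every published message to each of its channels."""

    def __init__(self):
        self.channels = {}
        self._ids = count(1)

    def channel(self, name):
        return self.channels.setdefault(name, Channel())

    def publish(self, body):
        msg_id = next(self._ids)
        for channel in self.channels.values():
            channel.pending.append((msg_id, body))
        return msg_id
```

Because a requeued message can reach a consumer twice, a consumer that applies counter increments needs some way to tolerate or detect duplicates.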


Controlled Data Flow

[Diagram labels: Social Event Collector, Social Data, NSQ Multicast, Batch & Write Raw Data, Batch & Write Processed Data, NSQ, Calculate Score, NSQ, Write]
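The "Batch & Write" stages in this flow suggest that consumers buffer messages and write them to the data stores in groups rather than one at a time. The deck does not show how batching is done, so the following is only a minimal size-based sketch: the `Batcher` class, its `max_size` parameter, and the `write` callback are all hypothetical.

```python
class Batcher:
    """Buffer items and hand them to write() in lists of at most max_size."""

    def __init__(self, write, max_size=100):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.write = write
        self.max_size = max_size
        self.buffer = []

    def add(self, item):
        """Queue one item, flushing as soon as the buffer is full."""
        self.buffer.append(item)
        if len(self.buffer) >= self.max_size:
            self.flush()

    def flush(self):
        """Write whatever is buffered; does nothing when the buffer is empty."""
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.write(batch)


if __name__ == "__main__":
    batcher = Batcher(write=print, max_size=3)
    for event in range(7):
        batcher.add(event)
    batcher.flush()  # write the final partial batch
```

A real consumer would also flush on a timer, so that a quiet period does not leave events sitting in the buffer.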


Problems?

Big Architectures for Big Data Eric Lubow @elubow #Cassandra13

Service Architecture

Internal API

Solr

Real-time

C*

C*

Vertica


Anatomy of an Endpoint

[Diagram: QUERYING MACHINES; two endpoints, HOURLY CONTENT and TEN MINUTE CONTENT, each backed by MONGO, VERTICA, and C*; client libraries PYMONGO, PYVERTICA, and HELENUS]


Endpoint Breakout Advantages

Availability

Consistent Access Patterns

Minimal downtime changes

Smaller code deploys

Non-monolithic code base

No async necessary


DevOps

Monitor: Nagios, StatsD, and CloudWatch

Manage: Chef, OpsWorks, csshX, Vagrant

Know failure cases

Turn off balancer on backups

Restart EVERYTHING on upgrade

Extensive use of AWS


Cloud Specifics

blockdev --setra 256

Use ephemeral storage, not EBS volumes

Use MMS

CloudWatch metrics are important and easily scriptable

Don’t use spot instances, but always expect instance loss

Kernel tuning


Summary

Understand your usage patterns

Know the common failure cases

Architecture distribution

Homogeneous Distribution

Monitoring & Automation


We’re Hiring

(Ask about Food Coma Fridays)


Questions are guaranteed in life. Answers aren’t.

Eric Lubow

@elubow

elubow@simplereach.com

#Cassandra13

Thank you.
