Replication and Replica Sets

Post on 25-May-2015

DESCRIPTION

MongoDB supports replication for failover and redundancy. In this session we will introduce the basic concepts around replica sets, which provide automated failover and recovery of nodes. We'll show you how to set up, configure, and initiate a replica set, and methods for using replication to scale reads. We'll also discuss proper architecture for durability.

TRANSCRIPT

Page 1: Replication and Replica Sets

Senior Solutions Architect, 10gen

James Kerr

#MongoDBDays

Replication and Replica Sets

Page 2: Replication and Replica Sets

Agenda

• Replica Set Lifecycle

• Developing with Replica Sets

• Operational Considerations

• Behind the Curtain

Page 3: Replication and Replica Sets

Why Replication?

• How many have faced node failures?

• How many have been woken up to perform a failover?

• How many have experienced issues due to network latency?

• Different uses for data
  – Normal processing
  – Simple analytics

Page 4: Replication and Replica Sets

Replica Set Lifecycle

Page 5: Replication and Replica Sets

Replica Set – Creation

Page 6: Replication and Replica Sets

Replica Set – Initialize

Page 7: Replication and Replica Sets

Replica Set – Failure

Page 8: Replication and Replica Sets

Replica Set – Failover

Page 9: Replication and Replica Sets

Replica Set – Recovery

Page 10: Replication and Replica Sets

Replica Set – Recovered

Page 11: Replication and Replica Sets

Replica Set Roles & Configuration

Page 12: Replication and Replica Sets

Replica Set Roles

Page 13: Replication and Replica Sets

Configuration Options

> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},
      {_id : 1, host : "B", priority : 2},
      {_id : 2, host : "C"},
      // hidden and delayed members must have priority 0
      {_id : 3, host : "D", hidden : true, priority : 0},
      {_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600}
    ]
  }

> rs.initiate(conf)

Page 14: Replication and Replica Sets

Configuration Options

Primary DC: members A and B, which carry the highest priorities (3 and 2)

Page 15: Replication and Replica Sets

Configuration Options

Secondary DC: member C, which sets no priority and so takes the default priority of 1

Page 16: Replication and Replica Sets

Configuration Options

Analytics node: member D (hidden : true), which is invisible to client applications

Page 17: Replication and Replica Sets

Configuration Options

Backup node: member E (hidden : true, slaveDelay : 3600), which applies operations one hour behind the primary
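The roles called out on the configuration slides can be summarized with a small sketch in plain JavaScript (it runs outside the mongo shell). `describeMember` is a hypothetical helper, not a MongoDB API. The member documents here give the hidden and delayed members `priority : 0`, since MongoDB is understood to reject a nonzero priority on such members.

```javascript
// Member list modeled on the slides. priority: 0 on D and E is an addition
// made for this sketch (hidden/delayed members cannot become primary).
const conf = {
  _id: "mySet",
  members: [
    { _id: 0, host: "A", priority: 3 },
    { _id: 1, host: "B", priority: 2 },
    { _id: 2, host: "C" },
    { _id: 3, host: "D", hidden: true, priority: 0 },
    { _id: 4, host: "E", hidden: true, priority: 0, slaveDelay: 3600 },
  ],
};

// Hypothetical helper (not a MongoDB API): lists the roles implied by one
// member document.
function describeMember(member) {
  // An omitted priority defaults to 1.
  const priority = member.priority === undefined ? 1 : member.priority;
  const roles = [];
  roles.push(priority > 0 ? "electable, priority " + priority : "never primary");
  if (member.hidden) roles.push("hidden from clients");
  if (member.slaveDelay) roles.push("delayed " + member.slaveDelay + "s");
  return member.host + ": " + roles.join("; ");
}

const summary = conf.members.map(describeMember);
console.log(summary.join("\n"));
```

Higher priority only expresses a preference for which electable member becomes primary; it does not change how data is replicated.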

Page 18: Replication and Replica Sets

Developing with Replica Sets

Page 19: Replication and Replica Sets

Strong Consistency

Page 20: Replication and Replica Sets

Delayed Consistency

Page 21: Replication and Replica Sets

Write Concern

• Network acknowledgement

• Wait for error

• Wait for journal sync

• Wait for replication
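As a rough sketch, the four levels map onto `getLastError` command documents like the ones below. The key names (`unacknowledged`, `acknowledged`, and so on) are labels chosen for this example, and the `w : 2` and `wtimeout : 5000` values are illustrative assumptions rather than recommendations.

```javascript
// getLastError command documents for each write concern level.
// The w and wtimeout values are illustrative, not recommendations.
const writeConcerns = {
  // Network acknowledgement only: the write is sent and getLastError is
  // never issued, so there is no command document.
  unacknowledged: null,
  // Wait for error: the server confirms it processed the write.
  acknowledged: { getLastError: 1 },
  // Wait for journal sync: the write is in the on-disk journal.
  journaled: { getLastError: 1, j: true },
  // Wait for replication: 2 members (primary included) have the write.
  replicated: { getLastError: 1, w: 2, wtimeout: 5000 },
};

// In the mongo shell, each non-null document would be passed to
// db.runCommand(...) right after the write, on the same connection.
for (const [level, cmd] of Object.entries(writeConcerns)) {
  console.log(level + ": " + (cmd ? JSON.stringify(cmd) : "no getLastError call"));
}
```

Each step down the list trades latency for a stronger guarantee about where the write has landed.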

Page 22: Replication and Replica Sets

Unacknowledged

Page 23: Replication and Replica Sets

MongoDB Acknowledged (wait for error)

Page 24: Replication and Replica Sets

Wait for Journal Sync

Page 25: Replication and Replica Sets

Wait for Replication

Page 26: Replication and Replica Sets

Tagging

• New in 2.0.0

• Control where data is written to, and read from

• Each member can have one or more tags
  – tags: {dc: "ny"}
  – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}

• Replica set defines rules for write concerns

• Rules can change without changing app code

Page 27: Replication and Replica Sets

Tagging Example

{
  _id : "mySet",
  members : [
    {_id : 0, host : "A", tags : {"dc": "ny"}},
    {_id : 1, host : "B", tags : {"dc": "ny"}},
    {_id : 2, host : "C", tags : {"dc": "sf"}},
    {_id : 3, host : "D", tags : {"dc": "sf"}},
    {_id : 4, host : "E", tags : {"dc": "cloud"}}
  ],
  settings : {
    getLastErrorModes : {
      allDCs : {"dc" : 3},
      someDCs : {"dc" : 2}
    }
  }
}

> db.blogs.insert({...})

> db.runCommand({getLastError : 1, w : "someDCs"})
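To see what a rule such as `someDCs : {"dc" : 2}` asks for, here is a minimal sketch in plain JavaScript. `modeSatisfied` is a hypothetical helper, not a MongoDB API. It models the rule as "the members that acknowledged the write must cover at least 2 distinct values of the `dc` tag".

```javascript
// Members and modes copied from the tagging example.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const getLastErrorModes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// Hypothetical helper: true once the acknowledging hosts cover the required
// number of distinct values for every tag named in the mode.
function modeSatisfied(mode, ackedHosts) {
  return Object.entries(mode).every(([tag, required]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && m.tags[tag] !== undefined)
        .map((m) => m.tags[tag])
    );
    return values.size >= required;
  });
}

// A and B are both in "ny": one distinct dc value, so someDCs is not met.
console.log(modeSatisfied(getLastErrorModes.someDCs, ["A", "B"]));
// A ("ny") and C ("sf") cover two data centers, so someDCs is met.
console.log(modeSatisfied(getLastErrorModes.someDCs, ["A", "C"]));
```

Because the application only names the mode (`w : "someDCs"`), the rule behind it can be changed in the replica set configuration without touching application code.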

Page 28: Replication and Replica Sets

Wait for Replication (Tagging)

Page 29: Replication and Replica Sets

Read Preference Modes

• 5 modes (new in 2.2)
  – primary (only): the default
  – primaryPreferred
  – secondary
  – secondaryPreferred
  – nearest

When more than one node is eligible, the closest node is used for reads (all modes except primary)
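The five modes can be sketched as a selection function over a snapshot of the set. This is a simplified model, not driver code: `selectMember`, the `state` strings, and the `pingMs` field are assumptions made for the example, and "closest" is reduced to "lowest measured ping".

```javascript
// Hypothetical selection logic for the five read preference modes.
// Returns the chosen member, or null when no member qualifies.
function selectMember(mode, members) {
  const primary = members.filter((m) => m.state === "PRIMARY");
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  let candidates;
  switch (mode) {
    case "primary":
      candidates = primary;
      break;
    case "primaryPreferred":
      candidates = primary.length ? primary : secondaries;
      break;
    case "secondary":
      candidates = secondaries;
      break;
    case "secondaryPreferred":
      candidates = secondaries.length ? secondaries : primary;
      break;
    case "nearest":
      candidates = primary.concat(secondaries);
      break;
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
  if (candidates.length === 0) return null;
  // Simplification: "closest" means the lowest ping among the candidates.
  return candidates.reduce((best, m) => (m.pingMs < best.pingMs ? m : best));
}

const snapshot = [
  { host: "A", state: "PRIMARY", pingMs: 20 },
  { host: "B", state: "SECONDARY", pingMs: 5 },
  { host: "C", state: "SECONDARY", pingMs: 12 },
];
console.log(selectMember("primary", snapshot).host);
console.log(selectMember("nearest", snapshot).host);
```

With the primary unavailable, `primary` returns nothing while `primaryPreferred` falls back to a secondary, which is the practical difference between the two.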

Page 30: Replication and Replica Sets

Tagged Read Preference

• Custom read preferences

• Control where you read from by (node) tags
  – E.g. { "disk": "ssd", "use": "reporting" }

• Use in conjunction with standard read preferences
  – Except primary
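A tag set narrows the candidates before the read preference mode picks among them. The sketch below models that filter in plain JavaScript. `matchesTagSet` and `eligibleMembers` are hypothetical helpers, and the member tags are made up for the example. It assumes the common driver behavior of trying a list of tag sets in order and using the first one that matches any member.

```javascript
// A member matches a tag set when it carries every key/value pair in it.
function matchesTagSet(member, tagSet) {
  const tags = member.tags || {};
  return Object.entries(tagSet).every(([key, value]) => tags[key] === value);
}

// Tag sets are tried in order; the first one that matches any member wins.
function eligibleMembers(members, tagSets) {
  for (const tagSet of tagSets) {
    const matched = members.filter((m) => matchesTagSet(m, tagSet));
    if (matched.length > 0) return matched;
  }
  return [];
}

// Hypothetical secondaries with made-up tags.
const secondaries = [
  { host: "B", tags: { disk: "ssd", use: "reporting" } },
  { host: "C", tags: { disk: "hdd", use: "reporting" } },
  { host: "D", tags: { disk: "ssd", use: "production" } },
];

const reporting = eligibleMembers(secondaries, [{ disk: "ssd", use: "reporting" }]);
console.log(reporting.map((m) => m.host).join(","));
```

An empty tag set `{}` matches every member, so it can serve as a last-resort entry at the end of the list.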

Page 31: Replication and Replica Sets

Operational Considerations

Page 32: Replication and Replica Sets

Maintenance and Upgrade

• No downtime

• Rolling upgrade/maintenance
  – Start with Secondary
  – Primary last
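The ordering rule is simple enough to sketch: take the secondaries first and leave the current primary for last. `upgradeOrder` and the `state` strings are assumptions for this example, not part of any MongoDB tooling.

```javascript
// Hypothetical helper: returns hosts in the order they should be upgraded,
// secondaries first and the current primary last.
function upgradeOrder(members) {
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const primaries = members.filter((m) => m.state === "PRIMARY");
  return secondaries.concat(primaries).map((m) => m.host);
}

const order = upgradeOrder([
  { host: "A", state: "PRIMARY" },
  { host: "B", state: "SECONDARY" },
  { host: "C", state: "SECONDARY" },
]);
console.log(order.join(" -> "));
```

Before the primary itself is taken down, running `rs.stepDown()` against it in the mongo shell lets an already-upgraded secondary take over, which is what keeps the procedure free of downtime.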

Page 33: Replication and Replica Sets

Replica Set – 1 Data Center

• Single datacenter

• Single switch & power

• Points of failure:
  – Power
  – Network
  – Data center
  – Two node failure

• Automatic recovery of single node crash

Page 34: Replication and Replica Sets

Replica Set – 2 Data Centers

• Multi data center

• DR node for safety

• Can't safely do a durable multi-data-center write, since there is only 1 node in the distant DC

Page 35: Replication and Replica Sets

Replica Set – 3 Data Centers

• Three data centers

• Can survive full data center loss

• Can use a tagged write concern (a getLastErrorModes rule of { dc : 2 }) to guarantee a write reaches 2 data centers
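The difference between the three layouts comes down to whether a majority of members survives the loss of one data center. The sketch below checks that. The member counts per data center are assumptions (the slide diagrams are not part of the transcript), every member is assumed to hold one vote, and `survivesLossOf` is a hypothetical helper.

```javascript
// Assumed layouts: one entry per member, naming the data center it lives in.
const layouts = {
  oneDc: ["dc1", "dc1", "dc1"],
  twoDcs: ["dc1", "dc1", "dc2"],
  threeDcs: ["dc1", "dc1", "dc2", "dc2", "dc3"],
};

// A primary can be elected only while a strict majority of the voting
// members is still reachable.
function survivesLossOf(layout, failedDc) {
  const survivors = layout.filter((dc) => dc !== failedDc).length;
  return survivors > layout.length / 2;
}

for (const [name, layout] of Object.entries(layouts)) {
  const dcs = Array.from(new Set(layout));
  const results = dcs.map((dc) => dc + ":" + survivesLossOf(layout, dc));
  console.log(name + " -> " + results.join(", "));
}
```

In the assumed two-data-center layout, losing the larger site leaves a single member that cannot elect itself, while the three-data-center layout keeps a majority whichever site fails.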

Page 36: Replication and Replica Sets

Behind the Curtain

Page 37: Replication and Replica Sets

Implementation details

• Heartbeat every 2 seconds
  – Times out in 10 seconds

• Local DB (not replicated)
  – system.replset
  – oplog.rs
    • Capped collection
    • Idempotent version of operation stored
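The heartbeat numbers translate into a simple reachability check. This is a simplified model rather than the server's actual logic: `isReachable` is a hypothetical helper that treats a member as unreachable once 10 seconds pass without a heartbeat reply.

```javascript
// Values from the slide.
const HEARTBEAT_INTERVAL_MS = 2000;
const HEARTBEAT_TIMEOUT_MS = 10000;

// Hypothetical helper: a member counts as reachable while its last
// heartbeat reply is newer than the timeout.
function isReachable(lastReplyMs, nowMs) {
  return nowMs - lastReplyMs < HEARTBEAT_TIMEOUT_MS;
}

// Number of heartbeat intervals that fit in the timeout window.
const missedBeforeTimeout = HEARTBEAT_TIMEOUT_MS / HEARTBEAT_INTERVAL_MS;
console.log(missedBeforeTimeout);
console.log(isReachable(0, 9999), isReachable(0, 10000));
```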

Page 38: Replication and Replica Sets

Op(erations) Log is idempotent

> db.replsettest.insert({_id : 1, value : 1})

{ "ts" : Timestamp(1350539727000, 1), "h" : NumberLong("6375186941486301201"), "op" : "i", "ns" : "test.replsettest", "o" : { "_id" : 1, "value" : 1 } }

> db.replsettest.update({_id : 1}, {$inc : {value : 10}})

{ "ts" : Timestamp(1350539786000, 1), "h" : NumberLong("5484673652472424968"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "value" : 11 } } }

The $inc of 10 is logged as a $set to the resulting value, 11, so applying the entry more than once leaves the same document.
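Why the oplog stores `$set` rather than the original `$inc` can be shown with a small in-memory sketch. `applyUpdate` and `applyTwice` are hypothetical helpers that handle only `$set` and `$inc`; they stand in for a secondary applying the same entry twice.

```javascript
// Applies a $set and/or $inc update to a plain object, in place.
function applyUpdate(doc, update) {
  if (update.$set) Object.assign(doc, update.$set);
  if (update.$inc) {
    for (const [field, amount] of Object.entries(update.$inc)) {
      doc[field] = (doc[field] || 0) + amount;
    }
  }
  return doc;
}

// Starts from the inserted document and applies the same update twice,
// as would happen if an oplog entry were replayed.
function applyTwice(update) {
  const doc = { _id: 1, value: 1 };
  applyUpdate(doc, update);
  applyUpdate(doc, update);
  return doc.value;
}

const asLogged = applyTwice({ $set: { value: 11 } }); // form stored in the oplog
const asIssued = applyTwice({ $inc: { value: 10 } }); // form the client sent
console.log(asLogged, asIssued);
```

Replaying the logged form leaves `value` at 11, while replaying the client's `$inc` would drift to 21.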

Page 39: Replication and Replica Sets

Single operation can have many entries

> db.replsettest.update({}, {$set : {name : "foo"}}, false, true)

{ "ts" : Timestamp(1350540395000, 1), "h" : NumberLong("-4727576249368135876"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 2 }, "o" : { "$set" : { "name" : "foo" } } }

{ "ts" : Timestamp(1350540395000, 2), "h" : NumberLong("-7292949613259260138"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 3 }, "o" : { "$set" : { "name" : "foo" } } }

{ "ts" : Timestamp(1350540395000, 3), "h" : NumberLong("-1888768148831990635"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "name" : "foo" } } }
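The fan-out from one multi-document update to several oplog entries can be sketched as a mapping over the matched documents. `expandMultiUpdate` is a hypothetical helper, and the `ts` and `h` fields of real entries are left out.

```javascript
// Hypothetical helper: produces one oplog-style update entry per matched
// document, each addressed by that document's _id.
function expandMultiUpdate(ns, matchedDocs, setFields) {
  return matchedDocs.map((doc) => ({
    op: "u",
    ns: ns,
    o2: { _id: doc._id },
    o: { $set: Object.assign({}, setFields) },
  }));
}

// The three documents matched by the empty query, in the order logged.
const entries = expandMultiUpdate(
  "test.replsettest",
  [{ _id: 2 }, { _id: 3 }, { _id: 1 }],
  { name: "foo" }
);
entries.forEach((e) => console.log(JSON.stringify(e)));
```

Addressing every entry by `_id` means a secondary never has to re-evaluate the original query, which might match a different set of documents by the time it is replayed.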

Page 40: Replication and Replica Sets

What’s New in 2.2

• Read preference support with sharding
  – Drivers too

• Improved replication over WAN/high-latency networks

• rs.syncFrom command

• buildIndexes setting

• replIndexPrefetch setting

Page 41: Replication and Replica Sets

Just Use It

• Use replica sets

• Easy to set up
  – Try on a single machine

• Check doc page for RS tutorials
  – http://docs.mongodb.org/manual/replication/#tutorials

Page 42: Replication and Replica Sets

Thank You