2013 london advanced-replication
DESCRIPTION
This session covered wide-area replica sets and using tags for backup.
TRANSCRIPT
![Page 1: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/1.jpg)
Solutions Architect, 10gen
Marc Schwering
#MongoDBDays - @m4rcsch
Advanced Replication
![Page 2: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/2.jpg)
Notes to the presenter
Themes for this presentation:
• Prepare your console properly
![Page 3: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/3.jpg)
Roles & Configuration
![Page 4: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/4.jpg)
Replica Set Roles
[Diagram: Node 1 (Secondary), Node 2 (Arbiter), and Node 3 (Primary), connected by Heartbeat and Replication links]
![Page 5: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/5.jpg)
> conf = {
    _id : "mySet",
    members : [
        {_id : 0, host : "A"},
        {_id : 1, host : "B"},
        {_id : 2, host : "C", arbiterOnly : true}
    ]
}
> rs.initiate(conf)
Configuration Options
![Page 6: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/6.jpg)
Simple Setup Demo
![Page 7: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/7.jpg)
Behind the Curtain
![Page 8: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/8.jpg)
Implementation details
• Heartbeat every 2 seconds
  – Times out in 10 seconds
• Local DB (not replicated)
  – system.replset
  – oplog.rs
    • Capped collection
    • Idempotent version of operation stored
![Page 9: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/9.jpg)
Op(erations) Log
![Page 10: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/10.jpg)
> db.replsettest.insert({_id:1,value:1})
{ "ts" : Timestamp(1350539727000, 1), "h" : NumberLong("6375186941486301201"), "op" : "i", "ns" : "test.replsettest", "o" : { "_id" : 1, "value" : 1 } }
> db.replsettest.update({_id:1},{$inc:{value:10}})
{ "ts" : Timestamp(1350539786000, 1), "h" : NumberLong("5484673652472424968"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "value" : 11 } } }
Op(erations) Log is idempotent
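The update above is recorded as a `$set` of the resulting value rather than the original `$inc`. A minimal sketch of why that matters, using plain JavaScript objects instead of a real oplog; `applyInc` and `applySet` are hypothetical helpers, not MongoDB APIs:

```javascript
// Hypothetical helpers that mimic applying an update to an in-memory document.
function applyInc(doc, field, amount) {
  return { ...doc, [field]: (doc[field] || 0) + amount };
}

function applySet(doc, field, value) {
  return { ...doc, [field]: value };
}

const original = { _id: 1, value: 1 };

// Replaying the client's $inc twice changes the result each time.
const incOnce = applyInc(original, "value", 10);
const incTwice = applyInc(incOnce, "value", 10);

// Replaying the logged $set twice gives the same document as applying it once.
const setOnce = applySet(original, "value", 11);
const setTwice = applySet(setOnce, "value", 11);

console.log(incOnce.value, incTwice.value); // 11 21
console.log(setOnce.value, setTwice.value); // 11 11
```

Because the logged form can be replayed safely, a secondary that re-applies an entry after a restart ends up with the same document.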
![Page 11: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/11.jpg)
oplog and multi-updates
![Page 12: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/12.jpg)
> db.replsettest.update({}, {$set : {name : "foo"}}, false, true)
{ "ts" : Timestamp(1350540395000, 1), "h" : NumberLong("-4727576249368135876"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 2 }, "o" : { "$set" : { "name" : "foo" } } }
{ "ts" : Timestamp(1350540395000, 2), "h" : NumberLong("-7292949613259260138"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 3 }, "o" : { "$set" : { "name" : "foo" } } }
{ "ts" : Timestamp(1350540395000, 3), "h" : NumberLong("-1888768148831990635"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "name" : "foo" } } }
Single operation can have many entries
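The fan-out shown above can be modeled roughly as follows. This is an illustrative sketch only; `expandMultiUpdate` is a hypothetical function, and the entries omit the `ts` and `h` fields that a real oplog entry carries:

```javascript
// Hypothetical model: expand one multi-update into one oplog-style entry
// per matched document, each addressed by that document's _id.
function expandMultiUpdate(ns, matchedDocs, setFields) {
  return matchedDocs.map((doc) => ({
    op: "u",
    ns: ns,
    o2: { _id: doc._id },
    o: { $set: { ...setFields } },
  }));
}

const matchedDocs = [{ _id: 2 }, { _id: 3 }, { _id: 1 }];
const entries = expandMultiUpdate("test.replsettest", matchedDocs, { name: "foo" });

console.log(entries.length); // 3
```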
![Page 13: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/13.jpg)
Operations
![Page 14: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/14.jpg)
Maintenance and Upgrade
• No downtime
• Rolling upgrade/maintenance
  – Start with Secondary
  – Primary last
  – Commands:
    • rs.stepDown(<secs>)
    • db.version()
    • db.serverBuildInfo()
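The ordering rule above (secondaries first, primary last) can be sketched as a small planning function. `rollingUpgradeOrder` and the member objects are hypothetical and only illustrate the sequence; they do not talk to a real replica set:

```javascript
// Hypothetical planner: list hosts in the order a rolling upgrade would visit them.
function rollingUpgradeOrder(members) {
  const secondaries = members.filter((m) => m.state !== "PRIMARY");
  const primaries = members.filter((m) => m.state === "PRIMARY");
  // Secondaries first; the primary is stepped down and upgraded last.
  return [...secondaries, ...primaries].map((m) => m.host);
}

const plan = rollingUpgradeOrder([
  { host: "A", state: "PRIMARY" },
  { host: "B", state: "SECONDARY" },
  { host: "C", state: "SECONDARY" },
]);

console.log(plan); // [ 'B', 'C', 'A' ]
```

Before upgrading the last host, the operator would run rs.stepDown(<secs>) on it so that an already upgraded member takes over as primary.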
![Page 15: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/15.jpg)
Upgrade Demo
![Page 16: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/16.jpg)
Replica Set – 1 Data Center
• Single datacenter
• Single switch & power
• Points of failure:
  – Power
  – Network
  – Data center
  – Two node failure
• Automatic recovery of single node crash
[Diagram labels: Datacenter, Datacenter 2; Member 1, Member 2, Member 3]
![Page 17: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/17.jpg)
Replica Set – 2 Data Centers
• Multi data center
• DR node for safety
• Can't safely require durable writes across data centers, since only one node is in the distant DC
[Diagram: Datacenter 1 holds Member 1 and Member 2; Datacenter 2 holds Member 3]
![Page 18: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/18.jpg)
Replica Set – 2 Data Centers
• Analytics
• Disaster Recovery
• Batch Jobs
• Options
  – low or zero priority
  – hidden
  – slaveDelay
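A member entry combining these options might look like the sketch below. The host name is a placeholder, and the field names are those used in the releases current at the time of the talk (later releases renamed `slaveDelay`):

```javascript
// Sketch of a member entry for a dedicated DR / analytics node.
// "dr-node.example.net:27017" is a placeholder host, not from the talk.
const delayedMember = {
  _id: 2,
  host: "dr-node.example.net:27017",
  priority: 0,      // never eligible to become primary
  hidden: true,     // not advertised to client drivers
  slaveDelay: 3600, // apply operations one hour (in seconds) behind the primary
};

console.log(JSON.stringify(delayedMember));
```

Hidden and delayed members need priority 0, which is why the three options usually appear together on such a node.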
![Page 19: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/19.jpg)
Replica Set – 3 Data Centers
• Three data centers
• Can survive full data center loss
• Can do w= { dc : 2 } to guarantee write in 2 data centers (with tags)
[Diagram: Datacenter 1 holds Member 1 and Member 2; Datacenter 2 holds Member 3 and Member 4; Datacenter 3 holds Member 5]
![Page 20: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/20.jpg)
Replica Set – 3+ Data Centers
[Diagram: one Primary, five Secondaries, and one delayed Secondary]
![Page 21: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/21.jpg)
Commands
• Managing
  – rs.conf()
  – rs.initiate(<conf>) & rs.reconfig(<conf>)
  – rs.add("<host>:<port>") & rs.addArb("<host>:<port>")
  – rs.status()
  – rs.stepDown(<secs>)
• Minority reconfig
  – rs.reconfig(cfg, { force : true })
![Page 22: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/22.jpg)
Options
• Priorities
• Hidden
• Slave Delay
• Disable indexes (on secondaries)
• Default write concerns
![Page 23: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/23.jpg)
Developing with Replica Sets
![Page 24: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/24.jpg)
Strong Consistency
[Diagram: the client application's driver sends both writes and reads to the Primary; two Secondaries replicate from it]
![Page 25: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/25.jpg)
Delayed Consistency
[Diagram: the client application's driver sends writes to the Primary and reads to the two Secondaries]
![Page 26: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/26.jpg)
Write Concern
• Network acknowledgement
• Wait for error
• Wait for journal sync
• Wait for replication
  – number
  – majority
  – tags
![Page 27: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/27.jpg)
Write Concern Demo
![Page 28: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/28.jpg)
Datacenter awareness (Tagging)
• Control where data is written to, and read from
• Each member can have one or more tags
  – tags: {dc: "ny"}
  – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
• Replica set defines rules for write concerns
• Rules can change without changing app code
![Page 29: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/29.jpg)
{
    _id : "mySet",
    members : [
        {_id : 0, host : "A", tags : {"dc": "ny"}},
        {_id : 1, host : "B", tags : {"dc": "ny"}},
        {_id : 2, host : "C", tags : {"dc": "sf"}},
        {_id : 3, host : "D", tags : {"dc": "sf"}},
        {_id : 4, host : "E", tags : {"dc": "cloud"}}
    ],
    settings : {
        getLastErrorModes : {
            allDCs : {"dc" : 3},
            someDCs : {"dc" : 2}
        }
    }
}
> db.blogs.insert({...})
> db.runCommand({getLastError : 1, w : "someDCs"})
> db.getLastErrorObj("someDCs")
Tagging Example
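In a mode such as `someDCs : {"dc" : 2}`, the number counts distinct values of the tag among the acknowledging members, not the number of members. A rough sketch of that check in plain JavaScript; `modeSatisfied` is a hypothetical helper and not part of the shell:

```javascript
// Hypothetical check: does a set of acknowledging members satisfy a tag-based mode?
// A mode like { dc: 2 } needs acknowledgements covering at least 2 distinct "dc" values.
function modeSatisfied(mode, ackedMembers) {
  return Object.entries(mode).every(([tag, needed]) => {
    const values = new Set(
      ackedMembers.map((m) => m.tags[tag]).filter((v) => v !== undefined)
    );
    return values.size >= needed;
  });
}

const members = {
  A: { tags: { dc: "ny" } },
  B: { tags: { dc: "ny" } },
  C: { tags: { dc: "sf" } },
  E: { tags: { dc: "cloud" } },
};
const someDCs = { dc: 2 };
const allDCs = { dc: 3 };

console.log(modeSatisfied(someDCs, [members.A, members.B])); // false: both are in "ny"
console.log(modeSatisfied(someDCs, [members.A, members.C])); // true: "ny" and "sf"
console.log(modeSatisfied(allDCs, [members.A, members.C, members.E])); // true
```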
![Page 30: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/30.jpg)
Wait for Replication
[Diagram: the Driver sends a write to the Primary (SF), which applies it in memory; the write replicates to Secondary (NY) and Secondary (Cloud); getLastError with w: allDCs waits for that replication]
![Page 31: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/31.jpg)
settings : {
    getLastErrorModes : {
        allDCs : {"dc" : 3},
        someDCs : {"dc" : 2}
    }
}
> db.getLastErrorObj("allDCs", 100)
> db.getLastErrorObj("someDCs", 500)
> db.getLastErrorObj(1, 500)
Write Concern with timeout
![Page 32: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/32.jpg)
Read Preference Modes
• 5 modes
  – primary (only) - Default
  – primaryPreferred
  – secondary
  – secondaryPreferred
  – nearest
When more than one node is eligible, the closest node is used for reads (all modes except primary)
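The five modes can be modeled roughly as below. This is a simplified sketch: `pickMember`, `pingMs`, and the member objects are hypothetical, and real drivers select among members within a latency window rather than always taking the single lowest ping:

```javascript
// Simplified model of the five read preference modes.
function pickMember(mode, members) {
  const closest = (list) =>
    [...list].sort((a, b) => a.pingMs - b.pingMs)[0] || null;
  const primary = members.find((m) => m.state === "PRIMARY") || null;
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const readable = primary ? [primary, ...secondaries] : secondaries;
  switch (mode) {
    case "primary": return primary;
    case "primaryPreferred": return primary || closest(secondaries);
    case "secondary": return closest(secondaries);
    case "secondaryPreferred": return closest(secondaries) || primary;
    case "nearest": return closest(readable);
    default: throw new Error("unknown mode: " + mode);
  }
}

const replicaSet = [
  { host: "A", state: "PRIMARY", pingMs: 40 },
  { host: "B", state: "SECONDARY", pingMs: 5 },
  { host: "C", state: "SECONDARY", pingMs: 20 },
];

console.log(pickMember("primary", replicaSet).host);   // A
console.log(pickMember("secondary", replicaSet).host); // B
console.log(pickMember("nearest", replicaSet).host);   // B
```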
![Page 33: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/33.jpg)
Tagged Read Preference
• Custom read preferences
• Control where you read from by (node) tags
  – E.g. { "disk": "ssd", "use": "reporting" }
• Use in conjunction with standard read preferences
  – Except primary
![Page 34: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/34.jpg)
{"dc.va": "rack1", disk: "ssd", ssd: "installed"}
{"dc.va": "rack2", disk: "raid"}
{"dc.gto": "rack1", disk: "ssd", ssd: "installed"}
{"dc.gto": "rack2", disk: "raid"}
> conf.settings = { getLastErrorModes : { MultipleDC : { "dc.va": 1, "dc.gto": 1 } } }
> conf.settings = { "getLastErrorModes" : { "ssd" : { "ssd" : 1 },...
Tags
![Page 35: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/35.jpg)
{ disk: "ssd" }
JAVA:
ReadPreference tagged_pref =
    ReadPreference.secondaryPreferred(
        new BasicDBObject("disk", "ssd")
    );
DBObject result =
    coll.findOne(query, null, tagged_pref);
Tagged Read Preference
![Page 36: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/36.jpg)
Tagged Read Preference
• Grouping / Failover
  {dc : "LON", loc : "EU"}
  {dc : "FRA", loc : "EU"}
  {dc : "NY", loc : "US"}
DBObject t1 = new BasicDBObject("dc", "LON");
DBObject t2 = new BasicDBObject("loc", "EU");
ReadPreference pref =
ReadPreference.primaryPreferred(t1, t2);
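Tag sets passed this way are tried in order, and the first one that matches any member is used. A minimal sketch of that fallback in plain JavaScript, assuming the primary and the LON member are unavailable; `matchTagSets` and the host names are hypothetical:

```javascript
// Hypothetical matcher: try each tag set in order and return the members
// matching the first tag set that matches anything.
function matchTagSets(tagSets, members) {
  for (const tagSet of tagSets) {
    const matches = members.filter((m) =>
      Object.entries(tagSet).every(([key, value]) => m.tags[key] === value)
    );
    if (matches.length > 0) return matches;
  }
  return [];
}

const availableSecondaries = [
  { host: "fra1", tags: { dc: "FRA", loc: "EU" } },
  { host: "ny1", tags: { dc: "NY", loc: "US" } },
];

// No LON member is available, so the second tag set (loc: "EU") is used.
const eligible = matchTagSets([{ dc: "LON" }, { loc: "EU" }], availableSecondaries);
console.log(eligible.map((m) => m.host)); // [ 'fra1' ]
```

Listing the narrow tag first and the broader one second keeps reads in London when possible and within Europe otherwise.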
![Page 37: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/37.jpg)
Tagging Demo
![Page 38: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/38.jpg)
Conclusion
![Page 39: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/39.jpg)
Best practices and tips
• Odd number of set members
• Read from the primary except for
  – Geographic distribution
  – Analytics (separate workload)
• Use logical names not IP Addresses in configs
• Set WriteConcern appropriately for what you are doing
• Monitor secondaries for lag (Alerts in MMS)
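The advice on odd member counts follows from majority arithmetic: electing a primary needs a strict majority of voting members. A short sketch, with `majority` and `faultTolerance` as illustrative helper names:

```javascript
// Votes needed to elect a primary in a set with n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Members that can be lost while a majority remains.
function faultTolerance(n) {
  return n - majority(n);
}

for (const n of [3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
// 3 2 1
// 4 3 1
// 5 3 2
```

Going from three to four members raises the required majority without tolerating any additional failure, so the extra member adds cost but no availability.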
![Page 40: 2013 london advanced-replication](https://reader034.vdocument.in/reader034/viewer/2022051818/54b716a14a7959d5738b45d7/html5/thumbnails/40.jpg)
Thank You