Scaling with mongoDB

Upload: dinhque

Post on 07-Jun-2018


Scaling with mongoDB

Ross Lawley
Python Engineer @ 10gen
Web developer since 1999

Passionate about open source
Agile methodology

email: [email protected]: RossC0

Today's Talk
• Scaling
• Understanding mongoDB's architecture
• Schema design and usage
• Replication
• Sharding

Scaling
• Operations/sec go up
• Storage needs go up
  • Capacity
  • IOPs
• Complexity goes up
  • Caching

• Optimization & Tuning
  • Schema & Index Design
  • O/S tuning
  • Hardware configuration
• Vertical scaling
  • Hardware is expensive
  • Hard to scale in cloud

How do you scale now?
[Figure: cost ($$$) plotted against throughput]

mongoDB Scaling - Single Node
[Diagram: read and write traffic to a single node, node_a1]

Read scaling - add Replicas
[Diagram: read and write traffic to node_a1 and node_b1, then to node_a1, node_b1 and node_c1]

Write scaling - Sharding
[Diagram: read and write traffic to shard1, made up of node_a1, node_b1 and node_c1]

Write scaling - add shards
[Diagram: read and write traffic to shard1 and shard2, then to shard1, shard2 and shard3; shard2 is made up of node_a2, node_b2 and node_c2, shard3 of node_a3, node_b3 and node_c3]

Understanding mongoDB's architecture

http://www.flickr.com/photos/tragiclyflawed/867687742

mongod architecture
• mongod memory maps the numbered & .ns files

• Memory mapping makes in-place updates effective

• File page residency decisions left to operating system

mongod Data Files
• Fixed-size extents in data files store records, indexes

• Records contain documents

• Unused records get placed in free lists

• Indexes are B-Trees

[Diagram: record layout. A record is made up of a Header (Size, Offset, Next, Prev), BSON Data and Padding; a Deleted Record holds (Size, Offset, Next)]

[Diagram: Collection 1 and Index 1 are mapped into Virtual Address Space 1. This is your virtual memory size (mapped). The part held in Physical RAM is your resident memory size; the rest is read from Disk. A second mapping, Virtual Address Space 2, is also shown. Access times: 100 ns for RAM, 10,000,000 ns for disk.]
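The two latencies on the slide show why the working set should stay in RAM. A rough sketch of the arithmetic, assuming 100 ns is a RAM access and 10,000,000 ns a disk access; the function name and hit ratios are illustrative, not from the talk:

```javascript
// Expected cost of one page access, given the fraction of accesses
// that are served from RAM. Latencies are the figures from the slide.
const RAM_NS = 100;
const DISK_NS = 10000000;

function averageAccessNs(ramHitRatio) {
  return ramHitRatio * RAM_NS + (1 - ramHitRatio) * DISK_NS;
}

const allInRam = averageAccessNs(1);          // every access hits RAM
const onePercentMiss = averageAccessNs(0.99); // 1 in 100 accesses goes to disk

console.log(allInRam, Math.round(onePercentMiss));
```

Even a 1% miss rate puts the average at roughly 100,000 ns, about a thousand times slower than the all-in-RAM case.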

mongod Concurrency
• Readers block writers
• A writer blocks everything
• Everybody yields periodically
• When a new writer queues up, new readers block

• In v2.0 and earlier, this concurrency model is global to the mongod

• In v2.2, this model will be scoped to the database
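The blocking rules above can be pictured with a toy lock. This is only a sketch of the rules as stated on the slide, not mongod's implementation; the class and method names are hypothetical, and periodic yielding is not modelled:

```javascript
// Toy reader/writer lock that follows the rules listed on the slide.
class ToyLock {
  constructor() {
    this.readers = 0;        // readers currently holding the lock
    this.writing = false;    // true while a writer holds the lock
    this.waitingWriters = 0; // writers queued behind current holders
  }
  // New readers block if a writer is active or already queued.
  tryRead() {
    if (this.writing || this.waitingWriters > 0) return false;
    this.readers += 1;
    return true;
  }
  releaseRead() {
    this.readers -= 1;
  }
  queueWrite() {
    this.waitingWriters += 1;
  }
  // A queued writer starts only when nobody else holds the lock.
  tryStartQueuedWrite() {
    if (this.waitingWriters === 0 || this.writing || this.readers > 0) return false;
    this.waitingWriters -= 1;
    this.writing = true;
    return true;
  }
  releaseWrite() {
    this.writing = false;
  }
}

const lock = new ToyLock();
const firstReader = lock.tryRead();                    // admitted
lock.queueWrite();
const writerWhileReading = lock.tryStartQueuedWrite(); // readers block writers
const lateReader = lock.tryRead();                     // blocked behind queued writer
lock.releaseRead();
const writerAfterReaders = lock.tryStartQueuedWrite(); // writer now runs
const readerDuringWrite = lock.tryRead();              // a writer blocks everything
lock.releaseWrite();
const readerAfterWrite = lock.tryRead();               // admitted again
```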

Architecture Summary
• Uses memory mapped files - RAM
• Faster disks - more IOPS (SSDs are good!)
• CPU usage low

Schema
• Data model affects performance
  • Embedding versus Linking
  • Roundtrips to database
  • Disk seek time
  • Size of data to read & write
  • Partial versus full document writes
  • Partial versus full document reads

Indexes
• Index common queries
• Do not over index
• (A) and (A,B) are equivalent for queries on A, choose one
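One way to spot the duplicate index mentioned above is a prefix check. A minimal sketch, assuming indexes are described as ordered lists of field names; the helper name is hypothetical and sort direction is ignored:

```javascript
// True when every field of `shorter` appears, in the same order,
// at the start of `longer`. Such an index adds nothing for queries
// that the longer compound index can already serve.
function isPrefixIndex(shorter, longer) {
  if (shorter.length > longer.length) return false;
  return shorter.every((field, i) => field === longer[i]);
}

const redundant = isPrefixIndex(["a"], ["a", "b"]); // (A) is a prefix of (A,B)
const stillUseful = isPrefixIndex(["b"], ["a", "b"]); // (B) is not a prefix of (A,B)
```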

Query for {a: 7}
[Diagram: With Index: a B-tree on a with ranges [-∞, 5), [5, 10), [10, ∞); the range [5, 10) is subdivided into [5, 7), [7, 9), [9, 10), and the leaves point to buckets of documents. Without index - Scan: every document {...} is examined.]
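The difference between the two paths can be counted. In this sketch a binary search over sorted keys stands in for the B-tree descent, which is a simplification; the function names and the 1024-document data set are assumptions for illustration:

```javascript
// Without an index: every document is examined.
function collectionScan(docs, value) {
  let examined = 0;
  let matched = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc.a === value) matched += 1;
  }
  return { examined, matched };
}

// With an index: binary search over sorted keys (a stand-in for a B-tree).
function indexLookup(sortedKeys, value) {
  let lo = 0;
  let hi = sortedKeys.length;
  let steps = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    steps += 1;
    if (sortedKeys[mid] < value) lo = mid + 1;
    else hi = mid;
  }
  const found = lo < sortedKeys.length && sortedKeys[lo] === value;
  return { steps, found };
}

const docs = Array.from({ length: 1024 }, (_, i) => ({ a: i }));
const sortedKeys = docs.map((doc) => doc.a);
const scanResult = collectionScan(docs, 7);     // examines all 1024 documents
const indexResult = indexLookup(sortedKeys, 7); // around log2(1024) comparisons
```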

Picking an Index
db.col.find({x: 10, y: "foo"})
[Diagram: three candidate plans are tried: scan, index on x, index on y. The index on y is remembered; the other plans are terminated.]

Random Index Access
Have to keep entire index in ram
• random
• email address
• hash

Right-Balanced Index Access
Only have to keep small portion in ram
• Time Based
• ObjectId
• Auto Increment

> db.users.ensureIndex({uname: 1, first: 1, last: 1})

> db.users.find( { uname: "RossC0" }, {_id: 0, first: 1, last: 1})

Covered Indexes
Use just the index
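Whether a query like the one above can be answered from the index alone comes down to a field check. A simplified sketch, assuming inclusion-style projections only; the helper name and argument shapes are hypothetical:

```javascript
// True when both the filter fields and the returned fields are in the
// index, and _id is excluded (or is itself part of the index).
function isCovered(indexFields, queryFields, projection) {
  const inIndex = (field) => indexFields.includes(field);
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);
  const idHandled = projection._id === 0 || inIndex("_id");
  return idHandled && queryFields.every(inIndex) && returned.every(inIndex);
}

const index = ["uname", "first", "last"];
const covered = isCovered(index, ["uname"], { _id: 0, first: 1, last: 1 });
const notCovered = isCovered(index, ["uname"], { first: 1, last: 1 }); // _id is returned by default
```

Leaving out `_id: 0` is the usual reason a query that looks covered is not: `_id` is returned unless excluded, and it is not part of this index.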

Schema
• Schema and data usage critical for scaling and performance

• Understand data access patterns

• Use indexes but don't over index

Replication

http://www.flickr.com/photos/10335017@N07/4570943043

Replication
• mongoDB replication like MySQL replication
  • Asynchronous master/slave
• Replica sets
  • A cluster of N servers
  • All writes to primary
  • Reads can be to primary (default) or a secondary
  • Any (one) node can be primary
  • Consensus election of primary
  • Automatic failover
  • Automatic recovery

How mongoDB Replication works
• Set is made up of 2 or more nodes
[Diagram: Member 1, Member 2, Member 3]

How mongoDB Replication works
• Election establishes the PRIMARY
• Data replication from PRIMARY to SECONDARY
[Diagram: Member 2 is Primary; Member 1 and Member 3 replicate from it]

How mongoDB Replication works
• PRIMARY may fail
• Automatic election of new PRIMARY if majority exists
[Diagram: Member 2 is DOWN; Member 1 and Member 3 negotiate new master]
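The "if majority exists" condition is a strict majority of voting members. A minimal sketch of that check; the function name is hypothetical:

```javascript
// A new PRIMARY can be elected only if more than half of the
// voting members can still see each other.
function canElectPrimary(reachableVoters, totalVoters) {
  return reachableVoters > totalVoters / 2;
}

const threeNodeOneDown = canElectPrimary(2, 3); // 2 of 3 is a majority
const threeNodeTwoDown = canElectPrimary(1, 3); // 1 of 3 is not
const twoNodeOneDown = canElectPrimary(1, 2);   // 1 of 2 is not
```

The two-node case is why an odd number of voters, or an arbiter, is commonly used: losing either member of a two-node set leaves no majority.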

How mongoDB Replication works
• New PRIMARY elected
• Replica Set re-established
[Diagram: Member 2 is DOWN; Member 3 is Primary]

How mongoDB Replication works
• Automatic recovery
[Diagram: Member 3 is Primary; Member 2 is Recovering]

How mongoDB Replication works
• Replica Set re-established
[Diagram: Member 3 is Primary; Member 1 and Member 2 are back in the set]

> cfg = {
    _id : "myset",
    members : [
      { _id : 0, host : "germany1.acme.com" },
      { _id : 1, host : "germany2.acme.com" },
      { _id : 2, host : "germany3.acme.com" }
    ]
  }
> use admin
> db.runCommand( { replSetInitiate : cfg } )

Creating a Replica Set

Replica Set Member Types

• Normal {priority: 1}
• Passive {priority: 0}
  • Cannot be elected as PRIMARY
• Arbiters
  • Can vote in an election
  • Do not hold any data
• Hidden {hidden: true}
• Tagging:
  • {tags: {"dc": "germany", "rack": "r23s5"}}
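Put together, a configuration document using these member types might look like the sketch below. The `priority`, `arbiterOnly`, `hidden` and `tags` fields are replica set member options; the fourth host name is made up for illustration, and the helper at the end is not part of MongoDB:

```javascript
// Example replica set config mixing the member types from the slide.
const cfg = {
  _id: "myset",
  members: [
    { _id: 0, host: "germany1.acme.com" },                    // normal, priority defaults to 1
    { _id: 1, host: "germany2.acme.com", priority: 0 },       // passive, never PRIMARY
    { _id: 2, host: "germany3.acme.com", arbiterOnly: true }, // arbiter, votes but holds no data
    {
      _id: 3,
      host: "germany4.acme.com", // hypothetical host
      priority: 0,               // hidden members must have priority 0
      hidden: true,
      tags: { dc: "germany", rack: "r23s5" },
    },
  ],
};

// Illustration only: which members could become PRIMARY under this config.
const electable = cfg.members
  .filter((m) => !m.arbiterOnly && m.priority !== 0)
  .map((m) => m.host);
const dataBearing = cfg.members.filter((m) => !m.arbiterOnly).length;
```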

Safe writes
db.runCommand({getLastError: 1, w : 1})
• ensure write is synchronous
• command returns after primary has written to memory

w: n or w: 'majority'
• n is the number of nodes data must be replicated to
• driver will always send writes to Primary

w: 'my_tag'
• Each member is "tagged" e.g. "allDCs"
• Ensure that the write is executed in each tagged "region"

j: true
• Ensures changes are flushed to the Journal
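The numeric and 'majority' forms of w reduce to a count of acknowledging members. A minimal sketch of that rule; the function name is hypothetical, and tag-based modes and j: true are not modelled:

```javascript
// True once enough members have acknowledged the write for the
// requested w value (a number or the string "majority").
function isAcknowledged(w, ackedMembers, totalMembers) {
  const needed = w === "majority" ? Math.floor(totalMembers / 2) + 1 : w;
  return ackedMembers >= needed;
}

const primaryOnly = isAcknowledged(1, 1, 3);              // w: 1, primary has the write
const majorityPending = isAcknowledged("majority", 1, 3); // needs 2 of 3
const majorityReached = isAcknowledged("majority", 2, 3);
const allThreePending = isAcknowledged(3, 2, 3);          // w: 3, only 2 acked so far
```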

Replication features
• Reads from Primary are always consistent

• Reads from Secondaries are eventually consistent

• Can be used to scale reads

• Automatic failover if a Primary fails

• Automatic recovery when a node joins the set

Sharding

http://www.flickr.com/photos/60218876@N08/6888405266

What is Sharding?
• Ad-hoc partitioning
• Consistent hashing
  • Amazon Dynamo
• Range based partitioning
  • Google BigTable
  • Yahoo! PNUTS
  • mongoDB

mongoDB Sharding
• Automatic partitioning and management

• Range based

• Convert to sharded system with no downtime

• Fully consistent

> db.runCommand({addshard: "shard1"});
> db.runCommand({shardCollection: "mydb.users", key: {age: 1}})

How mongoDB Sharding works

Range keys from -∞ to +∞
Ranges are stored as "chunks"
[Diagram: a single chunk covering -∞ to +∞]

> db.users.save({age: 40})

How mongoDB Sharding works

Data is inserted
Ranges are split into more "chunks"
[Diagram: the range -∞ to +∞ splits into chunks -∞ to 40 and 41 to +∞]

> db.users.save({age: 40})
> db.users.save({age: 50})

How mongoDB Sharding works

More data inserted
Ranges are split into more "chunks"
[Diagram: chunk 41 to +∞ splits into 41 to 50 and 51 to +∞]

> db.users.save({age: 40})
> db.users.save({age: 50})
> db.users.save({age: 60})

How mongoDB Sharding works

[Diagram: chunk 51 to +∞ splits into 51 to 60 and 61 to +∞, leaving four chunks: -∞ to 40, 41 to 50, 51 to 60, 61 to +∞]

> db.users.save({age: 40})
> db.users.save({age: 50})
> db.users.save({age: 60})
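With the four chunks in place, finding the chunk for a document is a range lookup on the shard key. A sketch that follows the inclusive integer bounds drawn on the slides (MongoDB itself treats a chunk's upper bound as exclusive); the function name is hypothetical:

```javascript
// The four chunks from the slides, keyed on age.
const chunks = [
  { min: -Infinity, max: 40 },
  { min: 41, max: 50 },
  { min: 51, max: 60 },
  { min: 61, max: Infinity },
];

// Returns the position of the chunk whose range contains `age`.
function findChunk(chunkList, age) {
  return chunkList.findIndex((c) => age >= c.min && age <= c.max);
}

const chunkFor40 = findChunk(chunks, 40); // first chunk
const chunkFor50 = findChunk(chunks, 50); // second chunk
const chunkFor60 = findChunk(chunks, 60); // third chunk
const chunkFor75 = findChunk(chunks, 75); // last chunk
```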

How mongoDB Sharding works

[Diagram: all four chunks (-∞ to 40, 41 to 50, 51 to 60, 61 to +∞) are on shard1]

> db.users.save({age: 40})
> db.users.save({age: 50})
> db.users.save({age: 60})

How mongoDB Sharding works

[Diagram: the four chunks -∞ to 40, 41 to 50, 51 to 60, 61 to +∞]

> db.runCommand({addshard: "shard2"});
> db.runCommand({addshard: "shard3"});

How mongoDB Sharding works

[Diagram: the four chunks are still all on shard1]

> db.runCommand({addshard: "shard2"});
> db.runCommand({addshard: "shard3"});

How mongoDB Sharding works

[Diagram: the four chunks are spread across shard1, shard2 and shard3]

> db.runCommand({addshard: "shard2"});
> db.runCommand({addshard: "shard3"});
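The effect of adding shards can be pictured as spreading chunks evenly. The sketch below uses plain round-robin assignment as a stand-in; it is not MongoDB's balancer, and the function and chunk names are made up for illustration:

```javascript
// Assign chunks to shards in turn so that no shard holds more than
// one chunk above any other.
function spreadChunks(chunkNames, shardNames) {
  const placement = {};
  shardNames.forEach((shard) => {
    placement[shard] = [];
  });
  chunkNames.forEach((chunk, i) => {
    placement[shardNames[i % shardNames.length]].push(chunk);
  });
  return placement;
}

const chunkNames = ["-inf..40", "41..50", "51..60", "61..+inf"];
const oneShard = spreadChunks(chunkNames, ["shard1"]);
const threeShards = spreadChunks(chunkNames, ["shard1", "shard2", "shard3"]);
```

With one shard all four chunks land on shard1; with three, shard1 holds two chunks and the others one each.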

How mongoDB Sharding works

Sharding Features
• Shard data with no downtime
• Automatic balancing as data is written
• Commands routed (switched) to correct node
  • Inserts - must have the Shard Key
  • Updates - must have the Shard Key
  • Queries
    • With Shard Key - routed to nodes
    • Without Shard Key - scatter gather
  • Indexed Queries
    • With Shard Key - routed in order
    • Without Shard Key - distributed sort merge
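The routing rule for queries can be sketched as follows: with the shard key in the query, only the owning shard is contacted; without it, every shard is. This handles equality matches only, the chunk-to-shard placement is an assumed example, and the function name is hypothetical:

```javascript
// Example placement of the four age chunks across three shards.
const chunkMap = [
  { min: -Infinity, max: 40, shard: "shard1" },
  { min: 41, max: 50, shard: "shard2" },
  { min: 51, max: 60, shard: "shard3" },
  { min: 61, max: Infinity, shard: "shard1" },
];

// Returns the shards a query has to be sent to.
function targetShards(query, shardKey, placement) {
  if (!Object.prototype.hasOwnProperty.call(query, shardKey)) {
    // No shard key: scatter gather across every shard.
    return [...new Set(placement.map((c) => c.shard))];
  }
  const value = query[shardKey];
  const owner = placement.find((c) => value >= c.min && value <= c.max);
  return [owner.shard];
}

const routed = targetShards({ age: 45 }, "age", chunkMap);           // one shard
const scattered = targetShards({ uname: "RossC0" }, "age", chunkMap); // all shards
```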

Architecture

Scaling with mongoDB
• Schema & Index design
  • Simplest way to scale
• Replication
  • Provides High Availability
  • Can be used to automatically scale reads
• Sharding
  • Automatically scale writes

Any Questions?

@mongodb

conferences, appearances, and meetups
http://www.10gen.com/events

http://bit.ly/mongofb

Facebook | Twitter | LinkedIn
http://linkd.in/joinmongo

download at mongodb.org

support, training, and this talk brought to you by