Google Cloud and Couchbase Server: Zero to Millions of Operations in No Time: Couchbase Connect 2015

Couchbase on Google Cloud Platform
Ivan Santa Maria Filho, Performance Engineer/Technical Lead @ Google
David Haikney, Technical Support Manager @ Couchbase

Upload: couchbase

Post on 26-Jul-2015


TRANSCRIPT


Decreasing cost enables virtually limitless storage in the cloud. $600 can buy enough storage for the world’s music.

(Source: McKinsey Global Institute May 2011)

Computing as a utility is now available for easy purchase, provided from massively efficient data centers.

The internet allows for a model of real-time access to new innovation, information, and applications from a wide range of devices.

Affordable capacity

On-demand computing

Instant access

Trends

“People tend to overestimate what can be done in one year and to underestimate what can be done in five or ten years.”

Amara's Law

[Chart: average age of a company joining the S&P 500: 75 years, 25 years, 10 years; years shown: 1957, 2003, 2013]

A Global Software Based Network

For the past 15 years, Google has been building out the fastest, most powerful, highest-quality cloud infrastructure on the planet.

Google has been running some of the world’s largest distributed systems with unique and stringent requirements.

[Timeline, 2002 to 2013: GFS, MapReduce, Bigtable, Dremel, Colossus, Spanner, Compute Engine]

Innovating Software & Driving Technology Forward

Management

Mobile

Developer Tools

Compute

Networking

Big Data

Services

Storage

Best Price/Performance, Efficient, Highest Throughput - Lowest Latency, Predictable, Scalable, Easy to Analyze

Deliver the Best Performing Cloud Platform

• Provide expert benchmarking
• Support sales on competitive deals
• Capture a competitive view of our products to support business decisions
• Drive end-to-end performance improvements with internal teams
• Develop a vibrant community of benchmarkers and partners for cloud performance
• Guide our customers to better performance through talks, papers and tools

Management, Networking, Compute, Big Data, Storage, Mobile, Developer Tools

Compute Engine

Container Engine

App Engine

Compute

[Diagram: a spectrum from Flexibility to Agility, with your code running at every tier]

• Google Compute Engine: manage your infrastructure
• Replica Pools: provisioning and health checking
• Managed VMs: OS management, deployments, logging and monitoring
• App Engine (Managed Runtimes): manage your serving stack

Compute as a Spectrum

• Flexible and Familiar Infrastructure

• High Data Security

• Connect with the Google Network

• Fast and Easy Provisioning

• Flexible Billing

• Large and Powerful Disks

• Green Computing

• Partner Powered

Compute Engine Value

• Sub-hour Billing & Sustained-Use Discounts

• Up to 10TB Persistent Disk

• Over 44 Instance Types

• Advanced Networking

• Instance Metadata and Startup Scripts

• Load Balancing

• Monitoring

• Snapshotting

Compute Engine Features

Persistent Disks

[Diagram labels: Computer and VM, each with an OS IO Subsystem; Virtualization Layer; Network; Storage System made up of Storage Nodes, each with Cache and Storage; Storage Engine; Storage Log]

• Exposed as block devices

• “Sectors” are spread over several servers

• Read/write on one VM

• Read-only on up to 500 VMs

• Point in time snapshot to Google Cloud Storage

• Differential snapshots

• Global snapshot replication, restore anywhere in the world

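Standard PD
• Uses: Boot Volumes, Small Fileservers and DBs, Streaming IO
• Performance: Consistent Throughput, Burst Capability
• Capacity: Any size from 1GB to 10TB
• Encryption: Yes
• Live migration: Yes
• Redundancy: Yes
• Snapshots: Yes
• Checksums: Yes

SSD PD
• Uses: Heavy Use Databases
• Performance: Assured Throughput
• Capacity: Any size from 1GB to 10TB
• Encryption: Yes
• Live migration: Yes
• Redundancy: Yes
• Snapshots: Yes
• Checksums: Yes

Local SSD
• Uses: Internet Scale DBs, High Performance Scratch, Hadoop*
• Performance: Assured Throughput, Sub-ms Latency
• Capacity: Up to 4 x 400GB
• Encryption: Yes
• Live migration: Yes
• Redundancy: No
• Snapshots: No
• Checksums: No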

[Chart: IOPS/GB (log scale) for Standard PD, SSD PD and Local SSD]

Priced Per GB. No IO charges.

Couchbase on Google Compute Engine

The challenge

Several published claims of:

“1 Million writes per second with x nodes”

How does Couchbase measure up?

Making the small print bigger

• Dataset Size: 3 Billion Documents
• Working Set: 100% of documents
• Document Size: 200 Bytes
• Hardware: 50 x n1-standard-16, 500GB PD-SSD
• Latency: Measured by client after replication
• Durability: replicated to additional server

See next slide

Transparency matters: github.com/couchbaselabs/google_compute_benchmark

When has your write been written?
• Couchbase acknowledges writes at the managed cache
• Application can request callbacks:
  ○ When item is persisted to disk
  ○ When item has been replicated to another node
• Typically reserved for significant writes
• This benchmark uses replicated latency
• Guards against catastrophic loss of node + disk

Latency for each op calculated by client as follows:

(i) Clock starts ticking when client fires op

(ii) Clock stops only when client confirms second node has a copy
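The two-step timing rule above can be sketched in a few lines of Python. This is only an illustration, not the benchmark's actual client code: `write_and_wait` is a hypothetical callable standing in for a client call that returns once a second node has confirmed it holds a copy, and `fake_write` merely sleeps to simulate that round trip. The nearest-rank `percentile` helper is one assumed way of summarizing samples as median, 95th and 99th percentile latency.

```python
import math
import time
from typing import Callable, List


def timed_replicated_write(write_and_wait: Callable[[], None]) -> float:
    """Return the latency in milliseconds of one replicated write.

    write_and_wait is a hypothetical callable that fires the write and
    returns only once a second node has confirmed it holds a copy.
    """
    start = time.perf_counter()   # (i) clock starts when the client fires the op
    write_and_wait()              # blocks until replication is confirmed
    stop = time.perf_counter()    # (ii) clock stops on confirmation
    return (stop - start) * 1000.0


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def fake_write() -> None:
    # Stand-in for a real client call; sleeps briefly to simulate the round trip.
    time.sleep(0.001)


latencies = [timed_replicated_write(fake_write) for _ in range(20)]
summary = {p: percentile(latencies, p) for p in (50, 95, 99)}
print(summary)
```

Because the clock stops only after the replica confirms, every sample includes the network hop to the second node, which is why these latencies are higher than a cache-acknowledged write would show.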

The Benchmark Environment
• 32 Clients (n1-highcpu-8)
  ○ 8 CPUs
  ○ Running libcouchbase pillowfight
• 50 Servers (n1-standard-16)
  ○ 16 CPUs
  ○ 60 GB RAM
  ○ 500GB PD-SSD
  ○ Couchbase 3.0.2
  ○ debian-wheezy backports
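A rough back-of-the-envelope check, using the figures from the slides, shows why a 100% working set is plausible on this hardware. The sketch assumes one active copy plus one replica of each document and ignores keys, metadata and other per-item overhead, so the real footprint would be larger.

```python
DOCUMENTS = 3_000_000_000   # dataset size from the slides
DOC_BYTES = 200             # document size from the slides
SERVERS = 50                # n1-standard-16 nodes
RAM_PER_SERVER_GB = 60      # RAM per node from the slides
COPIES = 2                  # assumption: active copy plus one replica

GB = 10 ** 9  # decimal gigabytes, good enough for a rough estimate

raw_gb = DOCUMENTS * DOC_BYTES / GB      # document values only, one copy
total_gb = raw_gb * COPIES               # including the replica
per_server_gb = total_gb / SERVERS       # evenly spread across the cluster

print(f"raw data:   {raw_gb:.0f} GB")
print(f"with copy:  {total_gb:.0f} GB")
print(f"per server: {per_server_gb:.0f} GB of {RAM_PER_SERVER_GB} GB RAM")
```

Under these assumptions the document values come to about 24 GB per server, well under the 60 GB of RAM on each node.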

Tuning

OS
• Separate data volume
• Journaling disabled

Clients
• Number of clients (32)
• Number of threads (4)
• 100% sets
• Buffered not batched

Server
• Increased compaction threshold
• Fewer writer threads (2)

90% achieved straight out of the box

Results - 1 Million Writes per Second

Results

Data Set                  3 Billion         100 Million
Working Set               100%              100%
Sustained writes / s      1.1 Million       1 Million
Median latency            14 ms             18 ms
95th percentile latency   27 ms             36 ms
99th percentile latency   38 ms             48 ms
Instances                 n1-standard-16    n1-standard-8
Servers                   50                40
Storage                   500GB SSD-PD      500GB PD

Price / Performance $56.30 $45.90/hr

$21.28 $17.12/hr

Working With Google Compute Engine

Built for Big Data

● Billions of documents
● Millions of operations per second
● Consistent low latency even under high load
● Storage throughput
● Excellent price / performance
  ○ just $17.12/hr! Before sustained use discounts
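To put the hourly price in per-operation terms, the arithmetic can be sketched as follows. Pairing the $17.12/hr figure with a sustained rate of 1 million writes per second is an assumption drawn from the results slide, and the function name is purely illustrative.

```python
def cost_per_billion_writes(price_per_hour: float, writes_per_second: float) -> float:
    """Dollars spent per one billion writes at a sustained write rate."""
    writes_per_hour = writes_per_second * 3600
    return price_per_hour / writes_per_hour * 1_000_000_000


# Assumed pairing: $17.12/hr cluster sustaining 1 million writes per second.
cost = cost_per_billion_writes(17.12, 1_000_000)
print(f"${cost:.2f} per billion writes")
```

At those inputs the cluster performs 3.6 billion writes per hour, which works out to roughly $4.76 per billion writes before any sustained-use discount.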

Fantastic Ease of Use

● CLI tools make authentication and scripting simple
● Docs both simple and comprehensive
  ○ copy and paste examples
● Great provisioning times
● Performant out of the box

The Last Word

Couchbase + Google Compute Engine
Performance. Scalability. Availability.

cloud.google.com
Try our open source benchmarking: https://github.com/GoogleCloudPlatform/PerfKitBenchmarker