Getting Started with NoSQL Cloud Database Service on AWS

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 蔣宗恩, Technical Account Manager, AWS Enterprise Support 2017/06 Getting Started with NoSQL Cloud Database Service on AWS

TRANSCRIPT

Page 1

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

蔣宗恩, Technical Account Manager, AWS Enterprise Support

2017/06

Getting Started with NoSQL Cloud Database Service on AWS

Page 2

Agenda

1. What is NoSQL?

2. Relational (SQL) vs. non-relational?

3. What is DynamoDB?

4. DynamoDB Tables & Indexes

5. Scaling

6. Integration Capabilities

7. Demo

Page 3

What is NoSQL?

Page 4

Data volume since 2010

• 90% of stored data generated in last 2 years

• 1 terabyte of data in 2010 equals 6.5 petabytes today

• Linear correlation between data pressure and technical innovation

• No reason these trends will not continue over time

Page 5

Timeline of database technology

[Figure: timeline of database technology; axis label: Data Pressure]

Page 6

What is NoSQL?

NoSQL is a term to describe data stores that trade full ACID compliance for high availability and scale.

Atomicity: Single row/single item only
Consistency: Eventual consistency
Isolation: Dirty read
Durability: Data replication on commodity storage

Page 7

Why NoSQL?

• Dirty Reads?

• Eventual Consistency?

• Single row transactions only?

• Why would anybody trade ACID compliance for this?

Page 8

Relational (SQL) vs. non-relational?

Page 9

Relational vs. non-relational databases

[Figure: Traditional SQL shows a Primary and a Secondary DB and scales up; NoSQL shows multiple DB nodes and scales out]

Page 10

Scale Up vs Scale Out

Page 11

The CAP Theorem

Network partitions will happen in distributed systems:

[Figure: cluster of DB nodes; diagram of Consistency (C), Availability (A), and Partition Tolerance (P) with the pairings CA, AP, and CP]

Page 12

SQL vs. NoSQL schema design

NoSQL design optimizes for compute instead of storage

Page 13

Why NoSQL?

SQL                      | NoSQL
Optimized for storage    | Optimized for compute
Normalized/relational    | Denormalized/hierarchical
Ad-hoc queries           | Instantiated views
Scale vertically         | Scale horizontally
Good for OLAP            | Built for OLTP at scale
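To make the "normalized/relational" versus "denormalized/hierarchical" contrast concrete, here is a minimal Python sketch. The customer and order records are hypothetical and exist only for illustration; the point is that the relational shape needs a join at read time, while the denormalized item stores the already-joined view.

```python
# Hypothetical data: names and fields are illustrative only.

# Normalized (SQL-style): two "tables" linked by customer_id.
customers = [{"customer_id": 1, "name": "Jim"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "Toy"},
    {"order_id": 11, "customer_id": 1, "item": "Boots"},
]


def orders_for_customer_sql_style(customer_id):
    """Join at read time: compute is spent on every query."""
    customer = next(c for c in customers if c["customer_id"] == customer_id)
    return {
        "name": customer["name"],
        "orders": [
            {"order_id": o["order_id"], "item": o["item"]}
            for o in orders
            if o["customer_id"] == customer_id
        ],
    }


# Denormalized (NoSQL-style): the view is stored as one hierarchical item.
customer_items = {
    1: {
        "name": "Jim",
        "orders": [
            {"order_id": 10, "item": "Toy"},
            {"order_id": 11, "item": "Boots"},
        ],
    }
}


def orders_for_customer_nosql_style(customer_id):
    """Single key lookup: the join was paid for at write time."""
    return customer_items[customer_id]
```

Both functions return the same view; the denormalized form duplicates data in exchange for cheaper reads.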

Page 14

What is DynamoDB?

Page 15

Amazon’s Path to DynamoDB

[Figure: RDBMS, DynamoDB]

Page 16

Amazon DynamoDB

DynamoDB is a fully managed NoSQL document and key-value data store

Predictable Performance

Highly Available

Massively Scalable

Fully Managed

Low Cost

Page 17

Consistently low latency at scale

PREDICTABLE PERFORMANCE!

Page 18

High availability and durability

WRITES

• Replicated continuously to 3 Availability Zones

• Persisted to disk (custom SSD)

READS

• Strongly or eventually consistent

• No latency trade-off

Designed to support 99.99% availability

Built for high durability

Page 19

High availability and durability

DynamoDB automatically partitions data

• Partition key spreads data (and workload) across partitions

• Automatically partitions as data grows and throughput needs increase

High-scale app

Large number of unique hash keys + uniform distribution of workload across hash keys

[Figure: Partition 1..N]

Page 20

Fully managed service = automated operations

[Figure: comparison of a DB hosted on premises with DynamoDB]

Page 21

DynamoDB Tables & Indexes

Page 22

DynamoDB table structure

Table: Items, Attributes

Partition key

• Mandatory

• Key-value access pattern

• Determines data distribution

Sort key

• Optional

• Model 1:N relationships

• Enables rich query capabilities

All items for key: ==, <, >, >=, <=, “begins with”, “between”, “contains”, “in”, sorted results, counts, top/bottom N values

Page 23

Partition Keys

Key Space: 00 to 54, 55 to A9, AA to FF

Id = 1, Name = Jim: Hash (1) = 7B
Id = 2, Name = Andy, Dept = Eng: Hash (2) = 48
Id = 3, Name = Kim, Dept = Ops: Hash (3) = CD

Partition Key uniquely identifies an item

Partition Key is used for building an unordered hash index

Allows table to be partitioned for scale
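The placement of hashed keys in the key space can be sketched in Python. This is an illustration only: the three ranges are one reading of the key-space labels on the slide, the hash values are the ones printed on the slide, and `illustrative_hash` is a hypothetical stand-in, since DynamoDB's actual hash function is internal to the service.

```python
import hashlib

# Partition boundaries read from the slide's key space (00..FF in three parts).
PARTITION_RANGES = [
    (0x00, 0x54),  # Partition 1
    (0x55, 0xA9),  # Partition 2
    (0xAA, 0xFF),  # Partition 3
]


def partition_for_hash(hash_byte):
    """Return the 1-based partition number whose range contains hash_byte."""
    for number, (low, high) in enumerate(PARTITION_RANGES, start=1):
        if low <= hash_byte <= high:
            return number
    raise ValueError(f"hash value out of key space: {hash_byte:#x}")


def illustrative_hash(partition_key):
    """Hypothetical stand-in hash: first byte of MD5 of the key's string form."""
    return hashlib.md5(str(partition_key).encode("utf-8")).digest()[0]


# Hash values as printed on the slide: Hash(1) = 7B, Hash(2) = 48, Hash(3) = CD.
slide_hashes = {1: 0x7B, 2: 0x48, 3: 0xCD}
placement = {key: partition_for_hash(h) for key, h in slide_hashes.items()}
```

With the slide's values, Id 2 falls in the first range, Id 1 in the second, and Id 3 in the third, so consecutive Ids do not land next to each other. That is the sense in which the hash index is unordered.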

Page 24

Partition:Sort Key

Partition:Sort Key uses two attributes together to uniquely identify an item

Within unordered hash index, data is arranged by the sort key

No limit on the number of items (∞) per partition key, except if you have local secondary indexes

Key space: 00:0 to FF:∞

Partition 1 (00:0 to 54:∞), Hash (2) = 48
Customer# = 2, Order# = 10, Item = Pen
Customer# = 2, Order# = 11, Item = Shoes

Partition 2 (55 to A9:∞), Hash (1) = 7B
Customer# = 1, Order# = 10, Item = Toy
Customer# = 1, Order# = 11, Item = Boots

Partition 3 (AA to FF:∞), Hash (3) = CD
Customer# = 3, Order# = 10, Item = Book
Customer# = 3, Order# = 11, Item = Paper
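A small in-memory model can show why keeping items ordered by sort key inside each partition key makes range queries cheap. This is a sketch in plain Python, not the DynamoDB API; `put_item` and `query_between` are hypothetical helper names, and the data is the Customer#/Order# example from the slide.

```python
from bisect import bisect_left, bisect_right

# In-memory stand-in for a table with a composite (partition, sort) key.
# Each partition key maps to its items, kept in sort-key order.
table = {}


def put_item(customer, order, item):
    """Insert or overwrite the item identified by (customer, order)."""
    collection = table.setdefault(customer, [])
    sort_keys = [entry[0] for entry in collection]
    position = bisect_left(sort_keys, order)
    if position < len(collection) and collection[position][0] == order:
        collection[position] = (order, item)  # same composite key: replace
    else:
        collection.insert(position, (order, item))  # keep sort-key order


def query_between(customer, low, high):
    """Return items for one partition key with low <= sort key <= high."""
    collection = table.get(customer, [])
    sort_keys = [entry[0] for entry in collection]
    start = bisect_left(sort_keys, low)
    end = bisect_right(sort_keys, high)
    return collection[start:end]


# Data from the slide, inserted out of order on purpose.
put_item(2, 11, "Shoes")
put_item(2, 10, "Pen")
put_item(1, 10, "Toy")
put_item(1, 11, "Boots")
put_item(3, 11, "Paper")
put_item(3, 10, "Book")
```

A call such as `query_between(2, 10, 11)` touches only customer 2's collection and returns the orders already sorted, which mirrors the "between" and sorted-results capabilities that a sort key enables.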

Page 25

Partitions are three-way replicated

Replica 1, Replica 2, and Replica 3 each hold the same items across Partition 1, Partition 2, ... Partition N:

Id = 2, Name = Andy, Dept = Engg
Id = 3, Name = Kim, Dept = Ops
Id = 1, Name = Jim

Page 26

Local secondary index (LSI)

Alternate sort key attribute

Index is local to a partition key

Table: A1 (partition), A2 (sort), A3, A4, A5

LSIs
KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)

10 GB max per partition key, i.e. LSIs limit the # of range keys!
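The three projection types shown for the LSIs can be illustrated with a short Python sketch. The `project` helper and the sample item are hypothetical; the projection type names KEYS_ONLY, INCLUDE, and ALL are the ones on the slide.

```python
def project(item, key_attributes, projection_type, include=()):
    """Return the attributes an index entry would carry for this item.

    projection_type is one of "KEYS_ONLY", "INCLUDE", or "ALL".
    """
    if projection_type == "ALL":
        return dict(item)
    if projection_type == "KEYS_ONLY":
        wanted = set(key_attributes)
    elif projection_type == "INCLUDE":
        wanted = set(key_attributes) | set(include)
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: value for name, value in item.items() if name in wanted}


# Hypothetical item using the slide's attribute names A1..A5.
item = {"A1": "p", "A2": "s", "A3": 3, "A4": 4, "A5": 5}

# Key attributes = index partition key, index sort key, and the table's own key.
keys_only = project(item, ["A1", "A3", "A2"], "KEYS_ONLY")
include_a3 = project(item, ["A1", "A4", "A2"], "INCLUDE", include=["A3"])
everything = project(item, ["A1", "A5", "A2"], "ALL")
```

The narrower the projection, the less storage the index uses, at the cost of having to go back to the table for attributes that were not projected.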

Page 27

Global secondary index (GSI)

Alternate partition and/or sort key

Index is across all partition keys

Use composite sort keys for compound indexes

Table: A1 (partition), A2, A3, A4, A5

GSIs
INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)
KEYS_ONLY: A2 (partition), A1 (item key)

RCUs/WCUs provisioned separately for GSIs

Online indexing

Page 28

How do GSI updates work?

[Figure: Client, Table (primary table partitions), Global Secondary Index; labeled step: 2. Asynchronous update (in progress)]

If GSIs don’t have enough write capacity, table writes will be throttled!

Page 29

LSI or GSI?

LSI can be modeled as a GSI

If data size in an item collection > 10 GB, use GSI

If eventual consistency is okay for your scenario, use GSI!

Page 30

Scaling

Page 31

Scaling

Throughput

Provision any amount of throughput to a table

Size

Add any number of items to a table

- Max item size is 400 KB

- LSIs limit the number of range keys due to 10 GB limit

Scaling is achieved through partitioning

Page 32

Throughput

Provisioned at the table level

Write capacity units (WCUs) are measured in 1 KB per second

Read capacity units (RCUs) are measured in 4 KB per second

- RCUs measure strongly consistent reads

- Eventually consistent reads cost 1/2 of consistent reads

Read and write throughput limits are independent
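The capacity-unit rules on this slide translate into simple arithmetic. The sketch below assumes, beyond what the slide states, that item sizes are rounded up to the next 1 KB (writes) or 4 KB (reads) step; the function names and the sample workload are hypothetical.

```python
import math


def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: each write is counted in 1 KB steps, rounded up."""
    return math.ceil(item_size_kb / 1) * writes_per_second


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: each read is counted in 4 KB steps, rounded up.

    Eventually consistent reads cost half as much as strongly consistent ones.
    """
    units = math.ceil(item_size_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)


# Hypothetical workload: 6 KB items read 10 times per second,
# 2.5 KB items written 10 times per second.
strong_reads = read_capacity_units(6, 10)
eventual_reads = read_capacity_units(6, 10, strongly_consistent=False)
writes = write_capacity_units(2.5, 10)
```

Because reads and writes are provisioned independently, the read and write figures are sized separately and do not trade off against each other.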

Page 33

Partitioning Math

In the future, these details might change…

Number of Partitions
By Capacity: (Total RCU / 3000) + (Total WCU / 1000)
By Size: Total Size / 10 GB
Total Partitions: CEILING(MAX(Capacity, Size))
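The partitioning formulas can be written as a few lines of Python. This is a sketch of the arithmetic exactly as given on the slide, which itself warns that these details might change; the function names and the second sample table are hypothetical.

```python
import math


def partitions_by_capacity(total_rcu, total_wcu):
    """Partitions implied by provisioned throughput."""
    return total_rcu / 3000 + total_wcu / 1000


def partitions_by_size(total_size_gb):
    """Partitions implied by stored data (10 GB per partition)."""
    return total_size_gb / 10


def total_partitions(total_rcu, total_wcu, total_size_gb):
    """CEILING(MAX(capacity, size)) as stated on the slide."""
    return math.ceil(
        max(
            partitions_by_capacity(total_rcu, total_wcu),
            partitions_by_size(total_size_gb),
        )
    )


# Throughput-bound table: 5000 RCUs, 500 WCUs, 8 GB.
throughput_bound = total_partitions(5000, 500, 8)

# Hypothetical size-bound table: 1000 RCUs, 500 WCUs, 25 GB.
size_bound = total_partitions(1000, 500, 25)
```

Whichever of the two estimates is larger determines the partition count, so a small but busy table and a large but quiet table can end up with the same number of partitions.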

Page 34

Partitioning Example

Table size = 8 GB, RCUs = 5000, WCUs = 500

Number of Partitions
By Capacity: (5000 / 3000) + (500 / 1000) = 2.17
By Size: 8 / 10 = 0.8
Total Partitions: CEILING(MAX(2.17, 0.8)) = 3

RCUs and WCUs are uniformly spread across partitions

RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 8/3 = 2.67 GB

Page 35

What causes throttling?

If sustained throughput goes beyond provisioned throughput per partition

Non-uniform workloads

Hot keys/hot partitions

Very large bursts

Mixing hot data with cold data

Use a table per time period

From the example before:

Table created with 5000 RCUs, 500 WCUs

RCUs per partition = 1666.67

WCUs per partition = 166.67

If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB may throttle requests

- Solution: Increase provisioned throughput
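The throttling condition can be expressed as a small check. This Python sketch assumes provisioned throughput is divided evenly across partitions, as the earlier example states, and it ignores any burst allowance; the function names and traffic figures are hypothetical.

```python
def per_partition_limits(table_rcu, table_wcu, partition_count):
    """Provisioned throughput is split evenly across partitions."""
    return table_rcu / partition_count, table_wcu / partition_count


def may_throttle(partition_read_rate, partition_write_rate,
                 table_rcu, table_wcu, partition_count):
    """True if sustained traffic on ONE partition exceeds that partition's share."""
    rcu_limit, wcu_limit = per_partition_limits(
        table_rcu, table_wcu, partition_count
    )
    return partition_read_rate > rcu_limit or partition_write_rate > wcu_limit


# Table from the example: 5000 RCUs and 500 WCUs over 3 partitions.
# A hot key sends 2000 reads/s to a single partition.
hot_partition = may_throttle(2000, 0, 5000, 500, 3)

# The same 2000 reads/s spread evenly is about 667 reads/s per partition.
even_spread = may_throttle(2000 / 3, 0, 5000, 500, 3)
```

The hot-key case may be throttled even though 2000 is well below the table-level 5000 RCUs, which is why a uniform key distribution matters as much as the provisioned total.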

Page 36

To learn more, please attend:

Deep Dive on Amazon DynamoDB 3:55 p.m.– 4:35 p.m.

Page 37

Integration Capabilities

Page 38

Integration Capabilities

DynamoDB Streams

• Stream of table updates

• Asynchronous

• Exactly once

• Strictly ordered

• 24-hr lifetime per item

DynamoDB Triggers

• Implemented as AWS Lambda functions

• Your code scales automatically

• Java, Node.js and Python
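A trigger is a Lambda function that receives batches of stream records. The following Python sketch shows the general shape of such a handler; the `Records`, `eventName`, `dynamodb`, `Keys`, and `NewImage` fields follow the stream event format that Lambda delivers, while the sample event and the counting logic are hypothetical and only for illustration.

```python
def handler(event, context):
    """Count DynamoDB Streams records by event type and print each change."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        event_name = record["eventName"]  # INSERT, MODIFY, or REMOVE
        summary[event_name] = summary.get(event_name, 0) + 1
        keys = record["dynamodb"]["Keys"]  # keys in attribute-value encoding
        # NewImage is present only when the stream view type includes it.
        new_image = record["dynamodb"].get("NewImage")
        print(event_name, keys, new_image)
    return summary


# Hypothetical sample event in the shape Lambda passes for stream records.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"Id": {"N": "1"}},
                "NewImage": {"Id": {"N": "1"}, "Name": {"S": "Jim"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"Id": {"N": "2"}}},
        },
    ]
}

result = handler(sample_event, None)
```

Calling the handler locally with a sample event, as on the last line, is a convenient way to test trigger logic before attaching it to a stream.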

Page 39

Integration Capabilities

IAM

• Fine-grained access control via AWS IAM

• Table-, item-, and attribute-level access control

Elasticsearch integration

• Full-text queries

• Add search to mobile app

• Monitor IoT sensor status code

• App telemetry pattern discovery using regular expressions
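Item-level and attribute-level access control is expressed through IAM policy conditions. The Python sketch below builds one such policy document as an illustration: the `dynamodb:LeadingKeys`, `dynamodb:Attributes`, and `dynamodb:Select` condition keys are IAM condition keys for DynamoDB, while the account ID, region, table name, attribute names, and the choice of a Cognito identity variable are all hypothetical.

```python
import json

# Hypothetical values: account ID, region, and table name.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/GameScores"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            # Table level: the statement applies to this one table.
            "Resource": TABLE_ARN,
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Item level: partition key must equal the caller's identity.
                    "dynamodb:LeadingKeys": [
                        "${cognito-identity.amazonaws.com:sub}"
                    ],
                    # Attribute level: only these (hypothetical) attributes.
                    "dynamodb:Attributes": ["UserId", "GameTitle", "TopScore"],
                },
                # Require callers to name attributes rather than read all.
                "StringEqualsIfExists": {
                    "dynamodb:Select": "SPECIFIC_ATTRIBUTES"
                },
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```

The resulting JSON could be attached to a role assumed by application users, so that each user can read only the items whose partition key matches their own identity.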

Page 40

Reference Architecture

Page 41

Demo

Page 42

Architecture of a simple serverless web application

[Figure: users, internet, JavaScript, Amazon S3 Bucket, API Gateway, Lambda, DynamoDB, AWS Identity & Access Management]

Page 43

bit.ly/NoSQLDesignPatterns