Introduction to Riak · nosqlroadshow.com/dl/NoSQL-Berlin-2013/Presentations/... · 2013-06-28

© Copyright 2012-2013, Basho Technologies Inc. All rights reserved. For Internal Use Only.

Concepts, Architecture and Functionality

Introduction to Riak

Wednesday, 22 May 13

WHAT IS RIAK?

• Key-Value store + extras

• Distributed and horizontally scalable

• Fault-tolerant

• Highly available

• Built for the web

• Based on Amazon Dynamo

KEY-VALUE STORE

• Simple operations - GET, PUT, DELETE

• Value is opaque (mostly), with metadata

• Extras, e.g.

• Secondary Indexes (2i)

• MapReduce

• Commit Hooks

HORIZONTALLY SCALABLE

• Default configuration is optimized for a cluster

• Query load and data are spread evenly

• Add more nodes and get more:

• ops/second

• storage capacity

• compute power (for Map/Reduce)

FAULT TOLERANT

• All nodes participate equally - no single point of failure (SPOF)

• All data is replicated

• Cluster transparently survives...

• node failure

• network partitions

• Built on Erlang/OTP (designed for fault tolerance)

HIGHLY AVAILABLE

• Any node can serve client requests

• Fallbacks are used when nodes are down

• Always accepts read and write requests

• Per-request quorums

CAP THEOREM

• C = Consistency

• A = Availability

• P = Partition Tolerance

• The CAP theorem states that a distributed shared-data system can support at most two of these three properties

[Figure: clients and DB nodes separated by a network/data partition]

CORE CONCEPTS (1)

• Node - An Erlang VM running an instance of Riak

• Cluster - A collection of connected Riak nodes

• Bucket - Logical grouping of objects that share configuration

• Key - An identifier for a record/object

• Value - Opaque binary representation of data stored with key

CORE CONCEPTS (2)

• Metadata - Additional data linked to a record, not part of the value

• Vector clock - Establishes the causality of actions and tracks updates. Helps Riak resolve conflicts.

• Riak Object - Bucket, Key, Value and Metadata. Unit of replication.

• Consistent hashing - Keys are mapped with the cryptographic SHA-1 hash onto a ring of 2^160 values

CORE CONCEPTS (3)

• Partition - Logical division of storage

• Vnode - Process handling requests and managing a partition

• Ownership Handoff - Transfer of data on cluster change

• Hinted Handoff - Transfer of data on node/network failure

• Quorum - Set of nodes required to participate in a transaction

THE RING

REPLICATION

• Replicates to the designated vnode plus the following (n_val - 1) vnodes
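
The replication rule above can be sketched in Python. This is a simplified illustration, not Riak's implementation: the ring size of 64, the way bucket and key are joined before hashing, and the function names are all assumptions made for the example.

```python
import hashlib

RING_TOP = 2 ** 160  # size of the SHA-1 hash space


def partition_for(bucket: str, key: str, ring_size: int = 64) -> int:
    """Map a bucket/key pair to the index of a partition on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    # Each partition covers an equal slice of the hash space.
    return position // (RING_TOP // ring_size)


def preference_list(bucket: str, key: str, n_val: int = 3,
                    ring_size: int = 64) -> list[int]:
    """Designated partition plus the following (n_val - 1), wrapping around."""
    first = partition_for(bucket, key, ring_size)
    return [(first + i) % ring_size for i in range(n_val)]


print(preference_list("users", "user_id"))
```

Because the partitions are consecutive on the ring and neighbouring partitions are normally owned by different nodes, the n_val replicas tend to land on distinct physical machines.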

DISASTER SCENARIO

• Node fails

• Request goes to fallback

• Node comes back

• Handoff - data returned to recovered node

• Normal operations resume

[Figure: ring diagram for hash("user_id") with the failed node's vnodes marked X]

REQUEST QUORUMS

• Every request contacts all replicas of key

• N - number of replicas (default 3)

• R - read quorum

• W/DW - write quorum

• Quorum: The number of replicas that must respond to a read or write request before it is considered successful (default 2, calculated as floor(n_val / 2) + 1)
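
As a worked example of the formula above, the following Python sketch computes the quorum for a few replica counts. The function name is an assumption chosen for illustration.

```python
def quorum(n_val: int) -> int:
    """Quorum as given on the slide: floor(n_val / 2) + 1."""
    return n_val // 2 + 1


for n_val in (1, 2, 3, 5):
    print(f"n_val={n_val} -> quorum={quorum(n_val)}")
```

With the default n_val of 3 this yields 2, so a request can succeed while one replica is slow or unavailable.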

ANATOMY OF A REQUEST

[Figure: the client sends get("user_id") to a coordinating node in the Riak cluster. Its Get Handler (FSM) computes hash("user_id") == 10, 11, 12 on the ring (partitions 6 to 16 shown) and reads with R=2; the replicas hold versions v1 and v2.]

READ REPAIR

[Figure: get("user_id") with R=2 goes through the Get Handler (FSM) on the coordinating node. The replicas on the ring (partitions 6 to 16 shown) hold v1 and v2; the replica holding stale v1 is updated to v2.]
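
The read repair idea can be sketched as follows. This is a deliberately simplified model: versions are plain integers rather than vector clocks, and the vnode names and function name are hypothetical.

```python
def read_with_repair(replicas: dict[str, tuple[int, str]], r: int = 2):
    """replicas maps vnode name -> (version, value) for each responder.

    Returns the newest value and the list of vnodes that were repaired.
    """
    if len(replicas) < r:
        raise RuntimeError("not enough replicas responded to satisfy R")
    newest_version, newest_value = max(replicas.values(), key=lambda vv: vv[0])
    repaired = []
    for vnode, (version, _value) in list(replicas.items()):
        if version < newest_version:
            # Push the newer copy back to the stale replica.
            replicas[vnode] = (newest_version, newest_value)
            repaired.append(vnode)
    return newest_value, repaired


replicas = {"vnode10": (2, "v2"), "vnode11": (1, "v1"), "vnode12": (2, "v2")}
value, repaired = read_with_repair(replicas)
print(value, repaired)
```

The client still receives its answer once R replicas have responded; the repair of stale replicas happens as a side effect of the read.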

ANTI-ENTROPY

• Read-repair corrects inconsistencies on read only.

• Active Anti-Entropy is new in 1.3.0 and uses Merkle trees to compare data in partitions and periodically ensure consistency.

• Active Anti-Entropy runs as a background process

CONFLICT RESOLUTION

• Network partitions and concurrent actors modifying the same data cause data divergence.

• Riak provides two ways to manage this, which can be set at the bucket level:

• Last Write Wins - Naive approach but works for some use cases

• Vector Clocks - Retain “sibling” copies of data for merging

VECTOR CLOCKS

• Every node has an ID

• Send last-seen vector clock in every “put” request

• Can be viewed as ‘commit history’

• Auto-resolves stale versions

• Lets you decide conflicts
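
A minimal sketch of how vector clocks distinguish stale versions from true conflicts. Clocks are modelled here as plain dictionaries from actor ID to counter; the function names are assumptions, and Riak's own encoding differs.

```python
def increment(clock: dict[str, int], actor: str) -> dict[str, int]:
    """Return a copy of the clock with the actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: dict[str, int], b: dict[str, int]) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "concurrent (siblings)"


v0 = increment({}, "a")   # {"a": 1}
v1 = increment(v0, "a")   # {"a": 2}
print(compare(v1, v0))
print(compare({"a": 3}, {"a": 2, "b": 1}))
```

When one clock descends from the other, the older version can be discarded automatically. When neither descends, both values are concurrent and the decision is left to the application.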

VECTOR CLOCK EXAMPLE

[Figure: four steps on a ring with nodes 0 to 3. Steps 1 and 2 show Object v0 with vector clock [{a,1}]; step 3 shows Object v1 with [{a,2}] alongside Object v0 with [{a,1}]; step 4 shows only Object v1 with [{a,2}].]

SIBLING CREATION

[Figure: two steps on a ring with nodes 0 to 3. Two versions of Object v1 carry the vector clocks [{a,3}] and [{a,2},{b,1}]; neither clock descends from the other, so both versions are kept as siblings.]

• Siblings can be created by:

• Simultaneous writes (based on same object version)

• Network partitions

• Writes to existing key without submitting vector clock

SIBLING RESOLUTION

• If data can be represented as a monotonic set of unique data items or operations, resolution can be done through a set union, e.g. a shopping cart

• Store information that helps resolve conflicts in or with the object.

• Convergent / Commutative Replicated Data Types are emerging to help address this problem.
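
The set-union approach can be sketched as below. The function name and the shopping cart contents are illustrative assumptions.

```python
def resolve_cart_siblings(siblings: list[set[str]]) -> set[str]:
    """Merge sibling shopping carts by taking the union of their items."""
    merged: set[str] = set()
    for cart in siblings:
        merged |= cart
    return merged


siblings = [{"book", "pen"}, {"book", "lamp"}]
print(sorted(resolve_cart_siblings(siblings)))
```

A plain union only works when items are never removed: an item deleted in one sibling but still present in another reappears after the merge, which is one reason to store extra resolution information with the object.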

SIBLING EXPLOSION

• Without sibling resolution, the number of stored versions will continually grow, resulting in degraded performance across the cluster in the form of extremely high per-operation latencies or apparent unresponsiveness.

• Frequent updates of the same object can lead to sibling explosion.

• Inserting without first checking existence through read can lead to sibling explosion if objects are not unique.

STORAGE BACKENDS

• Bitcask

• LevelDB

• Memory

• Multi

BITCASK

• A fast, append-only key-value store

• In-memory key lookup table (key_dir)

• Closed files are immutable

• Merging cleans up old data

• Developed by Basho Technologies

• Suitable for bounded data, e.g. reference data
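
The append-only design with an in-memory lookup table can be illustrated with a toy store. This is a sketch of the idea only, not Bitcask's file format: the class name is hypothetical, there is a single data file, and merging is left out.

```python
import os
import tempfile


class TinyAppendOnlyStore:
    """Toy append-only store: values go to a log, positions live in memory."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.key_dir: dict[str, tuple[int, int]] = {}  # key -> (offset, length)
        open(path, "ab").close()  # make sure the log file exists

    def put(self, key: str, value: bytes) -> None:
        with open(self.path, "ab") as log:
            log.seek(0, os.SEEK_END)
            offset = log.tell()
            log.write(value)  # old values are never overwritten
        self.key_dir[key] = (offset, len(value))

    def get(self, key: str) -> bytes:
        offset, length = self.key_dir[key]
        with open(self.path, "rb") as log:
            log.seek(offset)
            return log.read(length)


with tempfile.TemporaryDirectory() as directory:
    store = TinyAppendOnlyStore(os.path.join(directory, "data.log"))
    store.put("k", b"old")
    store.put("k", b"new")
    print(store.get("k"))
```

Each read needs one lookup in memory and one seek on disk. The superseded value stays in the log until a merge removes it, and every key must fit in memory, which is why the backend suits bounded data.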

LEVELDB

• Key-Value storage developed by Google

• Append-only

• Multiple levels of SSTable-like data structures

• Allows for more advanced querying (2i)

• Open Source (BSD License)

• Suitable for unbounded data or advanced querying

MEMORY

• Data is never persisted to disk

• Typically used for “test” databases (unit tests, etc.)

• Definable memory limits per vnode

• Configurable object expiry

• Useful for highly transient data

MULTI

• Configure multiple storage engines for different types of data

• Configure the “default” storage engine

• Choose storage engine on per bucket basis

• No reason not to use it

CLIENT TYPES

• Riak supports two main client types:

• REST based HTTP Interface

• Easy to use from command line and simple scripts

• Useful if using intermediate caching layer, e.g. Varnish

• Protocol Buffers

• Optimized binary encoding standard developed by Google

• More efficient/performant than HTTP interface

CLIENT LIBRARIES

• Client libraries supported by Basho:

• Community supported languages and frameworks:

• C/C++, Clojure, Common Lisp, Dart, Django, Go, Grails, Griffon, Groovy, Erlang, Haskell, Java, .NET, Node.js, OCaml, Perl, PHP, Play, Python, Racket, Ruby, Scala, Smalltalk

BUCKET PROPERTIES

• ‘n_val’ - number of copies of each object to be stored

• ‘allow_mult’ / ‘last_write_wins’: Boolean conflict resolution parameters (true/false)

• Tunable consistency parameters: ‘r’, ‘w’, ‘dw’, ‘rw’, ‘pw’ and ‘pr’

• Allowed values: ‘all’, ‘quorum’, ‘one’ or an integer (default: ‘quorum’ for r/w/dw/rw, 0 for pw/pr)

• ‘precommit’, ‘postcommit’ and ‘backend’

CONSISTENCY PARAMETERS (1)

• R - Number of vnodes that need to agree when retrieving the object before returning a response

• W - Number of vnodes that must confirm receiving writes before returning a successful response

• DW - Number of replicas to commit to durable storage before returning a successful response (minimum 1 from Riak 1.3 onwards)

CONSISTENCY PARAMETERS (2)

• RW - Quorum for both operations (get and put) involved in deleting an object

• PR - Number of nodes read from that must not be fallback nodes. Setting this > 0 MAY cause reads to fail under certain network partitioning scenarios.

• PW - Number of replicas to commit to primary nodes before returning a successful response. Setting this > 0 MAY cause writes to fail under certain network partitioning scenarios.

TUNABLE CONSISTENCY

• R, W, DW, RW, PR, PW tunable per bucket as well as per request

• R + W > n_val provides consistency in fully operational cluster.

• n_val = 3 and R, W = ‘quorum’ (2) means one node that is slow or down can be tolerated (default setting)
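
The arithmetic behind these bullets can be checked with a short Python sketch. The helper names are assumptions; the symbolic values 'all', 'quorum' and 'one' are the ones listed under bucket properties.

```python
def resolve(setting, n_val: int) -> int:
    """Turn 'all', 'quorum', 'one' or an integer into a replica count."""
    symbolic = {"all": n_val, "quorum": n_val // 2 + 1, "one": 1}
    return symbolic[setting] if isinstance(setting, str) else int(setting)


def read_write_overlap(r, w, n_val: int = 3) -> bool:
    """R + W > n_val means every read set shares a replica with every write set."""
    return resolve(r, n_val) + resolve(w, n_val) > n_val


def tolerated_failures(setting, n_val: int = 3) -> int:
    """How many replicas may be slow or down while the request still succeeds."""
    return n_val - resolve(setting, n_val)


print(read_write_overlap("quorum", "quorum"))
print(read_write_overlap("one", "one"))
print(tolerated_failures("quorum"))
```

With the defaults, R and W both resolve to 2, so 2 + 2 > 3 holds and one replica can be unavailable. Setting both to 'one' lowers latency but gives up the overlap guarantee.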

GETTING BUCKET PROPERTIES

• Bucket properties are best retrieved via the HTTP interface

• URL path is /buckets/<bucket_name>/props

curl -X GET http://127.0.0.1:8098/buckets/test/props

{"props":{"name":"test","allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","small_vclock":50,"w":"quorum","young_vclock":20}}

SETTING BUCKET PROPERTIES

• Default bucket properties can be specified in the app.config file:

{default_bucket_props, [ {n_val,3}, {allow_mult,true}, {last_write_wins,false}]}

• Overriding the default bucket properties is best done via the HTTP interface, as Protocol Buffers do not support all parameters:

curl -X PUT -H "Content-Type: application/json" -d '{"props":{"allow_mult":true,"dw":1}}' http://127.0.0.1:8098/buckets/test/props

WHAT ARE SECONDARY INDEXES?

• Non-primary key lookups

• Defined as metadata

• Requires memory or LevelDB backend

• Two index types: integer and binary

• Two query modes: exact match or range query

• An index can have multiple values for an object

SECONDARY INDEXES (1)

• There are two special indexes automatically available: ‘$bucket’ and ‘$key’

• Indexes must follow naming convention: ‘<name>_int’ for integer indexes and ‘<name>_bin’ for binary indexes

• Queries return keys, not objects

• Indexes can be created by adding metadata to Riak objects.

• 2i query can be used as input to Map/Reduce

SECONDARY INDEXES (2)

• Secondary indexes can only be queried one at a time. Get around this by creating composite indexes, e.g. <customer_id>_<date>, that are suitable for exact match or range query based on your query patterns

• Uses document-based partitioning: index entries are stored locally with the object.

• All queries require a covering set of vnodes (ring_size / n_val)
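
The composite index workaround can be sketched with an in-memory stand-in for a binary index. Nothing here calls Riak: the class, the function names and the sample data are assumptions used only to show why a combined customer and date value supports both exact match and range queries.

```python
def composite_value(customer_id: str, date: str) -> str:
    """Build a composite index value of the form <customer_id>_<date>."""
    return f"{customer_id}_{date}"


class BinaryIndex:
    """In-memory stand-in for a *_bin index: sorted (index value, object key) pairs."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, str]] = []

    def add(self, value: str, key: str) -> None:
        self._entries.append((value, key))
        self._entries.sort()

    def exact(self, value: str) -> list[str]:
        return [key for v, key in self._entries if v == value]

    def range(self, start: str, end: str) -> list[str]:
        # Like a 2i query, this returns object keys, not the objects themselves.
        return [key for v, key in self._entries if start <= v <= end]


index = BinaryIndex()
index.add(composite_value("c42", "20130213"), "order1")
index.add(composite_value("c42", "20130301"), "order2")
index.add(composite_value("c43", "20130213"), "order3")
print(index.range("c42_20130201", "c42_20130228"))
```

Putting the customer ID first and a sortable date second means one range query answers "orders for this customer within this period", which two separate indexes could not do in a single query.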

DATA MODELING

• Content-Types

• Identify and plan for your query patterns

• Use natural and meaningful, possibly composite, keys that allow retrieval by key; this is by far the most efficient query method and enhances scalability.

• De-normalize data

• Time-boxing

MODELING TOOLS

• Key-Value

• Secondary Indexes (2i)

• Full-text search

• Map/Reduce

• Links

FULL-TEXT SEARCH

• Designed for searching prose/JSON/XML

• Lucene/Solr-like query interface

• Automatically indexes k/v pairs

• Can be used as input to Map/Reduce

• Customizable index schemas

• Flexible and puts less load on system than MapReduce

MAP/REDUCE

• For more involved queries

• Specify the input keys and process data in a sequence of “map” and “reduce” functions

• JavaScript or Erlang (JavaScript not recommended for heavy production use)

• Not designed for real-time processing

• Requires a covering set of vnodes to participate
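
The flow of input keys through a map phase and a reduce phase can be sketched in plain Python. This only models the data flow on one machine; Riak runs phases written in JavaScript or Erlang across the cluster, and the store layout and function names below are assumptions.

```python
from collections import Counter


def map_phase(obj: dict) -> list[tuple[str, int]]:
    """Emit (word, 1) for every word in the object's value."""
    return [(word, 1) for word in obj["value"].split()]


def reduce_phase(pairs: list[tuple[str, int]]) -> dict[str, int]:
    """Sum the counts emitted by the map phase."""
    totals: Counter = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(store: dict[str, dict], input_keys: list[str]) -> dict[str, int]:
    mapped: list[tuple[str, int]] = []
    for key in input_keys:          # the job starts from an explicit key list
        mapped.extend(map_phase(store[key]))
    return reduce_phase(mapped)


store = {
    "k1": {"value": "riak is distributed"},
    "k2": {"value": "riak is available"},
}
print(run_job(store, ["k1", "k2"]))
```

In a real cluster the map function runs next to the data on each vnode and only the intermediate results travel to the node that runs the reduce.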

LINKS

• Lightweight relationships, like <a>

• Includes a “tag”

• Built-in traversal operation (“walking”):

GET /riak/b/k/[bucket],[tag],[keep]

• Limited in number (part of metadata)

• Built on top of Map/Reduce

ACCESS PATTERNS (1)

• Analyze access and query patterns, including frequency, in design phase and optimize data model for these.

• Unlike in relational databases, modifying records already in the database is relatively expensive. Adding or modifying data/indexes requires read and write of every object.

• Try to perform majority of data access directly through keys for performance and scalability whenever possible

ACCESS PATTERNS (2)

• For data that will never be updated, enable last_write_wins for increased performance

• Consider Bitcask as a backend if records need to be expired from the system within a fixed amount of time. Deleting objects in Riak is relatively expensive as it involves read and write.

SEMANTIC KEYS

• Always try to pick keys that are natural and contain information about the record.

• Avoid UUIDs if this means direct key access cannot be used

• Semantic keys allow for efficient, direct lookups.

• Semantic keys allow the use of key filters, which can make migrations and bulk processing through MapReduce easier if necessary.

SIBLING MANAGEMENT

• Enable siblings wherever data is updated and loss of data is not acceptable

• Determine a strategy for sibling resolution for all applicable data

• If possible, consider serializing writes in the application layer to avoid/reduce sibling creation for frequently updated objects

DENORMALIZATION

[Figure: a relational model (Customer, CustomerContact, CustomerAddress, Order, Order Item, InvoiceCopy) is denormalized into three Riak objects:]

• Customer: [Customer Info] [Contacts] [Addresses] [Invoices] [Orders]. Key: <customer_id>

• Order: [Order Details] [Order Items] [Invoice]. Key: <customer_id>_<date>_<order_id>

• InvoiceCopy. Key: <customer_id>_<date>_<invoice_number>

INDEX OBJECTS

[Figure: a CustomerOrders index object pointing to individual Order objects]

Index object with the IDs of order objects and information that can be used for filtering and identifying

SECONDARY INDEX USAGE

• Stored with object and can therefore be maintained easily and updated with changing object values

• Create multiple indexes to allow for different types of searches.

• Create composite indexes to work around the limitation that only exact match or simple range queries are possible.

• Avoid using secondary indexes as the main access method

MULTIGETS

• Retrieve multiple records most efficiently by using several connections/threads to issue GET requests in parallel. This is the most scalable method.

• Do not use MapReduce for retrieving multiple records, as this does not scale as well as direct KV access and does not allow for quorum reads.
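
Parallel GETs can be sketched with a thread pool. The `fetch` argument stands for whatever single-key GET a client library provides; it is a placeholder, not a real Riak call, and the worker count of 8 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def multiget(fetch, keys, max_workers: int = 8) -> dict:
    """Issue one GET per key in parallel and collect the results by key."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        return dict(zip(keys, pool.map(fetch, keys)))


# Stand-in for a real client: a dictionary lookup that returns None when missing.
fake_bucket = {"a": 1, "b": 2}
print(multiget(fake_bucket.get, ["a", "b", "missing"]))
```

Each GET is an ordinary key lookup, so it keeps the per-request quorum behaviour and spreads the load over the nodes that own the keys.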

TIME BOXING AND ROLLUPS (1)

[Figure: writes and updates feed minute-level keys; an hourly batch rollup aggregates them into hour-level keys and a daily batch rollup into day-level keys.]

• Minute: GOOG_20130213_0900 ... GOOG_20130213_0959, GOOG_20130213_1000 ... GOOG_20130213_1059, GOOG_20130214_0900 ... GOOG_20130214_0959

• Hour: GOOG_20130213_09, GOOG_20130213_10, GOOG_20130214_09

• Day: GOOG_20130213, GOOG_20130214

TIME BOXING AND ROLLUPS (2)

• Sensible key choices allow for direct access based on data

• Several rollups of base data can be performed in order to allow different access or query patterns

• Bulk updates can be done by external application or possibly even through MapReduce

• Application logic or commit hooks can be used to catch out of bounds data and update rollups
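
The key scheme from the time-boxing slides can be generated as follows. The function names and the idea of summing values during the rollup are assumptions for illustration; the key formats match the GOOG_... examples.

```python
from datetime import datetime


def minute_key(symbol: str, ts: datetime) -> str:
    return f"{symbol}_{ts:%Y%m%d_%H%M}"


def hour_key(symbol: str, ts: datetime) -> str:
    return f"{symbol}_{ts:%Y%m%d_%H}"


def day_key(symbol: str, ts: datetime) -> str:
    return f"{symbol}_{ts:%Y%m%d}"


def rollup_hour(minute_values: dict[str, float]) -> dict[str, float]:
    """Sum minute-level values into hour keys (minute key minus its last two digits)."""
    hourly: dict[str, float] = {}
    for key, value in minute_values.items():
        hourly[key[:-2]] = hourly.get(key[:-2], 0.0) + value
    return hourly


ts = datetime(2013, 2, 13, 9, 0)
print(minute_key("GOOG", ts), hour_key("GOOG", ts), day_key("GOOG", ts))
```

Because every key can be computed from the symbol and a timestamp, a reader can fetch any minute, hour or day directly by key without running a query.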

MULTI DATACENTER REPLICATION (MDC)

• Allows data to be replicated between clusters in different data centers. Can handle larger latencies.

• Two synchronization modes that can be used together: real-time and full sync

• Set up as uni-directional links. 2 links can be set up for bi-directional replication.

• Can be used for backing up data

RIAK-CS

• Built on top of Riak and supports MDC

• Exposes an Amazon S3 API-compatible interface

• Supports multi-tenancy

• Per-tenant usage data and statistics on network I/O

• Supports objects of arbitrary content type up to 5 GB

• Often used to build private cloud storage

QUESTIONS?

Christian Dahlqvist, [email protected]
