
Evolving NoSQL Databases Without Downtime

Karla Saur - University of Maryland (now at Intel Labs)

Tudor Dumitraș - University of Maryland

Michael Hicks - University of Maryland


NoSQL Database Popularity


Updating Key-Value Store Data

Key-value store: a type of NoSQL database that maps key → value

KEY: String such as "objtype:id"

VALUE: Anything goes: JSON, Avro, Thrift, Protobuf, data structures, etc.

Updating Challenges:

– Schema is implicitly defined, with no set types or structure

– Large amounts of data, with multiple parties accessing the data

Updating Key-Value Store Data

Goals:

• Update the application's data in a key-value store

• On-line, without excessive delays or interruptions

App v0:

Keys    Values
key1    {"val": {"name": "Bob"}}

App v1 (data updated to new version):

Keys    Values
key1    {"person": {"firstname": "Bob"}}


KVolve Contributions

KVolve: Key-Value store evolve

• General approach to updating NoSQL key/values lazily

• Provides template framework to assist writing update specs

• Minimal changes to programs, no juggling multi-version code

• Near-zero overhead in normal operation, minimal impact during an update

Built as an extension to the Redis key-value store


Background: Keys

• Key namespaces conceptually divide the kinds of objects stored in the database

• Convention advised by Redis:

– Keys should have the format n:k, where n = namespace and k = key name

• Example:

– Key: order:1234 Val: {'id' : '222', 'name' : 'foo'}

– Key: customer:johndoe111 Val: {'email' : '[email protected]'}
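The n:k convention can be illustrated with a small helper. This is only a sketch: the function name `split_key` is hypothetical, and splitting at the last colon is an assumption of this example rather than something the slides specify.

```c
#include <stddef.h>
#include <string.h>

/* Splits "namespace:name" at the LAST ':' (an assumption of this sketch).
 * Copies the namespace into ns (capacity ns_cap) and returns a pointer to
 * the name inside key, or NULL if key has no ':' or ns is too small. */
const char *split_key(const char *key, char *ns, size_t ns_cap) {
    const char *sep = strrchr(key, ':');
    if (sep == NULL) return NULL;
    size_t ns_len = (size_t)(sep - key);
    if (ns_len + 1 > ns_cap) return NULL;
    memcpy(ns, key, ns_len);
    ns[ns_len] = '\0';
    return sep + 1;
}
```

For the key order:1234 this yields the namespace "order" and the key name "1234".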


Background: Values

JSON Value Example "Schema" Change

v0:

    {
      "_id": "4BD8AE97C4A580",
      "customerid": 99999,
      "name": "Foo Sushi Inc",
      "since": "12/12/2012",
      "order": {
        "orderid": "UXWE-122012",
        "orderdate": "12/12/2001",
        "orderItems": [
          {"product": "Fortune Cookies",
           "price": 19.99}
        ]
      }}

v1:

    {
      "_id": "4BD8AE97C4A580",
      "customerid": 99999,
      "name": "Foo Sushi Inc",
      "since": "12/12/2012",
      "order": {
        "orderid": "UXWE-122012",
        "orderdate": "12/12/2001",
        "orderItems": [
          {"product": "Fortune Cookies",
           "fullPrice": 19.99,
           "discountedPrice": 16.99}
        ]
      }}

KVolve Architecture

[Diagram: clients Client_X, Client_Y, and Client_U connected to Redis + KVolve. Client_X issues "get Kx" and Client_Y issues "get Ky"; the entries Kx → valX and Ky → valY carry version tags (v0, v1). Client_U sends the update spec "upd v0 to v1", and valY becomes valYupd.]

● Apps know their version number

● DB entries are individually tagged with the version number

● KVolve uses these tags to track updating

● The update spec defines functions that take a v0 value and produce a v1 value

● Connected clients are notified of the update

● As clients access the database entries, they are updated, so that the data is updated on-demand (lazily)

● We support limited upgrades to keys as well
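The lazy, on-access update described above can be sketched in a few lines of C. Everything here (the `entry` and `update` types, `add_prefix`, `lazy_get`) is hypothetical and only models the idea of version tags plus update functions; it is not KVolve's actual code.

```c
#include <stddef.h>
#include <string.h>

#define VAL_CAP 64

/* Hypothetical in-memory entry: a value plus its version tag. */
typedef struct {
    char key[32];
    char val[VAL_CAP];
    int  vers;
} entry;

/* Hypothetical update function type: rewrites a value in place. */
typedef void (*upd_fun)(char *val, size_t cap);

/* One registered update step: from_vers -> from_vers + 1. */
typedef struct {
    int     from_vers;
    upd_fun fun;
} update;

/* Illustrative transformation only: prefix the value with "v1:". */
static void add_prefix(char *val, size_t cap) {
    static const char prefix[] = "v1:";
    size_t plen = sizeof prefix - 1;
    size_t vlen = strlen(val);
    if (plen + vlen + 1 > cap) return;   /* not enough room: leave as-is */
    memmove(val + plen, val, vlen + 1);  /* shift value and its '\0' right */
    memcpy(val, prefix, plen);
}

/* Lazy get: bring the entry up to current_vers only when it is accessed.
 * Returns NULL if no update step is registered for the entry's version. */
const char *lazy_get(entry *e, const update *upds, size_t n_upds,
                     int current_vers) {
    while (e->vers < current_vers) {
        int applied = 0;
        for (size_t i = 0; i < n_upds; i++) {
            if (upds[i].from_vers == e->vers) {
                upds[i].fun(e->val, sizeof e->val);
                e->vers++;
                applied = 1;
                break;
            }
        }
        if (!applied) return NULL;
    }
    return e->val;
}
```

Entries that are never read keep their old version tag, so the cost of the migration is spread over later accesses instead of being paid up front.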

Control Flow for Redis and KVolve

● KVolve intercepts Redis commands prior to normal processing

● Adds a version field to the Redis value structure

● The version field is automatically added to the Redis database

● Control flow proceeds as usual after the version check

● The version check may lead to an update

Example: a client sends "set Kx my_val" to Redis + KVolve.

    void processBuff(redisClient *c) {
        kvProcCmd(c);   /* KVolve: intercepts the command, version check */
        procCmd(c);     /* normal Redis processing */
    }

    kvProcCmd(..) {
        /* body not shown on the slide */
    }

    procCmd(c) {
        /* c->v->vers was set by KVolve */
        set(c->k, c->v);
        addReply(c);
    }

Value structure with the added version field:

    robj *v
        type     : string
        ptr      : "my_val"
        encoding : 0
        vers     : 1
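The interception order on this slide can be modeled with a self-contained sketch. The types and functions below (`value_obj`, `client`, `kv_proc_cmd`, `proc_cmd`, `process_buff`) are hypothetical stand-ins, not Redis or KVolve internals; they only show the hook running before normal processing and tagging the value with a version.

```c
/* Hypothetical stand-in for the value object on the slide (robj + vers). */
typedef struct {
    char ptr[32];
    int  encoding;
    int  vers;
} value_obj;

/* Hypothetical stand-in for a client request: set k -> v. */
typedef struct {
    char      k[32];
    value_obj v;
    int       replied;
} client;

/* Version declared by the connected app (assumed to be 1 here). */
static int current_vers = 1;

/* KVolve-style hook: runs first and tags the value with the version. */
void kv_proc_cmd(client *c) {
    c->v.vers = current_vers;
}

/* Normal processing: stores the (already tagged) value and replies. */
void proc_cmd(client *c, value_obj *store) {
    *store = c->v;
    c->replied = 1;
}

/* Mirrors the slide's processBuff: hook first, then normal processing. */
void process_buff(client *c, value_obj *store) {
    kv_proc_cmd(c);
    proc_cmd(c, store);
}
```

Because the hook runs before the store, the normal code path needs no knowledge of versions; it simply stores whatever value object it is handed.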

KVolve Update Specifications

The program's code can be in any language; the update spec is written in C.

● Function to describe the update information:

    void kvolve_upd_spec(char *from_ns, char *to_ns, int from_vers, int to_vers, int n_funs, ...);

● Function to transform the values:

    typedef void (*kvolve_upd_fun)(char **key, void **value, size_t *val_len);
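As an illustration of the `kvolve_upd_fun` signature, here is a minimal update function that appends a suffix to a string value. The function name `upd_append_suffix` is hypothetical, and two points are assumptions of this sketch rather than documented behavior: that `val_len` excludes the terminating NUL, and that the function may replace `*value` with a malloc'd buffer while the caller keeps ownership of the old one.

```c
#include <stdlib.h>
#include <string.h>

/* Signature as shown on the slide. */
typedef void (*kvolve_upd_fun)(char **key, void **value, size_t *val_len);

/* Hypothetical update function: appends "_v1" to a string value.
 * Assumed ownership: the new buffer is malloc'd here; the old buffer is
 * left for the caller to release. */
void upd_append_suffix(char **key, void **value, size_t *val_len) {
    static const char suffix[] = "_v1";
    size_t old_len = *val_len;
    size_t new_len = old_len + sizeof suffix - 1;
    char *out = malloc(new_len + 1);
    (void)key;                  /* the key is left unchanged here */
    if (out == NULL) return;    /* on failure, leave the value untouched */
    memcpy(out, *value, old_len);
    memcpy(out + old_len, suffix, sizeof suffix);  /* copies the '\0' too */
    *value = out;
    *val_len = new_len;
}
```

A function like this would be passed as one of the trailing arguments of `kvolve_upd_spec`, with `n_funs` set to the number of functions supplied.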

KVolve Update Specifications: Values

Version 0:

    "orderItems": [
      {"product": "Fortune Cookies",
       "price": 19.99}
    ]

Version 1:

    "orderItems": [
      {"product": "Fortune Cookies",
       "fullPrice": 19.99,
       "discountedPrice": 16.99}
    ]

Update Function:

    void fun_upd_price(...) {
        json_t *ele;
        ...
        /* abbreviated on the slide: the old "price" becomes "fullPrice" */
        json_object_set(ele, "fullPrice", "price");
        ...
    }

Update Spec:

    kvolve_upd_spec("order", "order", 0, 1, 1, fun_upd_price);

KVolve Update Specifications: Keys

Version 1:

    amico:followers:alice
    amico:followers:bob
    amico:followers:charlie
    amico:followers:eve

Version 2:

    amico:followers:default:alice
    amico:followers:default:bob
    amico:followers:default:charlie
    amico:followers:default:eve

Update Spec:

    kvolve_upd_spec("amico:followers", "amico:followers:default", 1, 2, 0);

Update Function: (none)
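The key change in this example is a namespace prefix swap, which can be sketched as follows. The helper `rename_key` is hypothetical and only illustrates the string rewrite; it is not how KVolve implements key updates.

```c
#include <stdio.h>
#include <string.h>

/* Rewrites key from namespace from_ns to namespace to_ns, writing the
 * result to out. Returns 1 on success, 0 if key is not in from_ns or if
 * out (capacity out_cap) is too small. */
int rename_key(const char *key, const char *from_ns, const char *to_ns,
               char *out, size_t out_cap) {
    size_t ns_len = strlen(from_ns);
    if (strncmp(key, from_ns, ns_len) != 0 || key[ns_len] != ':') return 0;
    /* key + ns_len starts at the ':' that separates namespace and name. */
    int n = snprintf(out, out_cap, "%s%s", to_ns, key + ns_len);
    return n >= 0 && (size_t)n < out_cap;
}
```

Note that an already renamed key such as amico:followers:default:alice still begins with the old namespace, so a prefix test alone cannot tell whether the rename has been applied; the per-entry version tag is what distinguishes the two cases.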


KVolve: Steady-state overhead

• We benchmarked KVolve in various configurations (with and without existing updates installed) across several types of data structures (strings, lists, etc.)

• During normal (non-update) operation:

– Overhead is in the noise for single instructions (-0.77% to 2.49%)

– Overhead is minimal for pipelined instructions (sending instructions as a batch yielded -0.52% to 5.74% overhead)
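For reading these ranges, a relative overhead can be computed as below. The slides do not state the exact formula used, so treating overhead as the relative difference between a measured time and a baseline time is an assumption of this sketch, and `overhead_pct` is a hypothetical name.

```c
/* Relative overhead, in percent, of a measured time against a baseline
 * time. A negative result means the measured run was faster. */
double overhead_pct(double baseline, double measured) {
    return (measured - baseline) / baseline * 100.0;
}
```

Under this definition, a negative figure such as -0.77% simply means the KVolve run happened to finish slightly faster than the baseline, which is what "in the noise" suggests.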


KVolve: Application Benchmarks

From GitHub: examples of schema changes with Redis

● Amico models relationships for social networks: "A follows B"

– 200 lines of Ruby code, 10 versions, 2012-2013

– Added a prefix to all keys

● Redisfs uses Redis as the backend for a FUSE file system

– 2.2K lines of C code, 8 versions, 2011

– Added a prefix to some keys, added compression to all data


KVolve: Application Benchmarks

Amico

[Figure: queries per second (QPS) over time in seconds (x-axis 800 to 1100); legend entries: Eager, KVolve, Lazy Upd. The green line (right y-axis) shows the lazy transformation.]

● Renamed ~792K keys from the LiveJournal data set

● Offline migration took ~87s

KVolve: Application Benchmarks

RedisFS

[Figure: queries per second (QPS) over time in seconds (x-axis 40 to 130), with phases labeled "create files" and "access files"; legend entries: Eager, KVolve, Lazy Upd. The green line (right y-axis) shows the lazy transformation.]

● Updated ~123K total keys for files generated by PostMark

● Combined with Kitsune [Hayden et al. TOPLAS 2014] for no downtime


Future Work

● Distributed data updates:

Using locking/consensus mechanisms to expand beyond a centralized Redis instance

● Automatic generation of transformation functions in NoSQL updates:

A DSL or other tool to help update-writers


Thanks!

Code: https://github.com/plum-umd/kvolve
