Mio - a distributed Skip Graph based ordered KVS

Mio: a distributed Skip Graph based ordered KVS. Cybozu Labs, Inc. Taro Minowa (Higepon)

Upload: taro-higepon

Post on 22-Apr-2015


DESCRIPTION

Mio is a distributed ordered KVS: memcached + range search, Skip Graph based, written in Erlang. http://github.com/higepon/mio

TRANSCRIPT

Page 1: Mio - a distributed Skip Graph based ordered KVS

Mio: a distributed Skip Graph based ordered KVS

Cybozu Labs, Inc.
Taro Minowa (Higepon)

Page 2: Mio - a distributed Skip Graph based ordered KVS

Feb 26 2010 Mio - a Skip Graph based ordered KVS

Introducing myself

@higepon
Mona OS: http://www.monaos.org
Mosh: a fast Scheme interpreter
Outputz: http://outputz.com

Page 3: Mio - a distributed Skip Graph based ordered KVS

Summary

Mio is...
  a distributed ordered KVS
  memcached + range search
  Skip Graph based
  written in Erlang
  http://github.com/higepon/mio
  alpha quality

Page 4: Mio - a distributed Skip Graph based ordered KVS

Background

Page 6: Mio - a distributed Skip Graph based ordered KVS

RDBMS vs KVS

KVS: scalability, set/get, volatile
RDBMS: high functionality, transactions, SQL

They complement each other.

Page 9: Mio - a distributed Skip Graph based ordered KVS

Mio

KVS: scalability
RDBMS: high functionality
Mio: KVS + range search

Mio lightens the RDBMS workload.

Page 10: Mio - a distributed Skip Graph based ordered KVS

Range search?

Queries
  last 7 days
  prev/next
  top 10 ranking

SQL
  SELECT * FROM photos WHERE date BETWEEN xxx AND xx ORDER BY date LIMIT 10

The RDBMS handles these queries.
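To make the idea concrete, here is a minimal Python sketch of what such a range query asks of an ordered key space. It is an illustration only, not Mio's implementation or API: the function name `range_search`, the inclusive bounds, and the sample dates are all assumptions.

```python
import bisect

def range_search(sorted_keys, key1, key2, limit):
    """Return up to `limit` keys k with key1 <= k <= key2, in ascending order.

    `sorted_keys` must already be sorted; that ordering is what a plain
    hash-based KVS lacks and what an ordered KVS has to maintain.
    """
    lo = bisect.bisect_left(sorted_keys, key1)   # first index with key >= key1
    hi = bisect.bisect_right(sorted_keys, key2)  # first index with key > key2
    return sorted_keys[lo:hi][:limit]

# Hypothetical photo dates used as keys (ISO dates sort correctly as strings).
dates = ["2010-02-18", "2010-02-20", "2010-02-22", "2010-02-25", "2010-02-26"]
last_week = range_search(dates, "2010-02-20", "2010-02-26", limit=10)
first_two = range_search(dates, "2010-02-20", "2010-02-26", limit=2)
```

The same call shape covers "last 7 days" (a date range), "prev/next" (a range with limit 1) and "top 10" (a range with limit 10), which is why a single range operation is enough to take these queries off the RDBMS.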

Page 11: Mio - a distributed Skip Graph based ordered KVS

Mio

Page 12: Mio - a distributed Skip Graph based ordered KVS

The Challenges and Design Decisions

Range search
  ordered structure
  Skip Graphs algorithm
Scale-out
  distributed using Erlang functions
memcached-compatible I/F
Volatile
  keep it simple

Page 13: Mio - a distributed Skip Graph based ordered KVS

Skip Graphs
James Aspnes (2003)

Page 14: Mio - a distributed Skip Graph based ordered KVS

Supported operations

  search by key
  insert (join)
  remove
  range search by key1 and key2

Page 15: Mio - a distributed Skip Graph based ordered KVS

Set of sorted doubly linked lists

Same as railway stations
  All keys (stations) form a doubly linked list
  Each station knows only its left and right neighbors
  The list is kept sorted by key
Search for Shibuya starting from Ueno
  Go left one station at a time: O(n)

[Figure: Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi as one sorted doubly linked list]
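The station analogy can be sketched in a few lines of Python. This is a single-process illustration with assumed names (`Station`, `link`, `search`), not code from Mio, where each station is an Erlang process.

```python
class Station:
    """One key in the list; it knows only its left and right neighbors."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def link(keys):
    """Build a sorted doubly linked list and return a dict key -> Station."""
    stations = [Station(k) for k in sorted(keys)]
    for a, b in zip(stations, stations[1:]):
        a.right, b.left = b, a
    return {s.key: s for s in stations}

def search(start, key):
    """Walk from `start` toward `key`, one neighbor per hop.

    Returns (station, hops), or (None, hops) if the key is not in the list.
    """
    node, hops = start, 0
    while node.key != key:
        going_left = key < node.key
        nxt = node.left if going_left else node.right
        # Missing key: we ran off the end or stepped past where it would sit.
        if nxt is None or (going_left and nxt.key < key) \
                or (not going_left and nxt.key > key):
            return None, hops
        node, hops = nxt, hops + 1
    return node, hops

stations = link(["Ueno", "Shibuya", "Yoyogi", "Tamachi", "Shinjuku"])
found, hops = search(stations["Ueno"], "Shibuya")
```

Every hop moves exactly one station, so the number of hops grows linearly with the number of keys. That linear cost is what the express lanes are meant to reduce.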

Page 16: Mio - a distributed Skip Graph based ordered KVS

Make an express lane

Skip some stations
  Ueno -> Shinjuku -> Shibuya
  Tamachi is placed on another express lane

[Figure: local lane with Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi; express lane with Shinjuku and Ueno, skipping Tamachi]

Page 17: Mio - a distributed Skip Graph based ordered KVS

Multiple lanes

Level 0 lane
  all keys are in the list
Level n (n > 0) lane
  express lane
  lane n + 1 is more express than lane n

[Figure: Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi on lanes Level 0, Level 1, Level 2]

Page 18: Mio - a distributed Skip Graph based ordered KVS

Search

Start at the highest level and work down to the lower levels
A search can start from any station
O(log n)

[Figure: Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi on lanes Level 0, Level 1, Level 2]
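The multi-level search can be sketched as follows. This is a simplified, single-process model under stated assumptions: lane membership comes from a per-node bit string (a membership vector, as in the Skip Graph literature), the vectors below are hand-picked for the example rather than random, and the names `Node`, `build` and `search` are hypothetical. It is not Mio's code.

```python
from collections import defaultdict

class Node:
    def __init__(self, key, mv):
        self.key, self.mv = key, mv          # mv: membership vector, e.g. "01"
        self.left = [None] * (len(mv) + 1)   # one left/right link per level
        self.right = [None] * (len(mv) + 1)

def build(members):
    """members: dict key -> membership vector (all of the same length).

    On level i, nodes sharing the first i bits of their vector form one
    sorted doubly linked lane. Level 0 (empty prefix) holds every key.
    """
    nodes = [Node(k, mv) for k, mv in sorted(members.items())]
    for level in range(len(nodes[0].mv) + 1):
        lanes = defaultdict(list)
        for n in nodes:                      # nodes are already sorted by key
            lanes[n.mv[:level]].append(n)
        for lane in lanes.values():
            for a, b in zip(lane, lane[1:]):
                a.right[level], b.left[level] = b, a
    return {n.key: n for n in nodes}

def search(start, key):
    """Search from any node: highest level first, dropping a level whenever
    the next hop would overshoot. Returns (node or None, visited keys)."""
    node, path = start, [start.key]
    for level in reversed(range(len(start.mv) + 1)):
        while node.key != key:
            if key < node.key:
                nxt = node.left[level]
                ok = nxt is not None and nxt.key >= key
            else:
                nxt = node.right[level]
                ok = nxt is not None and nxt.key <= key
            if not ok:
                break                        # would overshoot: go one level down
            node = nxt
            path.append(node.key)
    return (node if node.key == key else None), path

# Assumed vectors: level 1 lane "0" = Shibuya, Shinjuku, Ueno (Tamachi is elsewhere).
graph = build({"Shibuya": "00", "Shinjuku": "00", "Tamachi": "10",
               "Ueno": "01", "Yoyogi": "11"})
found, path = search(graph["Ueno"], "Shibuya")
```

With these vectors the search from Ueno visits Ueno, Shinjuku, Shibuya and never touches Tamachi, matching the express-lane picture. The O(log n) bound on the slide is an expected bound that relies on random membership vectors, which this hand-built example does not use.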

Page 19: Mio - a distributed Skip Graph based ordered KVS

Range Search

Search for key1
Collect the matching keys on Level 0
e.g. key1 = Ueno, key2 = Shibuya

[Figure: Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi on lanes Level 0, Level 1, Level 2]
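A minimal sketch of the two steps, locate then collect, is shown below. Several things here are assumptions rather than facts about Mio: the bounds are treated as inclusive, the two keys are normalized so that either order works, the `limit` parameter is illustrative, and a plain walk on level 0 stands in for the multi-level search that would locate the first key.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.right = None                    # level-0 right neighbor only

def build_level0(items):
    """Link all items into one sorted level-0 list; return the leftmost node."""
    nodes = [Node(k, v) for k, v in sorted(items.items())]
    for a, b in zip(nodes, nodes[1:]):
        a.right = b
    return nodes[0]

def range_search(head, key1, key2, limit=None):
    lo, hi = min(key1, key2), max(key1, key2)
    node = head
    # Step 1: locate the first key >= lo. A real Skip Graph would use the
    # multi-level search here; this linear walk is only a stand-in.
    while node is not None and node.key < lo:
        node = node.right
    # Step 2: walk level 0 to the right, collecting until we pass hi.
    found = []
    while node is not None and node.key <= hi \
            and (limit is None or len(found) < limit):
        found.append((node.key, node.value))
        node = node.right
    return found

head = build_level0({"Shibuya": "v1", "Shinjuku": "v2", "Tamachi": "v3",
                     "Ueno": "v4", "Yoyogi": "v5"})
result = range_search(head, "Ueno", "Shibuya")
```

Because level 0 contains every key in sorted order, the collection step only ever follows neighbor links, so its cost is proportional to the number of results returned.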

Page 20: Mio - a distributed Skip Graph based ordered KVS

Remove

Remove on each level
Update the neighbors’ links
Highest level to lowest

[Figure: A B C becomes A C once B is removed]
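The removal order can be sketched as below, in a sequential setting only. The names `Node`, `link` and `remove` are hypothetical, and the concurrency handling that makes this hard in practice is deliberately left out.

```python
class Node:
    def __init__(self, key, levels):
        self.key = key
        self.left = [None] * levels
        self.right = [None] * levels

def link(a, b, level):
    """Make a and b neighbors on `level`; either side may be None."""
    if a is not None:
        a.right[level] = b
    if b is not None:
        b.left[level] = a

def remove(node):
    """Unlink `node` from every lane, from the highest level down to level 0."""
    for level in reversed(range(len(node.left))):
        # Join the two neighbors directly, then clear this node's own links.
        link(node.left[level], node.right[level], level)
        node.left[level] = node.right[level] = None

a, b, c = Node("A", 2), Node("B", 2), Node("C", 2)
link(a, b, 0)
link(b, c, 0)   # level 0: A <-> B <-> C
link(a, b, 1)   # level 1: A <-> B (C is assumed to sit on another lane)
remove(b)
```

One plausible reason for going from the highest level down is that the key stays reachable on level 0, where every search ends, until the last step. That rationale is an assumption here; the slides do not state it.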

Page 21: Mio - a distributed Skip Graph based ordered KVS

Insert

Insert on each level
Update the neighbors’ links
Lowest level to highest (the reverse order of remove)
In which express lane is a new station inserted?
  randomly located
  uniform

[Figure: A C becomes A B C once B is inserted]
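A sequential sketch of insertion follows. It assumes the usual Skip Graph scheme in which a new node draws a uniformly random bit string (membership vector) and, on level i, joins the lane of nodes sharing its first i bits. The names `random_mv`, `nearest`, `splice` and `insert` are hypothetical, duplicate keys are not handled, and nothing here addresses concurrent inserts or removes.

```python
import random

class Node:
    def __init__(self, key, mv):
        self.key, self.mv = key, mv
        self.left = [None] * (len(mv) + 1)
        self.right = [None] * (len(mv) + 1)

def random_mv(bits, rng=random):
    """Uniformly random membership vector: decides the node's express lanes."""
    return "".join(rng.choice("01") for _ in range(bits))

def splice(new, left, right, level):
    new.left[level], new.right[level] = left, right
    if left is not None:
        left.right[level] = new
    if right is not None:
        right.left[level] = new

def nearest(node, level, direction, prefix):
    """Walk lane `level` from `node` ('left' or 'right') to the first node
    whose membership vector starts with `prefix`."""
    cur = getattr(node, direction)[level]
    while cur is not None and not cur.mv.startswith(prefix):
        cur = getattr(cur, direction)[level]
    return cur

def insert(new, any_node):
    """Insert `new`, lowest level first. `any_node` is any existing node,
    or None when the graph is empty."""
    if any_node is None:
        return
    # Level 0: find the neighbors by key (linear walk as a stand-in for search).
    left, right = None, any_node
    while right.left[0] is not None:
        right = right.left[0]
    while right is not None and right.key < new.key:
        left, right = right, right.right[0]
    splice(new, left, right, 0)
    # Level i > 0: the neighbors are the closest nodes on level i - 1 that
    # share the first i bits of the new node's membership vector.
    for level in range(1, len(new.mv) + 1):
        prefix = new.mv[:level]
        splice(new,
               nearest(new, level - 1, "left", prefix),
               nearest(new, level - 1, "right", prefix),
               level)

# Fixed vectors keep the example deterministic; real nodes would use random_mv.
a, c, b = Node("A", "0"), Node("C", "1"), Node("B", "0")
insert(a, None)
insert(c, a)
insert(b, a)
```

After these calls level 0 reads A, B, C, while on level 1 A and B share lane "0" and C sits alone on lane "1".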

Page 22: Mio - a distributed Skip Graph based ordered KVS

Easy to implement?

No
Really simple, but...
We should support concurrent insert/remove
  What if a neighbor is removed during an insert?
  What if someone inserts another node next to a neighbor?
  Does a search crash?
Fragile linked list
We can’t find any perfect concurrent join algorithm.

Page 23: Mio - a distributed Skip Graph based ordered KVS

Our concurrent algorithm

Lock some nodes
Please read the source code :)
Defined three invariants

[Figure: nodes A, B, C shown in three successive states]

Page 24: Mio - a distributed Skip Graph based ordered KVS

Implementation

Page 25: Mio - a distributed Skip Graph based ordered KVS

Written in Erlang

A station (key, value) is a process
  gen_server process
  Holds left/right on each level
  Follow left/right = gen_server:call/2
  No distinction between local and remote processes
Erlang is great!
  Distributed with the -name option
  erl -name name@FQDN

Page 26: Mio - a distributed Skip Graph based ordered KVS

Performance

5000 qps on a single node
Really slow on multiple nodes
  need less communication between nodes
  need a better algorithm

Page 27: Mio - a distributed Skip Graph based ordered KVS

Demo

Page 28: Mio - a distributed Skip Graph based ordered KVS

Tips for practical Erlang

Max process option +P
  Set a proper value. Don’t use MAX.
garbage_collect()
  Fast enough, reduces memory usage.
  hibernate is slow...
refactorerl
fprof on gen_server shows nothing
  Use dynomite profile

Page 29: Mio - a distributed Skip Graph based ordered KVS

Tips for practical Erlang

Common Test
  coverage
  load test
gen_server:call is slow
  Use mnesia for property access.
  Easy replication
Easy to run
  Should users run erl with many options?
  Shell script borrowed from RabbitMQ

Page 30: Mio - a distributed Skip Graph based ordered KVS

Summary, once more

Mio is...
  a distributed ordered KVS
  memcached + range search
  Skip Graph based
  written in Erlang
  http://github.com/higepon/mio
  alpha quality