Mio - a distributed Skip Graph based ordered KVS
Mio: a distributed Skip Graph based ordered KVS
Taro Minowa (Higepon), Cybozu Labs, Inc.
Feb 26 2010 Mio - a Skip Graph based ordered KVS
Introducing myself
- @higepon
- Mona OS: http://www.monaos.org
- Mosh: a fast Scheme interpreter
- Outputz: http://outputz.com
Summary
Mio is...
- a distributed ordered KVS
- memcached + range search
- Skip Graph based
- written in Erlang
- http://github.com/higepon/mio
- in alpha quality
Background
RDBMS vs KVS
- KVS: scalability; set/get; volatile
- RDBMS: high functionality; transactions; SQL
KVS and RDBMS complement each other.
Mio
- KVS: scalability
- RDBMS: high functionality
- Mio: KVS + range search
- Makes the RDBMS workload lighter
Range search?
Queries
- last 7 days
- prev/next
- Top 10 ranking
SQL
- SELECT * FROM photos WHERE date BETWEEN xxx AND xx ORDER BY date LIMIT 10
RDBMS handles these queries
Mio
The Challenges and Design Decisions
Range search
- Ordered structure
- Skip Graphs algorithm
Scale-out
- distributed using Erlang functions
memcached compatible I/F
Volatile
- keep it simple
Skip Graphs
James Aspnes (2003)
Supported operations
- search by key
- insert (join)
- remove
- range search by key1 and key2
Set of sorted doubly linked lists
Same as railway stations
- All keys (stations) form a doubly linked list
- Each station knows only its left and right neighbors
- The list is kept sorted by key
Searching for Shibuya, starting from Ueno
- Go left, one station at a time: O(n)
[Figure: Shibuya <-> Shinjuku <-> Tamachi <-> Ueno <-> Yoyogi]
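The single-lane case can be sketched in a few lines of Python. This is only an illustration of the railway analogy, not Mio's code; the names Station, link and search are made up for this example.

```python
class Station:
    """One key in a sorted doubly linked list; it knows only its neighbors."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def link(keys):
    """Build the sorted list and return a dict of key -> Station."""
    stations = [Station(k) for k in sorted(keys)]
    for a, b in zip(stations, stations[1:]):
        a.right, b.left = b, a
    return {s.key: s for s in stations}


def search(start, key):
    """Walk one station at a time from `start`; return (station or None, hops)."""
    node, hops = start, 0
    while node.key != key:
        going_left = key < node.key
        nxt = node.left if going_left else node.right
        # Give up at the end of the line, or when the next station is
        # already past the place where `key` would have to be.
        if nxt is None or (going_left and nxt.key < key) \
                or (not going_left and nxt.key > key):
            return None, hops
        node, hops = nxt, hops + 1
    return node, hops


line = link(["Shibuya", "Shinjuku", "Tamachi", "Ueno", "Yoyogi"])
found, hops = search(line["Ueno"], "Shibuya")
print(found.key, hops)
```

Every hop visits exactly one neighbor, so the number of hops grows linearly with the number of stations between the start and the target.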
Make an express lane
Skip some stations
- Ueno -> Shinjuku -> Shibuya
- Tamachi is placed on another express lane
[Figure: Local lane with Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi; Express lane with Shinjuku and Ueno, skipping the stations in between]
Multiple lanes
Level 0 lane
- all keys are in the list
Level n (n > 0) lane
- express lane
- the level n + 1 lane is more express than the level n lane
[Figure: the stations Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi linked on Level 0, Level 1 and Level 2 lanes]
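One way to picture how stations end up on lanes is the membership-vector scheme from the Skip Graphs paper: each key gets a random bit string, and keys that share the first n bits share a level n lane. The sketch below assumes that scheme; the slides do not say how Mio encodes it, and the vectors and the build_lanes helper are hypothetical.

```python
from collections import defaultdict


def build_lanes(vectors, max_level):
    """vectors: dict of key -> membership vector string such as "01".

    Returns lanes[level][prefix] = sorted list of the keys on that lane.
    """
    lanes = []
    for level in range(max_level + 1):
        groups = defaultdict(list)
        for key in sorted(vectors):
            # Keys sharing the first `level` bits ride the same lane.
            groups[vectors[key][:level]].append(key)
        lanes.append(dict(groups))
    return lanes


# Hypothetical vectors, chosen so that Shinjuku and Ueno share an express lane.
vectors = {
    "Shibuya": "00",
    "Shinjuku": "10",
    "Tamachi": "01",
    "Ueno": "10",
    "Yoyogi": "00",
}
lanes = build_lanes(vectors, 2)
for level, groups in enumerate(lanes):
    print("Level", level, groups)
```

At Level 0 the prefix is empty, so every key lands on the same lane. Each extra bit splits a lane roughly in half, which is why higher lanes skip more stations.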
Search
- Start from the highest level and move down to lower levels
- Can start searching from any station
- O(log n)
[Figure: a search path over the stations Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi across Level 2, Level 1 and Level 0]
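The top-down search can be sketched as follows. This is a single-process illustration under the membership-vector assumption from the Skip Graphs paper, not Mio's implementation; Node, build and search are names invented for the example, and the vectors are hypothetical.

```python
class Node:
    def __init__(self, key, vector):
        self.key, self.vector = key, vector
        self.left = [None] * (len(vector) + 1)   # left[level]
        self.right = [None] * (len(vector) + 1)  # right[level]


def build(vectors):
    """Link nodes that share the first `level` vector bits, in key order."""
    nodes = {k: Node(k, v) for k, v in vectors.items()}
    levels = len(next(iter(vectors.values()))) + 1  # assumes equal lengths
    for level in range(levels):
        last = {}  # lane prefix -> last node linked on that lane
        for key in sorted(nodes):
            node = nodes[key]
            prev = last.get(node.vector[:level])
            if prev is not None:
                prev.right[level], node.left[level] = node, prev
            last[node.vector[:level]] = node
    return nodes


def search(start, key):
    """Return (node, path): the node holding `key`, or the closest node
    reached without overshooting, plus the keys visited on the way."""
    node, path = start, [start.key]
    for level in reversed(range(len(start.left))):
        while node.key != key:
            going_left = key < node.key
            nxt = node.left[level] if going_left else node.right[level]
            # Drop to the next lower lane at the end of this lane, or
            # when the next station would take us past the target.
            if nxt is None or (going_left and nxt.key < key) \
                    or (not going_left and nxt.key > key):
                break
            node = nxt
            path.append(node.key)
    return node, path


nodes = build({"Shibuya": "00", "Shinjuku": "10", "Tamachi": "01",
               "Ueno": "10", "Yoyogi": "00"})
found, path = search(nodes["Ueno"], "Shibuya")
print(found.key, path)
```

With these vectors the walk from Ueno skips Tamachi on the express lane, then drops to Level 0 for the last hop. The O(log n) bound is an expected cost and relies on random vectors and about log n levels.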
Range Search
- Search for key1
- Collect the matching keys on Level 0
- e.g. key1 = Ueno, key2 = Shibuya
[Figure: a range search over the stations Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi, collected on Level 0]
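A range search only needs the Level 0 lane once the first key has been located. The sketch below makes several assumptions that the slides do not confirm: both bounds are inclusive, key1 and key2 may be given in either order, and there is an optional limit parameter. The first step is a plain linear walk standing in for the multi-level search, and all names and values are hypothetical.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None  # Level 0 neighbors only


def link_level0(items):
    """Link all (key, value) pairs in key order; return the leftmost node."""
    nodes = [Node(k, v) for k, v in sorted(items.items())]
    for a, b in zip(nodes, nodes[1:]):
        a.right, b.left = b, a
    return nodes[0]


def find_first_at_or_after(head, key):
    """Stand-in for the multi-level search: first node with node.key >= key."""
    node = head
    while node is not None and node.key < key:
        node = node.right
    return node


def range_search(head, key1, key2, limit=None):
    """Collect (key, value) pairs between key1 and key2 in ascending order."""
    lo, hi = min(key1, key2), max(key1, key2)
    node = find_first_at_or_after(head, lo)
    out = []
    while node is not None and node.key <= hi:
        if limit is not None and len(out) >= limit:
            break
        out.append((node.key, node.value))
        node = node.right  # follow the Level 0 lane only
    return out


head = link_level0({"Shibuya": "v1", "Shinjuku": "v2", "Tamachi": "v3",
                    "Ueno": "v4", "Yoyogi": "v5"})
print(range_search(head, "Ueno", "Shibuya"))
print(range_search(head, "Ueno", "Shibuya", limit=2))
```

Because Level 0 holds every key in sorted order, the results come back already ordered, which is what the ORDER BY ... LIMIT style of query needs.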
Remove
- Remove on each level
- Update the neighbors' links
- From the highest level to the lowest
[Figure: A <-> B <-> C becomes A <-> C, with B unlinked]
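The unlinking order can be sketched like this. It is a single-threaded illustration only, so it ignores the concurrency problems discussed later in the talk; Node, build and remove are hypothetical names and the membership vectors are made up.

```python
class Node:
    def __init__(self, key, vector):
        self.key, self.vector = key, vector
        self.left = [None] * (len(vector) + 1)   # left[level]
        self.right = [None] * (len(vector) + 1)  # right[level]


def build(vectors):
    """Link nodes that share the first `level` vector bits, in key order."""
    nodes = {k: Node(k, v) for k, v in vectors.items()}
    levels = len(next(iter(vectors.values()))) + 1  # assumes equal lengths
    for level in range(levels):
        last = {}
        for key in sorted(nodes):
            node = nodes[key]
            prev = last.get(node.vector[:level])
            if prev is not None:
                prev.right[level], node.left[level] = node, prev
            last[node.vector[:level]] = node
    return nodes


def remove(node):
    """Unlink `node` on every lane, from the highest level down to Level 0."""
    for level in reversed(range(len(node.left))):
        left, right = node.left[level], node.right[level]
        # Make the two neighbors point at each other (A <-> C).
        if left is not None:
            left.right[level] = right
        if right is not None:
            right.left[level] = left
        node.left[level] = node.right[level] = None


nodes = build({"Shibuya": "00", "Shinjuku": "10", "Tamachi": "01",
               "Ueno": "10", "Yoyogi": "00"})
remove(nodes["Tamachi"])
print(nodes["Shinjuku"].right[0].key)  # Level 0 neighbor after removal
print(nodes["Shibuya"].right[1].key)   # Level 1 neighbor after removal
```

Working from the top down means the node stays reachable on Level 0 until the very last step, so lower-level walks still pass through it while the express lanes are being repaired.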
Insert
- Insert on each level
- Update the neighbors' links
- From the lowest level to the highest (the reverse order of remove)
In which express lane is a new station inserted?
- randomly located, uniformly distributed
[Figure: A <-> C with B outside becomes A <-> B <-> C]
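A bottom-up insert can be sketched as below. It assumes unique keys, a single thread, and the membership-vector scheme from the Skip Graphs paper, where the random vector decides which express lanes the new station joins. None of the names (Node, link, insert, random_vector) come from Mio.

```python
import random


class Node:
    def __init__(self, key, vector):
        self.key, self.vector = key, vector
        self.left = [None] * (len(vector) + 1)   # left[level]
        self.right = [None] * (len(vector) + 1)  # right[level]


def random_vector(bits, rng):
    """Pick each bit uniformly at random."""
    return "".join(rng.choice("01") for _ in range(bits))


def link(left, new, right, level):
    """Splice `new` between `left` and `right` on one lane."""
    new.left[level], new.right[level] = left, right
    if left is not None:
        left.right[level] = new
    if right is not None:
        right.left[level] = new


def insert(introducer, key, vector):
    """Insert from Level 0 upward, starting the walk at any existing node."""
    new = Node(key, vector)
    # Level 0: walk from the introducer to the sorted position.
    if introducer.key > key:
        left, right = introducer.left[0], introducer
    else:
        left, right = introducer, introducer.right[0]
    while left is not None and left.key > key:
        left, right = left.left[0], left
    while right is not None and right.key < key:
        left, right = right, right.right[0]
    link(left, new, right, 0)
    # Level n > 0: the nearest nodes on the lane below that share n bits.
    for level in range(1, len(vector) + 1):
        prefix = vector[:level]
        left = new.left[level - 1]
        while left is not None and left.vector[:level] != prefix:
            left = left.left[level - 1]
        right = new.right[level - 1]
        while right is not None and right.vector[:level] != prefix:
            right = right.right[level - 1]
        link(left, new, right, level)
    return new


rng = random.Random(0)
first = Node("Ueno", "10")  # hypothetical fixed vectors for the first nodes
nodes = {"Ueno": first}
for key, vec in [("Shibuya", "00"), ("Yoyogi", "00"), ("Shinjuku", "10")]:
    nodes[key] = insert(first, key, vec)
nodes["Tamachi"] = insert(nodes["Yoyogi"], "Tamachi", random_vector(2, rng))
print(nodes["Tamachi"].vector)
```

Linking Level 0 first means the new station is findable by every search as soon as the first step finishes; the express lanes only make later searches faster.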
Easy to implement?
No
- Really simple, but ...
We should support concurrent insert/remove
- What if a neighbor is removed while we are inserting?
- What if someone inserts another node next to a neighbor?
- Does searching crash?
Fragile linked list
- We can't find any perfect concurrent join algorithm.
Our concurrent algorithm
- Lock some nodes
- Please read the source code :)
- Defined three invariants
[Figure: three diagrams of the nodes A, B and C]
Implementation
Written in Erlang
A station (key, value) is a process
- gen_server process
- Holds left/right neighbors on each level
- Follow left/right = gen_server:call/2
- No distinction between local and remote processes
Erlang is great!
Distributed with the -name option
- erl -name name@FQDN
Performance
- 5000 qps on a single node
- really slow on multiple nodes
  - needs less communication between nodes
  - needs a better algorithm
Demo
Tips for practical Erlang
Max process option +P
- Set a proper value. Don't use MAX.
garbage_collect()
- Fast enough, reduces memory usage.
- hibernate is slow...
refactorerl
fprof on gen_server shows nothing
- Use dynomite profile
Tips for practical Erlang
Common Test
- Coverage
- load test
gen_server:call is slow
- Use mnesia for property access.
- Easy replication
Easy to run
- Should users run erl with many options?
- Shell script borrowed from RabbitMQ
Summary, once more
Mio is...
- a distributed ordered KVS
- memcached + range search
- Skip Graph based
- written in Erlang
- http://github.com/higepon/mio
- in alpha quality