Extend Redis with Modules
Itamar Haber
Who We Are
The open source home and commercial provider of Redis
Open source. The leading in-memory database platform, supporting any high performance OLTP or OLAP use case.
Chief Developer Advocate at Redis Labs
http://bit.ly/RedisWatch && (curat||edit||janit||)[email protected]
@itamarhaber
“He who can, does; he who cannot, teaches.”
– Bernard Shaw
~10 Things About Redis
1. Redis: REmote DIctionary Server
2. /rɛdɪs/: “red-iss”
3. OSS: http://github.com/antirez/redis
4. 3-clause BSD license: http://redis.io
5. In-memory: (always) read from RAM
6. A database for: 5 data structures
7. And: 4 (+1) more specialized ones
8. Developed & maintained: (mostly) by Salvatore Sanfilippo (a.k.a. @antirez) and his OSS team at @RedisLabs
9. Short history: v1.0 August 9th, 2009 … v3.2 May 6th, 2016
10. “The Leatherman™ of Databases”: mostly used as a DB, cache & broker
11. A couple (or so) of extra features: (a) atomicity; (b) blocking wait; (c) configurable persistence; (d) data expiration and (e) eviction; as well as transactions, PubSub, Lua scripts, high availability & clustering
12. Next version (v4.0): MODULES!
Why Redis
Simplicity, Versatility, Performance
“it is very fast”
Next 3 slides + ‘demo’
while(!eof)
Redis 101
1. Redis is “NoSQL”
0. No (explicit) schema, access by key
1. Key -> structure -> data
SIMPL-ICI-TY: simple, I see it, thank you
Redis data strata
v1.0 Strings, Lists, Sets
v1.2 Sorted Sets
v2.0 Hashes
v2.2 Bit arrays
v2.8.9 HyperLogLog
v3.2 Geo Sets, Bit fields
v4 Streams (?), MODULES!
How to Redis in 3 steps:
1. 147 OSS clients in 49 languages, e.g.: Java, Node.js, .NET, Python, Ruby…
2. You make a request, e.g.: PING
3. The server replies, e.g.: PONG
~$ redis-cli
127.0.0.1:6379> SET counter 1
OK
127.0.0.1:6379> GET counter
"1"
127.0.0.1:6379> INCRBY counter 1
(integer) 2
127.0.0.1:6379> APPEND counter b||!2b
(integer) 7
127.0.0.1:6379> GETSET counter "\x00Hello\xffWorld"
"2b||!2b"
127.0.0.1:6379>
The Evolution of Versatility
Flexibility: model (almost) anything with basic “building blocks” and simple rules (v0.0.1)
Composability: transactions (v1.2) and server-embedded scripted logic (v2.6)
Extensibility: modules (v4) for adding custom data structures and commands
MODULES! (a.k.a. plugins)
First mentioned in release v1.0
https://groups.google.com/forum/#!msg/redis-db/Z0aiVSRAnRU/XezAFFtgyPUJ
“Another interesting idea is to add support for plugins implementing specific commands and associated data types, and the embedding of a scripting language.”
Redis before modules:
1. Redis is ubiquitous for fast data, fits lots of cases (Swiss™ Army knife)
2. Some use cases need special care
3. Open source has its own agenda
So what can you do? FR, PR or fork
Redis with modules:
1. Core still fits lots of cases
2. Module extensions for special cases
3. A new community-driven ecosystem
4. “Give power to users to go faster”
What to expect? Nothing’s impossible!
Redis modules are:
1. Dynamically (server-)loaded libraries
2. Future-compatible
3. (will be mostly) written in C
4. (nearly) as fast as the core
5. Planned for public release Q3 2016
Modules let you:
1. Process: where the data is at
2. Compose: call core & other modules
3. Extend: new structures, commands
4. (planned) Time & keyspace triggers
5. (also) Blocking custom commands
6. (and) Cross-cluster parallelization
redis> ECHO "Alpha"
"Alpha"
redis> MODULE LOAD example.so
OK
redis> EXAMPLE.ECHO "Bravo"
"Bravo"
redis> ^C
~$ wc example.c
13 46 520 example.c
~$ gcc -fPIC -std=gnu99 -c -o example.o example.c
~$ ld -o example.so example.o -shared -Bsymbolic -lc
Callouts: ECHO is a core command; example.so is the module library; EXAMPLE.ECHO is the “new” command.
Redis Modules API
The API
1. Where most of the effort was made
2. Abstracts & isolates Redis’ internals
3. The server’s (C-) binding contract
4. Will not be broken once released
5. Exposes three conceptual layers
Modules API layers
1. Operational: admin, memory, disk, replication, arguments, replies…
2. High-level: client-like access to core and modules’ commands
3. Low-level: (almost) native access to the memory of core data structures
~$ cat example.c: operational-API-only example
#include "redismodule.h"

/* EXAMPLE.ECHO <message>: replies with its single argument. */
int Echo(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
  if (argc != 2) return RedisModule_WrongArity(ctx);
  return RedisModule_ReplyWithString(ctx, argv[1]);
}

/* Entry point: called by the server when the module is loaded. */
int RedisModule_OnLoad(RedisModuleCtx *ctx) {
  if (RedisModule_Init(ctx, "example", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
    return REDISMODULE_ERR;
  if (RedisModule_CreateCommand(ctx, "example.echo", Echo, "readonly", 1, 1, 1) == REDISMODULE_ERR)
    return REDISMODULE_ERR;
  return REDISMODULE_OK;
}
#include "redismodule.h"
int RedisModule_OnLoad(RedisModuleCtx *ctx) {
#include "redismodule.h": MUST, API definitions
RedisModule_OnLoad: MUST, is called when the module is loaded
RedisModuleCtx *ctx: pointer to context
RedisModuleCtx *ctx
1. The module’s call execution context
2. Used by most calls to the API, just pass it along
3. A black box: internal housekeeping structure for tracking memory allocations, objects, opened keys…
if (RedisModule_Init(ctx, "example", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
  return REDISMODULE_ERR;
if (RedisModule_CreateCommand(ctx, "example.echo", Echo, "readonly", 1, 1, 1) == REDISMODULE_ERR)
  return REDISMODULE_ERR;
RedisModule_Init: register the module or die trying
RedisModule_CreateCommand: register the command
int Echo(RedisModuleCtx *ctx, RedisModuleString **argv, int argc)
  if (argc != 2) return RedisModule_WrongArity(ctx);
  return RedisModule_ReplyWithString(ctx, argv[1]);
argv, argc: arguments & count
argc != 2: validate number of arguments & err if needed
RedisModule_ReplyWithString: send back the argument
RedisModule_ReplyWith
• Error – duh
• Null – no words
• LongLong – integer
• String – also Simple and Buffer
• Array – Redis array (can be nested)
• CallReply – High-Level API reply
High-Level API
RedisModule_Call(…)
• Does: runs a command
• Expects: context, command name, printf-like format and arguments
• Returns: RedisModuleCallReply *
• Not unlike: Redis’ Lua redis.call
int Educational_HighLevelAPI_Echo(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
  if (argc != 2) return RedisModule_WrongArity(ctx);
  RedisModule_AutoMemory(ctx);
  RedisModuleCallReply *rep = RedisModule_Call(ctx, "ECHO", "s", argv[1]);
  return RedisModule_ReplyWithCallReply(ctx, rep);
}
Using the High-Level API to call the Redis core ‘ECHO’ command...
...is impractical but educational :)
RedisModule_AutoMemory(…)
Automagically manages memory
• RedisModuleCallReply *
• RedisModuleString *
• RedisModuleKey *
• RedisModule_Alloc() and family
High-Level Visualization Of The Low-Level API
[Diagram: user app, Redis client, Redis core, data; request GET foo, reply "bar"]
[Diagram: user app, module, High-level API, data]
[Diagram: user app, module, Low-level API, data]
With the low-level API you can:
• Manage keys: open, close, type, length, get/set TTL, delete…
• Manipulate core data structures: e.g. RedisModule_StringSet(…), RedisModule_ListPop(…) and RedisModule_Zset*Range(…)
• Fine tune replication: RedisModule_Replicate*(…)
• Directly access String memory: RedisModule_StringDMA(…)
• Register custom data types: RedisModule_CreateDataType(…)
• And much more but…
Build it
• Get Redis unstable version
• Read the docs
• You can also use the Redis Labs Modules SDK to jumpstart: https://github.com/RedisLabs/RedisModulesSDK
• …
The Benchmark (Why Bother with Modules?)
[Chart: time needed, in seconds, for summing 1,000,000 Sorted Set scores. Bar labels: Python (local), Lua, high-level API, low-level API. Values as extracted: 1.2, 1.25, 1.05, 0.1]
On average, about 63.79% of all statistics are made up
Probabilistic Data Structures (PDSs)
There are three kinds of people in the world; those who can count and those who can’t.

There are three kinds of data structures…
…and those who both can count and can’t.
Data Structures of the 3rd kind
• Why: accuracy is (in theory) possible but scale makes it (nearly) impossible
• Example: number of unique visitors
• Alternative: estimate the answer
• Data structure: the HyperLogLog
• Ergo: modules as models for PDSs
The “good” PDSs are
1. Efficient: sublinear space-time
2. Accurate: within their parameters
3. Scalable: by merging et al.
4. Suspiciously not unlike: the Infinite Improbability Drive (The Hitchhiker’s Guide to the Galaxy, Adams D.)
Top-K - k most frequent samples
The entire algorithm:
1. Maintain set S of k counters
2. For each sample s:
2.1 If s exists in S, increment S(s)
2.2 Otherwise, if there’s space add s to S, else decrement all counters
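The steps above can be sketched in plain C, outside Redis. This is only an illustration under assumptions: the names (TopK, topk_add, topk_count), the fixed-size slot array, the length limits and the return codes are all hypothetical and are not taken from any module.

```c
#include <assert.h>
#include <string.h>

#define TOPK_MAX 16      /* hypothetical upper bound on k for this sketch */
#define TOPK_KEYLEN 32   /* hypothetical maximum sample length, incl. NUL */

typedef struct {
    char key[TOPK_KEYLEN];
    long count;                 /* 0 marks a free slot */
} TopKCounter;

typedef struct {
    int k;                      /* step 1: the set S holds k counters */
    TopKCounter slots[TOPK_MAX];
} TopK;

void topk_init(TopK *t, int k) {
    assert(k > 0 && k <= TOPK_MAX);
    memset(t, 0, sizeof(*t));
    t->k = k;
}

/* Returns the sample's counter, or 0 if the sample is not tracked. */
long topk_count(const TopK *t, const char *sample) {
    for (int i = 0; i < t->k; i++)
        if (t->slots[i].count > 0 && strcmp(t->slots[i].key, sample) == 0)
            return t->slots[i].count;
    return 0;
}

/* Step 2. Returns 1 if the sample was added, 0 if its counter was
 * incremented, -1 if there was no space and all counters were decremented. */
int topk_add(TopK *t, const char *sample) {
    int free_slot = -1;
    for (int i = 0; i < t->k; i++) {
        if (t->slots[i].count == 0) {
            if (free_slot < 0) free_slot = i;
        } else if (strcmp(t->slots[i].key, sample) == 0) {
            t->slots[i].count++;              /* 2.1: s exists in S */
            return 0;
        }
    }
    if (free_slot >= 0) {                     /* 2.2: there is space */
        strncpy(t->slots[free_slot].key, sample, TOPK_KEYLEN - 1);
        t->slots[free_slot].key[TOPK_KEYLEN - 1] = '\0';
        t->slots[free_slot].count = 1;
        return 1;
    }
    for (int i = 0; i < t->k; i++)            /* 2.2: full, decrement all */
        t->slots[i].count--;
    return -1;
}
```

With k = 2, feeding a, b, b and then c leaves b with a count of 1 and frees a's slot, so the next c is added as a new counter.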
Modeling Top-K with Redis
1. Sorted Set -> unique members
2. Member -> element and score
3. ZSCORE: O(1) membership
4. ZADD: O(log(N)) write
5. ZRANGEBYSCORE: O(log(N)) seek
Can this be moduled?
redis> TOPK.ADD tk 2 a
(integer) 1
redis> TOPK.ADD tk 2 b
(integer) 1
redis> TOPK.ADD tk 2 b
(integer) 0
redis> ZRANGE tk 1 -1 WITHSCORES
1) "a"
2) "1"
3) "b"
4) "2"
redis> TOPK.ADD tk 2 c
(integer) -1
TOPK.ADD arguments: key, max items (a.k.a. k), the sample
TOPK.ADD reply: 1 means added, 0 is a frequency increment, -1 indicates eviction
ZRANGE output: each member is followed by its score
redis> ZRANGE tk 1 -1 WITHSCORES
1) "b"
2) "2"
3) "c"
4) "2"
redis> TOPK.ADD tk 2 c
(integer) 0
redis> ZRANGE tk 1 -1 WITHSCORES
1) "b"
2) "2"
3) "c"
4) "3"
First ZRANGE: a evicted, c added; b’s and c’s score = 2 (global offset = -1)
topk Redis Module
1. Optimization: a global score offset
2. Eviction: reservoir sampling
3. TOPK.PRANK: percentile rank
4. TOPK.PRANGE: percentile range
5. Where: Redis Module Hub/topk
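One plausible reading of the "global score offset" optimization, sketched in plain C: rather than rewriting every counter when all of them must be decremented, keep a single offset and treat a counter's real value as stored + offset. Everything here (OffsetTopK, otk_add, otk_count, the slot array) is hypothetical, and the module's actual implementation and eviction policy may well differ.

```c
#include <assert.h>
#include <string.h>

#define OTK_MAX 16      /* hypothetical upper bound on k */
#define OTK_KEYLEN 32   /* hypothetical maximum sample length, incl. NUL */

typedef struct {
    char key[OTK_KEYLEN];
    long stored;                /* real count = stored + offset */
} OtkSlot;

typedef struct {
    int k;
    long offset;                /* global offset, only ever decreases */
    OtkSlot slots[OTK_MAX];
} OffsetTopK;

void otk_init(OffsetTopK *t, int k) {
    assert(k > 0 && k <= OTK_MAX);
    memset(t, 0, sizeof(*t));
    t->k = k;
}

/* Real count of slot i; a slot whose real count is 0 is free. */
static long otk_real(const OffsetTopK *t, int i) {
    long c = t->slots[i].stored + t->offset;
    return c > 0 ? c : 0;
}

long otk_count(const OffsetTopK *t, const char *sample) {
    for (int i = 0; i < t->k; i++)
        if (otk_real(t, i) > 0 && strcmp(t->slots[i].key, sample) == 0)
            return otk_real(t, i);
    return 0;
}

/* Returns 1 if added, 0 if incremented, -1 if every counter was decremented. */
int otk_add(OffsetTopK *t, const char *sample) {
    int free_slot = -1;
    for (int i = 0; i < t->k; i++) {
        if (otk_real(t, i) == 0) {
            if (free_slot < 0) free_slot = i;
        } else if (strcmp(t->slots[i].key, sample) == 0) {
            t->slots[i].stored++;
            return 0;
        }
    }
    if (free_slot >= 0) {
        strncpy(t->slots[free_slot].key, sample, OTK_KEYLEN - 1);
        t->slots[free_slot].key[OTK_KEYLEN - 1] = '\0';
        t->slots[free_slot].stored = 1 - t->offset;   /* real count of 1 */
        return 1;
    }
    t->offset--;     /* "decrement all counters" with a single write */
    return -1;
}
```

In this sketch a full structure only moves the offset; the sample that triggered the decrement is added on a later call, once a slot's real count has reached zero. A new sample is stored as 1 minus the offset, so with an offset of -1 its stored score is 2.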
Bloom filter – set membership
1. Answers: “have I seen this?”
2. Good for: avoiding hard work
3. Promises: no false negatives
4. Sometimes: false positives (error)
5. Gist: hash values of the samples are indexes in an array of counters
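The gist can be sketched as a tiny counting Bloom filter in plain C. The array size, the number of hash functions, the seeded FNV-1a hash and every name below (CountingBloom, cbf_add, cbf_check, cbf_remove) are assumptions made for illustration; a real filter would size these from its target error rate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CBF_SLOTS 1024   /* hypothetical array size */
#define CBF_HASHES 2     /* hypothetical number of hash functions */

typedef struct {
    uint8_t counters[CBF_SLOTS];
} CountingBloom;

/* Seeded FNV-1a, reduced to an array index; a stand-in for real hashes. */
static uint32_t cbf_hash(const char *s, uint32_t seed) {
    uint32_t h = 2166136261u ^ seed;
    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;
    }
    return h % CBF_SLOTS;
}

void cbf_init(CountingBloom *f) { memset(f, 0, sizeof(*f)); }

/* Bumps the counter at each of the sample's hash positions. */
void cbf_add(CountingBloom *f, const char *sample) {
    for (uint32_t i = 0; i < CBF_HASHES; i++) {
        uint8_t *c = &f->counters[cbf_hash(sample, i)];
        if (*c < UINT8_MAX) (*c)++;           /* saturate, never wrap */
    }
}

/* 1 = possibly seen (false positives happen), 0 = definitely not seen. */
int cbf_check(const CountingBloom *f, const char *sample) {
    for (uint32_t i = 0; i < CBF_HASHES; i++)
        if (f->counters[cbf_hash(sample, i)] == 0) return 0;
    return 1;
}

/* Counters, unlike single bits, let a previously added sample be removed. */
void cbf_remove(CountingBloom *f, const char *sample) {
    if (!cbf_check(f, sample)) return;
    for (uint32_t i = 0; i < CBF_HASHES; i++) {
        uint8_t *c = &f->counters[cbf_hash(sample, i)];
        if (*c > 0 && *c < UINT8_MAX) (*c)--; /* saturated counters stay put */
    }
}
```

A sample that was added always checks as 1, which is the "no false negatives" promise; a sample that was never added usually checks as 0, but may check as 1 when all of its positions were bumped by other samples.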
redis> CBF.ADD bloom a
(integer) 1
redis> CBF.ADD bloom b
(integer) 2
redis> CBF.CHECK bloom a
(integer) 1
redis> CBF.CHECK bloom b
(integer) 1
redis> CBF.CHECK bloom c
(integer) 0
[Diagram: counter array 0 1 0 2 1 0, indexed by h1(a), h2(a), h1(b), h2(b), h1(c), h2(c)]
“Good coders code, great coders reuse.”
– Peteris Krumins

“Good programmers know what to write. Great ones know what to rewrite (and reuse)”
– Eric S. Raymond
git clone bit.ly/dablooms
redablooms Redis Module
1. Error rate: defaults to 5%
2. Counting: 4-bit registers, allows removing samples, default capacity is 100,000 samples
3. Scalable: multiple filters layered
4. Redis Module Hub/redablooms
Count Min Sketch - item counts
1. Unlike Top-K: answers about any sample
2. WRT Bloom filters
Like: hashes as indexes to counters
Unlike: array per hash function, returns the minimum of counters
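A minimal Count Min Sketch in plain C makes the two differences concrete: one counter row per hash function, and a query that returns the minimum across rows. The depth, width, counter size, hash and all names (CountMinSketch, cms_incrby, cms_query) are illustrative assumptions, not any module's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CMS_DEPTH 2      /* hypothetical number of hash functions (rows) */
#define CMS_WIDTH 1024   /* hypothetical counters per row */

typedef struct {
    uint16_t rows[CMS_DEPTH][CMS_WIDTH];
} CountMinSketch;

/* Seeded FNV-1a, reduced to a column index; a stand-in for real hashes. */
static uint32_t cms_hash(const char *s, uint32_t seed) {
    uint32_t h = 2166136261u ^ seed;
    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;
    }
    return h % CMS_WIDTH;
}

void cms_init(CountMinSketch *c) { memset(c, 0, sizeof(*c)); }

/* Adds `by` to the sample's counter in every row, saturating at the top. */
void cms_incrby(CountMinSketch *c, const char *sample, uint16_t by) {
    for (uint32_t r = 0; r < CMS_DEPTH; r++) {
        uint16_t *cell = &c->rows[r][cms_hash(sample, r)];
        uint32_t sum = (uint32_t)*cell + by;
        *cell = sum > UINT16_MAX ? UINT16_MAX : (uint16_t)sum;
    }
}

/* Collisions only inflate counters, so the minimum is the best estimate. */
uint16_t cms_query(const CountMinSketch *c, const char *sample) {
    uint16_t min = UINT16_MAX;
    for (uint32_t r = 0; r < CMS_DEPTH; r++) {
        uint16_t v = c->rows[r][cms_hash(sample, r)];
        if (v < min) min = v;
    }
    return min;
}
```

Short of counter saturation, the estimate never undercounts: it is exact unless the sample collides with others in every row, in which case it overestimates.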
redis> CMS.INCRBY count a 1 b 2
OK
redis> CMS.QUERY count b
(integer) 2
[Diagram: two counter rows, h1 = 0 1 0 0 0 2 and h2 = 0 0 0 0 3 0; the 3 in the h2 row is a collision; the query returns min[h1(b), h2(b)]]
countminsketch Redis Module
1. Registers width: 16-bit
2. Default maximum error: 0.01%
3. Default error probability: 0.01%
4. Redis Module Hub/countminsketch
tdigest Redis Module
1. Purpose: streaming quantiles
2. Beauty: tiny, fast and parallelizable
3. Award: 1st community data type
4. Author: Usman Masood
5. Redis Module Hub/tdigest
redismodules.com: Redis Module Hub
What Is The Hub
1. Modules developed by: anyone
2. Certified by: Redis Labs
3. Licenses: Open Source & Commercial
4. (will be) Distributed via: Redis Cloud and Redis Labs Enterprise Cluster
5. Where: redismodules.com
Thank you
Further Reading
1. The Redis Open Source Project Website – http://redis.io
2. Redis source code on GitHub – http://github.com/antirez/redis
3. Getting started:
   1. An introduction to Redis data types and abstractions – http://redis.io/topics/data-types-intro
   2. Try Redis (in your browser) – http://try.redis.io
   3. Karl Seguin’s The Little Redis Book – http://openmymind.net/2012/1/23/The-Little-Redis-Book/
   4. Josiah Carlson’s Redis In Action – https://redislabs.com/ebook/redis-in-action
4. Redis documentation – http://redis.io/documentation
5. Redis commands – http://redis.io/commands
6. Redis community – http://redis.io/community
7. Redis Watch newsletter – https://redislabs.com/redis-watch-archive
8. STREAM data structure for Redis: let's design it together! – https://www.reddit.com/r/redis/comments/4mmrgr/stream_data_structure_for_redis_lets_design_it/
9. Redis Loadable Modules System – http://antirez.com/news/106
10. Introduction to Redis Modules API – https://github.com/antirez/redis/blob/unstable/src/modules/INTRO.md
11. Redis Modules API reference – https://github.com/antirez/redis/blob/unstable/src/modules/API.md
12. Creating a redis Module in 15 lines of code! – https://gist.github.com/dvirsky/83fc32366d5ad82fc3dca47ed2704377
13. Infinite Improbability Drive – https://en.wikipedia.org/wiki/Technology_in_The_Hitchhiker%27s_Guide_to_the_Galaxy#Infinite_Improbability_Drive
14. Streaming Algorithms: Frequent Items – https://people.eecs.berkeley.edu/~satishr/cs270/sp11/rough-notes/Streaming-two.pdf
15. Space/Time Trade-offs in Hash Coding with Allowable Errors – http://dmod.eu/deca/ft_gateway.cfm.pdf
16. Approximating Data with the Count-Min Data Structure – http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf
17. Computing Extremely Accurate Quantiles Using T-Digests – https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf