RedisConf17 - Using Redis at scale @ Twitter
![Page 1: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/1.jpg)
Nighthawk
Distributed caching with Redis @ Twitter
Rashmi Ramesh (@rashmi_ur)
![Page 2: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/2.jpg)
Agenda
What is Nighthawk?
How does it work?
Scaling out
High availability
Current challenges
![Page 3: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/3.jpg)
Nighthawk - cache-as-a-service
Runs Redis at its core
> 10M QPS
Largest cluster runs ~3K Redis nodes
> 10TB of data
![Page 4: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/4.jpg)
Who uses Nighthawk?
Some of our biggest customers:
Analytics services - Ads, Video
Ad serving
Ad Exchange
Direct Messaging
Mobile app conversion tracking
![Page 5: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/5.jpg)
Design Goals
Scalable: scale vertically and horizontally
Elastic: add / remove instances without violating SLA
High throughput and low latencies
High availability in the event of machine failures
Topology agnostic client
![Page 6: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/6.jpg)
Nighthawk Architecture
[Diagram] Client -> Proxy/Routing layer -> Backend 0 ... Backend N
Each backend runs Redis 0 ... Redis N
Also shown: Topology, Cluster manager
![Page 7: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/7.jpg)
Cache backend
Mesos container:
- Redis nodes (diagram labels: 1, 2, 3, NM)
- Topology watcher and announcer
Proxy/Router:
- Replica 1 -> Redis1
- Replica 2 -> Redis2
- Replica 3 -> Redis3
Topology:
- Redis1(dc, host, port1, capacity)
- Redis2(dc, host, port2, capacity)
- Redis3(dc, host, port3, capacity)
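The two tables on this slide can be read as a pair of lookups: the proxy maps a replica to a Redis node name, and the topology maps that name to an address. A minimal sketch of that reading follows. The class name `RedisNode`, the function `resolve`, and all hosts, ports, and capacity values are made up for illustration, and the capacity unit is not given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedisNode:
    dc: str        # datacenter
    host: str
    port: int
    capacity: int  # unit not stated on the slide


# Topology: what each Redis node announces about itself (values are hypothetical).
topology = {
    "Redis1": RedisNode("dc1", "host-a", 6379, 1024),
    "Redis2": RedisNode("dc1", "host-a", 6380, 1024),
    "Redis3": RedisNode("dc1", "host-a", 6381, 1024),
}

# Proxy/Router view: which Redis node holds which replica.
routing = {
    "Replica 1": "Redis1",
    "Replica 2": "Redis2",
    "Replica 3": "Redis3",
}


def resolve(replica: str) -> tuple:
    """Return the (host, port) the proxy would connect to for a replica."""
    node = topology[routing[replica]]
    return node.host, node.port
```

The client never sees either table, which is what keeps it topology agnostic.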
![Page 8: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/8.jpg)
Cluster manager
Manages topology membership and changes
- (Re)Balances replicas
- Reacts to topology changes, e.g. a dead node
- Replicated cache: ensures the 2 replicas of a partition are on separate failure domains
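The failure-domain rule can be sketched as a filter over candidate nodes. This is an illustration only, assuming host and rack are the failure domains, as a later slide suggests. `Node` and `pick_second_replica` are hypothetical names, not part of Nighthawk.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    name: str
    host: str
    rack: str


def pick_second_replica(primary: Node, candidates: Sequence[Node]) -> Optional[Node]:
    """Return the first candidate that shares neither host nor rack with the primary."""
    for node in candidates:
        if node.host != primary.host and node.rack != primary.rack:
            return node
    return None  # no placement satisfies the constraint
```

A real cluster manager would also weigh capacity and load when choosing among the valid candidates.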
![Page 9: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/9.jpg)
Redis databases for partitions
Partition -> Redis DB
Granular key remapping
Logical data isolation
Enumerating a partition - SCAN on its Redis DB
Deleting a partition - FLUSHDB
Enables replica rehydration
[Diagram] Keys K1, K4, K2, K3 grouped into Partition X and Partition Y, stored in Redis DBs 1 and 2
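The partition-to-DB idea can be modelled in memory to show why enumeration and deletion become per-partition operations. The sketch below does not talk to Redis. `FakeRedisNode`, `partition_for`, and the CRC32 hash are assumptions for illustration. On a real node the equivalent steps would be `SELECT <db>` followed by `SCAN` or `FLUSHDB`.

```python
import zlib

NUM_PARTITIONS = 100  # partition count used on the scaling slides


def partition_for(key: str) -> int:
    """Hash a key to a partition id (the hash function is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class FakeRedisNode:
    """In-memory stand-in for one Redis instance with numbered databases."""

    def __init__(self):
        self.dbs = {}           # db index -> {key: value}
        self.partition_db = {}  # partition id -> db index

    def host_partition(self, partition: int) -> int:
        db = len(self.partition_db)  # next unused DB index
        self.partition_db[partition] = db
        self.dbs[db] = {}
        return db

    def _db_for(self, key: str) -> dict:
        return self.dbs[self.partition_db[partition_for(key)]]

    def put(self, key: str, value: str) -> None:
        self._db_for(key)[key] = value

    def get(self, key: str):
        return self._db_for(key).get(key)

    def scan_partition(self, partition: int) -> list:
        """Enumerate one partition, like SCAN inside its DB."""
        return sorted(self.dbs[self.partition_db[partition]])

    def drop_partition(self, partition: int) -> None:
        """Delete one partition, like FLUSHDB on its DB."""
        self.dbs[self.partition_db[partition]].clear()
```

Because each partition lives in its own DB, moving or rehydrating a partition never requires scanning keys that belong to other partitions.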
![Page 10: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/10.jpg)
Scaling
![Page 11: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/11.jpg)
Scaling out with Client/Proxy managed partitioning
Key count: 1.5M keys
Components: Client
Keys per node: 500K, 500K, 500K
![Page 12: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/12.jpg)
Scaling out with Client/Proxy managed partitioning
Key count: 1.5M keys
Remapped keys: 600K
Components: Client, Persistent storage
Keys per node: 300K, 300K, 300K, 300K, 300K
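The 600K figure is consistent with an even split in which only the keys that land on the two new nodes move. The slide does not say which hashing scheme the client uses, so the helper below (a hypothetical `remapped_keys`) just reproduces the arithmetic under that assumption.

```python
def remapped_keys(total_keys: int, old_nodes: int, new_nodes: int) -> int:
    """Keys that change owner when keys are spread evenly before and after,
    and keys only ever move onto the newly added nodes."""
    per_node_after = total_keys // new_nodes
    return per_node_after * (new_nodes - old_nodes)


# Going from 3 to 5 nodes with 1.5M keys, as on the slide.
print(remapped_keys(1_500_000, 3, 5))  # 600000
```

All of those keys are remapped at once when the client switches to the new node list.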
![Page 13: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/13.jpg)
Scaling out with Cluster manager
Key count: 1.5M keys
Partition count: 100
Keys/Partition: 15K
Components: Client, Proxy, Persistent storage, Topology and cluster manager
Keys per node: 500K, 500K, 500K
![Page 14: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/14.jpg)
Scaling out with Cluster manager
Key count: 1.5M keys
Partition count: 100
Keys/Partition: 15K
Components: Client, Proxy, Persistent storage, Topology and cluster manager
Keys per node: 500K, 485K, 500K, 15K
![Page 15: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/15.jpg)
Scaling out with Cluster manager
Key count: 1.5M keys
Partition count: 100
Keys/Partition: 15K
Components: Client, Proxy, Persistent storage, Topology and cluster manager
Keys per node: 485K, 485K, 500K, 15K, 15K
![Page 16: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/16.jpg)
Scaling out with Cluster manager - Post balancing
Key count: 1.5M keys
Partition count: 100
Post balancing...
Components: Client, Proxy, Persistent storage, Topology and cluster manager
Keys per node: 250K, 250K, 250K, 250K, 500K
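The preceding slides show partitions moving one at a time, so each step remaps only about 15K keys. The sketch below simulates that with a greedy rule that moves one partition from the fullest node to the emptiest. The greedy rule and the names `rebalance` and `keys_per_node` are assumptions. It balances evenly, which differs from the post-balancing figures on the slide.

```python
from collections import Counter

KEYS_PER_PARTITION = 15_000  # 1.5M keys / 100 partitions, from the slides


def keys_per_node(assignment: dict, nodes: list) -> dict:
    """Approximate key count per node from the partition assignment."""
    counts = Counter(assignment.values())
    return {n: counts[n] * KEYS_PER_PARTITION for n in nodes}


def rebalance(assignment: dict, nodes: list):
    """Yield (partition, src, dst) moves, one partition per step, until node
    loads differ by at most one partition. Mutates `assignment` in place."""
    owned = {n: [] for n in nodes}
    for partition, node in sorted(assignment.items()):
        owned[node].append(partition)
    while True:
        src = max(nodes, key=lambda n: len(owned[n]))
        dst = min(nodes, key=lambda n: len(owned[n]))
        if len(owned[src]) - len(owned[dst]) <= 1:
            return
        partition = owned[src].pop()
        owned[dst].append(partition)
        assignment[partition] = dst
        yield partition, src, dst


# 100 partitions on 3 nodes, then two empty nodes join.
assignment = {p: f"node{p % 3}" for p in range(100)}
nodes = [f"node{i}" for i in range(5)]
moves = list(rebalance(assignment, nodes))
```

Each yielded move is a point where the proxy can be told about a single partition's new owner, so the cache never sees more than one partition's worth of misses at a time.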
![Page 17: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/17.jpg)
Advantages over Client managed partitioning
- Thin client - simple and oblivious to topology
- Clients, proxy layer and backends scale independently
- Pluggable custom load balancing logic through cluster manager
- No cluster downtime during scaling out/up/back
![Page 18: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/18.jpg)
High Availability
![Page 19: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/19.jpg)
High Availability with Replication
Synchronous, best effort
RF = 2, Intra DC
Supports idempotent operations only - get, put, remove, count, scan
Copies of a partition never on the same host and rack
Passive warming for failed/restarted replicas
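"Synchronous, best effort" can be read as: write to both replicas before returning, but do not fail the request when one replica is unavailable. The sketch below implements that reading with in-memory stand-ins. `Replica` and `replicated_put` are hypothetical names, and the success rule (at least one replica acknowledged) is an assumption.

```python
class Replica:
    """In-memory stand-in for one replica of a partition."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.up = True

    def put(self, key: str, value: str) -> None:
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value


def replicated_put(replicas: list, key: str, value: str) -> bool:
    """Write to every replica before returning (synchronous).
    Succeed if at least one replica accepted the write (best effort)."""
    acked = 0
    for replica in replicas:
        try:
            replica.put(key, value)
            acked += 1
        except ConnectionError:
            pass  # a failed replica does not fail the request
    return acked > 0
```

Restricting the API to idempotent operations fits this model: repeating a `put` after a partial failure leaves the replicas in the same state, whereas repeating an increment would not.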
![Page 20: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/20.jpg)
High Availability with Replication
[Diagram] Client -> Proxy/Routing layer, with Topology and Cluster manager
Requests: GetKey in Partition 5, SetKey in partition 5
Pool A: Backend 0, Partition 2,5,9, SERVING
Pool B: Backend N, Partition 12,5,10, SERVING / FAILED
Pool B: Backend N*, Partition 12,5,10, WARMING
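One way to read this diagram is that the replica state decides which requests a backend receives: reads go only to SERVING replicas, while writes also reach WARMING replicas so they fill up passively. That routing rule is inferred from the diagram labels rather than stated on the slide. `State`, `read_targets`, and `write_targets` are illustrative names.

```python
from enum import Enum


class State(Enum):
    SERVING = "SERVING"
    WARMING = "WARMING"
    FAILED = "FAILED"


def read_targets(replica_states: dict) -> list:
    """Replicas eligible to answer a GetKey."""
    return [name for name, s in replica_states.items() if s is State.SERVING]


def write_targets(replica_states: dict) -> list:
    """Replicas that should receive a SetKey, including warming ones."""
    return [
        name
        for name, s in replica_states.items()
        if s in (State.SERVING, State.WARMING)
    ]
```

Under this rule a restarted replica serves no reads until it is promoted from WARMING back to SERVING.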
![Page 21: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/21.jpg)
Current challenges
![Page 22: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/22.jpg)
Remember this?
The most retweeted
Tweet of 2014!
![Page 23: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/23.jpg)
Hot key symptom
Significantly high QPS to a single cache server
![Page 24: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/24.jpg)
Hot Key Mitigation
Server side diagnostics:
Sampling a small % of requests and logging
Post processing the logs to identify high frequency keys
Client side solution:
Client side hot key detection and caching
Better to have:
Redis tracks the hot keys
Protocol support to send feedback to client if a key is hot
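The server-side diagnostic (sample a small share of requests, then post-process for frequent keys) can be sketched in a few lines. The function names, the sampling rate, and the share threshold are assumptions. The real pipeline works on logs rather than in-memory lists.

```python
import random
from collections import Counter


def sample_requests(keys, rate: float, rng: random.Random) -> list:
    """Keep roughly `rate` of the request keys, as a sampler might log them."""
    return [k for k in keys if rng.random() < rate]


def hot_keys(sampled: list, min_share: float) -> list:
    """Return keys whose share of the sampled requests is at least `min_share`."""
    if not sampled:
        return []
    counts = Counter(sampled)
    total = len(sampled)
    return [k for k, c in counts.most_common() if c / total >= min_share]


# Hypothetical traffic: one key receives half of all requests.
traffic = ["tweet:hot" if i % 2 == 0 else f"tweet:{i}" for i in range(10_000)]
sampled = sample_requests(traffic, rate=0.01, rng=random.Random(0))
suspects = hot_keys(sampled, min_share=0.2)
```

A client-side variant could run the same counting over its own outgoing requests and serve the detected keys from a short-lived local cache.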
![Page 25: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/25.jpg)
Active warming of replicas
[Diagram] Client -> Proxy/Routing layer, with Topology and Cluster manager
Pool A: Backend A, Partition 2,5,9, SERVING
Pool B: Backend B*, Partition 12,5,10, WARMING
Labels: writes, Bootstrapper
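The slide lists active warming as a current challenge and does not describe how the Bootstrapper works. As a rough illustration only, the sketch below copies keys from a SERVING replica into a WARMING one while leaving alone any key that live writes have already placed there. The `bootstrap` function and the do-not-overwrite rule are assumptions.

```python
def bootstrap(serving: dict, warming: dict) -> int:
    """Copy keys from a SERVING replica into a WARMING one.
    Keys already present in the warming replica (from live writes) are kept,
    on the assumption that they are at least as fresh. Returns keys copied."""
    copied = 0
    for key, value in list(serving.items()):
        if key not in warming:
            warming[key] = value
            copied += 1
    return copied
```

This ignores deletes and expirations that happen during the copy, which is part of what makes active warming hard in practice.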
![Page 26: RedisConf17- Using Redis at scale @ Twitter](https://reader033.vdocument.in/reader033/viewer/2022042619/5a649b797f8b9a27568b7617/html5/thumbnails/26.jpg)
Questions?