PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) | C* Summit 2016
TRANSCRIPT
PlayStation and Cassandra Streams
Cassandra Summit 2016
Who are we?
Alexander Filipchik (PSN: LaserToy), Principal Software Engineer at Sony Interactive Entertainment
Dustin Pham (PSN: quibfan), Principal Software Engineer at Sony Interactive Entertainment
Agenda
• Multi-regional deployment problem
• Proving C* replication will work for us
• Designing a test system
• Cassandra Streams as a result
How it all started
We want multiple regions and always-on.
A lot of unknowns
• Will it work?
• Will performance degrade?
• How eventual is multi-region eventual consistency?
• Will we hit any roadblocks?
• Well, how many roadblocks will we hit?
What did we know?
Netflix is doing it, and they actually tested it:
• They wrote 1M records in one region of a multi-region cluster
• 500ms later, a read was initiated in the other regions
• All records were successfully read
Well…
Some questions to answer:
• Should we just trust Netflix’s results, replicate data, and see what happens?
• Is their experiment applicable to our situation?
• Can we do better?
Some wants:
• Track replication latencies between regions
• Use close-to-production traffic (both load and data)
• Write/read in all the regions at the same time
• Be able to simulate different disruptions
• Have a reusable system we can use to test our future Cassandra deployments
• Do it in one month
Tracking latencies
To track latencies, we need to record some information when a message arrives on a specific node:
17:06:52 Received from DC1, R1: update KS Test CF test K 1000 C hello Size 76 Timestamp 1456333612729000 at 1456333612735000. Diff is: 6000
17:06:53 Received from DC2, R1: update KS Test CF test K 1000 C hello Size 76 Timestamp 1456333613344000 at 1456333613345000. Diff is: 10000
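The “Diff” in those log lines is simply the arrival time minus the mutation’s write timestamp, both in microseconds since the epoch. A minimal sketch of that calculation, using the numbers from the first log line (the class and method names here are illustrative, not Cassandra’s actual API):

```java
// Hypothetical helper for the replication-lag "Diff" shown in the log lines
// above. Cassandra mutation timestamps are microseconds since the epoch, so
// the lag is just receive time minus write time.
public final class ReplicationLag {

    /** Lag in microseconds between when a mutation was written and when it arrived. */
    static long diffMicros(long writeTimestampMicros, long receiveTimestampMicros) {
        return receiveTimestampMicros - writeTimestampMicros;
    }

    public static void main(String[] args) {
        // Values taken from the first log line above:
        long writtenAt  = 1456333612729000L; // client write timestamp (micros)
        long receivedAt = 1456333612735000L; // arrival time on the remote node (micros)
        System.out.println("Diff is: " + diffMicros(writtenAt, receivedAt)); // Diff is: 6000
    }
}
```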
Using real data
• We need something we can use as a buffer, so we can store prod-size data in there and then replay it when we want
Results: bad idea.
We need a way to store all the latencies, and something to analyze the results.
Putting everything together
[Architecture diagram. Preparation: an Exporter reads production data over Thrift and writes JSON; per-region Ingesters load it into Region 1 and Region 2 via Thrift/CQL. Test: a Read/Write Loader runs against each region (Region 1, Region 2). Analysis: results from both regions are collected for analysis.]
How did we extract latencies?
Just injected code here and there.
[Diagram: hooks injected along the write path (StorageProxy, Messaging Service, Keyspace, CommitLog, Memtable, etc.) that store context info and fire async events on write.]
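The “inject code here and there” idea can be sketched roughly as follows: each instrumented point on the write path hands its event to a background executor, so measurement never blocks the write itself. All names here (WritePathTap, TapEvent, and so on) are hypothetical; this is not Cassandra’s actual instrumentation code.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// One event per instrumented point on the write path.
final class TapEvent {
    final String point;          // e.g. "StorageProxy", "CommitLog"
    final long timestampMicros;  // when the event was fired
    TapEvent(String point, long timestampMicros) {
        this.point = point;
        this.timestampMicros = timestampMicros;
    }
}

final class WritePathTap {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    final Queue<TapEvent> sink = new ConcurrentLinkedQueue<>();

    /** Called from instrumented points; hands the event off asynchronously. */
    void fire(String point) {
        long nowMicros = System.currentTimeMillis() * 1000; // coarse micros
        pool.submit(() -> sink.add(new TapEvent(point, nowMicros)));
    }

    /** Drain the executor so all pending events land in the sink. */
    void shutdown() {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

public final class TapDemo {
    public static void main(String[] args) {
        WritePathTap tap = new WritePathTap();
        tap.fire("StorageProxy");
        tap.fire("CommitLog");
        tap.shutdown();
        System.out.println(tap.sink.size() + " events captured"); // 2 events captured
    }
}
```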
Example of results
[Chart: latency during a two-DC connection cut-off and recovery, plotted on a logarithmic scale from 10 to 10,000,000; series shown are Pct95, Pct99, Pct999, and MaxLag.]
Looking at the Bigger Picture
What now?
• The information gathered through the tests was extremely useful, but also not easily reachable in Cassandra’s current state
• Could we somehow ‘tap’ off of Cassandra’s data streams?
Cassandra Streams
[Diagram: Cassandra data streams feeding queues, logs, and metrics.]
Why????
• Why not use triggers?
• Why not put data routing ahead of Cassandra?
• Wouldn’t this cause a performance impact?
• Wouldn’t this result in data bloat somewhere else?
Knowing what happens at different points can power different use cases
[Diagram: bits flowing out of each stage of the write path (message, storage, keyspace, commit log, memtable, etc.) into a stream.]
Use Cases
• Building personalized search indices
• In-place migrations at the data tier level
• Cache invalidation
• Building analytic views
• Transforming data into read-optimized views
• Smart backups
• Disabling hints, and providing alternative mechanisms
• Providing more failure-handling possibilities
• Production-level tests (stress tests)
Tap flow: In-place Migration
[Diagram: tap events from the write path feed consumers that read Schema A, transform, and write Schema B.]
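A hedged sketch of that migration flow: a consumer drains tap events written in one schema, transforms them, and writes them out in another. Every type and schema here is invented for illustration (schema A as flat `name:value` strings, schema B as key/value rows); none of it comes from the talk.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-place migration consumer: read schema A, transform,
// write schema B. Schemas here are made up for illustration.
public final class MigrationConsumer {

    // Schema A: a flat "name:value" string. Schema B: a structured key/value row.
    static Map.Entry<String, String> transform(String recordA) {
        String[] parts = recordA.split(":", 2);
        return Map.entry(parts[0], parts[1]);
    }

    /** Consume a batch of schema-A tap events and produce schema-B rows. */
    static Map<String, String> migrate(List<String> tapEvents) {
        return tapEvents.stream()
                .map(MigrationConsumer::transform)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<String> tap = List.of("user1:hello", "user2:world");
        System.out.println(migrate(tap).size() + " rows migrated to schema B"); // 2 rows
    }
}
```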
Tap flow: Alternative Failure Handling
[Diagram: on failure, a replay log built from tap events is used to recover instead of hints.]
Hints can cause Cassandra to die faster.
Tap flow: Production load test
[Diagram: tap events from the write path feed consumers that replay the traffic as production-level load tests.]
Formalizing the previous Cassandra multi-regional latency tests into the ‘Streams’ framework
High-level framework
• Cassandra configuration to enable ‘Streams’ per keyspace
  – Tap hooks (e.g. after StorageProxy => topic)
  – Sampling/throttling/circuit-breaking capability
  – Request Log mode (not recommended) / Kafka mode
• Common interfaces for consumers, with common reference implementations
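The talk only hints at what that common consumer interface looks like; the following is a speculative sketch with invented names, paired with a trivial counting reference implementation.

```java
// Speculative sketch of a "common consumer interface" for Cassandra Streams.
// None of these names come from the talk or from Cassandra itself.
public final class StreamConsumerDemo {

    interface StreamConsumer {
        /** Which keyspace this consumer taps. */
        String keyspace();

        /** Handle one tapped mutation; should be fast and non-blocking. */
        void onEvent(byte[] mutation, long timestampMicros);
    }

    /** A trivial reference implementation that just counts events. */
    static final class CountingConsumer implements StreamConsumer {
        int seen = 0;
        public String keyspace() { return "Test"; }
        public void onEvent(byte[] mutation, long timestampMicros) { seen++; }
    }

    public static void main(String[] args) {
        CountingConsumer c = new CountingConsumer();
        c.onEvent(new byte[] {1, 2}, 1456333612729000L);
        c.onEvent(new byte[] {3},    1456333613344000L);
        System.out.println(c.seen + " events seen by " + c.keyspace()); // 2 events seen by Test
    }
}
```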
Still a W.I.P.
• The ‘Cassandra Streams’ for our Cassandra clusters is still a W.I.P., and is used only for measurement/analysis
• Introducing a tap off of the write path introduces a new set of complexity:
  – Consistency
  – Paxos
  – Etc.
• However, depending on the use case, it is a useful tool that can be enabled and disabled via configuration
PlayStation is hiring:
hackitects.com