![Page 1: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/1.jpg)
Time is Money
Financial Time Series
Jake Luciani and Carl Yeksigian
BlueMountain Capital
![Page 2: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/2.jpg)
About this talk
Part 1: Our use case and architecture
Part 2: Our deployment and tuning
Part 3: Q&A
![Page 3: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/3.jpg)
Know your problem.
1000s of consumers
..creating and reading data as fast as possible
..consistent to all readers
..and handle ad-hoc user queries
..quickly
..across data centers.
![Page 4: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/4.jpg)
Know your data.
AAPL price
MSFT price
![Page 5: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/5.jpg)
Know your queries.
Time Series Query
Start, End, Periodicity defines query
1 minute periods
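A minimal Python sketch of how Start, End and Periodicity can define a time series query. The function names and the "carry the last observation forward" sampling rule are assumptions for illustration; the slides do not say how values are aligned to the period grid.

```python
from datetime import datetime, timedelta


def sample_times(start, end, period):
    """Return the timestamps from start to end (inclusive), one per period."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += period
    return times


def sample_series(points, start, end, period):
    """For each sample time, take the most recent observation at or before it.

    points is a list of (timestamp, value) pairs sorted by timestamp.
    Sample times with no earlier observation yield None.
    (Assumed sampling rule, not taken from the talk.)
    """
    result = []
    i = 0
    last = None
    for t in sample_times(start, end, period):
        # Advance through observations that happened at or before t.
        while i < len(points) and points[i][0] <= t:
            last = points[i][1]
            i += 1
        result.append((t, last))
    return result
```

With 1 minute periods, `sample_series(points, start, end, timedelta(minutes=1))` yields one value per minute between start and end.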
![Page 6: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/6.jpg)
Know your queries.
Cross Section Query
As Of time defines the query
As Of Time (11am)
![Page 7: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/7.jpg)
Know your queries.
Cross sections are random.
Storing every possible cross section is not feasible.
We also support bi-temporality.
![Page 8: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/8.jpg)
Let's optimize for Time Series.
![Page 9: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/9.jpg)
Data Model (CQL 3)
CREATE TABLE tsdata (
    id blob,
    property text,
    asof_ticks bigint,
    knowledge_ticks bigint,
    value blob,
    PRIMARY KEY (id, property, asof_ticks, knowledge_ticks)
) WITH COMPACT STORAGE
  AND CLUSTERING ORDER BY (property ASC, asof_ticks DESC, knowledge_ticks DESC);
![Page 10: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/10.jpg)
CQL3 Queries: Time Series
SELECT * FROM tsdata
WHERE id = 0x12345
  AND property = 'lastPrice'
  AND asof_ticks >= 1234567890
  AND asof_ticks <= 2345678901;
![Page 11: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/11.jpg)
CQL3 Queries: Cross Section
SELECT * FROM tsdata
WHERE id = 0x12345
  AND property = 'lastPrice'
  AND asof_ticks = 1234567890
  AND knowledge_ticks < 2345678901
LIMIT 1;
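A small Python sketch of what the cross section query appears to select: with knowledge_ticks clustered in descending order, LIMIT 1 returns the newest version known before the cutoff. The function name and the in-memory row tuples are hypothetical; this emulates the query semantics and does not talk to Cassandra.

```python
def cross_section_value(rows, asof_ticks, knowledge_cutoff):
    """Emulate the cross section query for one (id, property) pair.

    rows is an iterable of (asof_ticks, knowledge_ticks, value) tuples.
    Returns the value with the greatest knowledge_ticks below the cutoff
    at the given asof_ticks, or None if there is no such row.
    """
    candidates = [
        (k, v) for (a, k, v) in rows
        if a == asof_ticks and k < knowledge_cutoff
    ]
    if not candidates:
        return None
    # knowledge_ticks DESC plus LIMIT 1 means "newest first, take one".
    return max(candidates, key=lambda kv: kv[0])[1]
```

Moving the knowledge cutoff back in time returns the value as it was known then, which is the bi-temporal part of the model.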
![Page 12: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/12.jpg)
A Service, not an app
[Diagram: many App clients, several Olympus nodes, and C*. Labels: App, Olympus, Olympus Thrift Service, Fat Client, C*.]
![Page 13: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/13.jpg)
Complex Value Types
Not every value is a double
Some values belong together (Bid and Ask should always come back together)
Thrift structures as values
Typed, extensible schema
Union types give us a way to deserialize any type
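A rough Python sketch of the union idea: a wrapper in which exactly one typed field is set, so a reader can find out which type it holds. The `Quote` and `Value` names and their fields are hypothetical and are not the Thrift definitions used in the talk.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Quote:
    """Values that belong together travel in one structure."""
    bid: float
    ask: float


@dataclass(frozen=True)
class Value:
    """Union-style wrapper: exactly one field must be set."""
    double_value: Optional[float] = None
    string_value: Optional[str] = None
    quote_value: Optional[Quote] = None

    def __post_init__(self):
        set_fields = [f.name for f in fields(self)
                      if getattr(self, f.name) is not None]
        if len(set_fields) != 1:
            raise ValueError("exactly one field must be set, got %r" % (set_fields,))

    def which(self):
        """Return the name of the field that is set."""
        for f in fields(self):
            if getattr(self, f.name) is not None:
                return f.name

    def get(self):
        """Return the wrapped value, whatever its type."""
        return getattr(self, self.which())
```

A reader can call `which()` to dispatch on the stored type without knowing it in advance.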
![Page 14: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/14.jpg)
Ad-hoc querying UI
![Page 15: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/15.jpg)
But that's the easy part...
(cue transition)
![Page 16: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/16.jpg)
Scaling...
The first rule of scaling is you do not just turn everything to 11.
![Page 17: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/17.jpg)
Scaling...
Step 1 - Fast Machines for your workload
Step 2 - Avoid Java GC for your workload
Step 3 - Tune Cassandra for your workload
Step 4 - Prefetch and cache for your workload
![Page 18: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/18.jpg)
Can't fix what you can't measure
Riemann (http://riemann.io)
Easily push application and system metrics into a single system
We push 6k metrics per second to a single Riemann instance
![Page 19: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/19.jpg)
Metrics: Riemann
Yammer Metrics with Riemann
https://gist.github.com/carlyeks/5199090
![Page 20: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/20.jpg)
Metrics: Riemann
Push-based, stream-based metrics library
Riemann Dash for "Why is it slow?"
Graphite for "Why was it slow?"
![Page 21: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/21.jpg)
VisualVM: The greatest tool EVER
Many useful plugins...
Just start jstatd on each server and go!
![Page 22: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/22.jpg)
Scaling Reads: Machines
SSDs for hot data
JBOD config
As many cores as possible (> 16)
10GbE network
Bonded network cards
Jumbo frames
![Page 23: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/23.jpg)
JBOD is a lifesaver
SSDs are great until they aren't anymore
JBOD allowed passive recovery in the face of simultaneous disk failures (the SSDs had bad firmware)
![Page 24: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/24.jpg)
Scaling Reads: Cassandra
Changes we've made:
• Configuration
• Compaction
• Compression
![Page 25: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/25.jpg)
Leveled Compaction
Wide rows mean data can be spread across a huge number of SSTables
Leveled Compaction puts a bound on the worst case (*)
Fewer SSTables to read means lower latency
[Diagram: levels L0 to L5; the SSTables that get read are shown in orange]
* In Theory
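A simplified Python sketch of why leveling bounds the read cost. It assumes L0 SSTables may overlap each other while SSTables within each higher level cover disjoint key ranges, so a key can hit every L0 file but at most one file per higher level. The `(name, first_key, last_key)` tuples are a hypothetical stand-in for real SSTable metadata.

```python
def sstables_to_read(levels, key):
    """Return the names of SSTables whose key range may contain key.

    levels[0] is L0, where ranges may overlap; in L1 and above the
    ranges within a level are assumed not to overlap.
    Each SSTable is a (name, first_key, last_key) tuple.
    """
    hits = []
    for level in levels:
        for name, first, last in level:
            if first <= key <= last:
                hits.append(name)
    return hits


def worst_case_reads(levels):
    """Upper bound on SSTables read: all of L0 plus one per higher level."""
    return len(levels[0]) + (len(levels) - 1)
```

The bound grows with the number of L0 files, so it only stays small while L0 itself stays small.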
![Page 26: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/26.jpg)
Leveled Compaction: Breaking Bad
Under high write load, reads are forced to read all of the L0 files
[Diagram: levels L0 to L5]
![Page 27: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/27.jpg)
Hybrid Compaction: Breaking Better
Size Tiering Level 0
On by default in 2.0
[Diagram: levels L0 to L5, with a brace labeled Hybrid Compaction spanning the Size Tiered and Leveled parts]
![Page 28: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/28.jpg)
Overlapping Compaction
Instead of forcing a combination of L0 files with L1, we can just push up files
This allows a higher level of concurrency in compactions
We still know the SSTables that might contain the keys
We can force a proper compaction at any configurable level
[Diagram: levels L0 to L5]
![Page 29: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/29.jpg)
C optimized library
Read path needs to be fast for our workload
CRC check and composite comparison eat a lot of cycles
CRC is implemented on chip for some architectures (why not use it?)
We want to move some of the operations into a JNI library to reduce latency and improve throughput
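To make the CRC check concrete, here is a minimal Python sketch of the kind of per-read checksum verification that costs cycles on the read path. The framing (payload followed by a 4-byte big-endian CRC32) is an assumption for illustration and is not Cassandra's on-disk format; `zlib.crc32` is itself backed by C code, which loosely mirrors the idea of moving the hot loop out of the managed runtime.

```python
import zlib


def frame_with_crc(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC32 of payload (hypothetical framing)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify_frame(frame: bytes) -> bytes:
    """Return the payload if its trailing CRC32 matches, else raise ValueError."""
    if len(frame) < 4:
        raise ValueError("frame too short")
    payload = frame[:-4]
    stored = int.from_bytes(frame[-4:], "big")
    # This comparison runs on every read, which is why its cost matters.
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch")
    return payload
```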
![Page 30: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/30.jpg)
Current Stats
16 nodes
2 Data Centers
Replication Factor 6
200k Writes/sec at EACH_QUORUM
150k Reads/sec at LOCAL_QUORUM
More than 30 Million time series
More than 15 Billion points
10 TB on disk (compressed)
Read Latency 50%/95% is 1ms/5ms
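A short Python sketch of the quorum arithmetic behind these consistency levels. A quorum is floor(replicas / 2) + 1. The split of Replication Factor 6 into 3 replicas in each of the 2 data centers is an assumption, as are the data center names; the slides only give the total.

```python
def quorum(replicas: int) -> int:
    """Quorum size for a replica count: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


def acks_required(consistency, replicas_per_dc, local_dc):
    """Return the acknowledgements needed per data center.

    replicas_per_dc maps a data center name to its replica count
    (hypothetical names; the 3 + 3 split below is assumed).
    """
    if consistency == "LOCAL_QUORUM":
        return {local_dc: quorum(replicas_per_dc[local_dc])}
    if consistency == "EACH_QUORUM":
        return {dc: quorum(n) for dc, n in replicas_per_dc.items()}
    raise ValueError("unsupported consistency level: %s" % consistency)
```

Under the assumed 3 + 3 split, an EACH_QUORUM write waits for 2 replicas in each data center, and a LOCAL_QUORUM read waits for 2 replicas in the local one. Any 2 of 3 local replicas overlap on at least one node, so a local read sees the latest acknowledged write.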
![Page 31: C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian](https://reader033.vdocument.in/reader033/viewer/2022042714/554f43d5b4c905423f8b474c/html5/thumbnails/31.jpg)
Questions?
Thank you!
@tjake and @carlyeks