![Page 1: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/1.jpg)
October 18, 2013 #CassandraEU
Playlists at Spotify
Using Cassandra to store version controlled objects at large scale
Jimmy Mårdell <[email protected]>
![Page 2: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/2.jpg)
Intro
About me
• Jimmy Mårdell
• Software Engineer
• 3 years at Spotify
![Page 3: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/3.jpg)
About Spotify
• 24 million active users
  – 6 million paying subscribers
• 4 000 servers in 4 data centers
• Over 1 billion playlists created
![Page 4: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/4.jpg)
Contents
• Why version control?
• Playlists at Spotify
• Cassandra data model
• Lessons learned
![Page 5: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/5.jpg)
Why version control?
What is version control?
• “Version control is the management of changes to documents” (Wikipedia)
• Stand-alone (most common)
  – Git, Subversion, etc.
• Embedded
  – Google Docs
![Page 6: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/6.jpg)
Embedded usage
• Collaborative editing
• Undo functionality
• Performance
• Business logic depends on document history
![Page 7: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/7.jpg)
Playlists at Spotify
Playlists
![Page 8: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/8.jpg)
![Page 9: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/9.jpg)
Playlist challenges
• More than 1 billion playlists
• >40 000 requests/second at peak
• Offline mode
• Concurrent changes
![Page 10: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/10.jpg)
Playlist client-server
• Every playlist is a version-controlled object
• All playlists are synced on login
  – Fetch all new changes
![Page 11: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/11.jpg)
Playlist client-server
• Local queue of playlist modifications
  – Clients optimistically accept changes, which keeps the UI fast
• Queue flushed to server when possible
  – Offline changes
  – Fault tolerant
![Page 12: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/12.jpg)
Playlist version control
Representation of a playlist in the backend:
• 0,ROOT
• 1,4ed2...: ADD(ix=0, track=A,B,C) → ABC
• 2,19ca...: MOV(from=2, to=1, len=1) → ACB
• 3,038f...: REM(from=2, len=1) → AC
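The revision chain above can be replayed to rebuild the playlist. The following is a minimal sketch, not Spotify's implementation: the class names `Add`, `Mov`, `Rem` and the helpers `apply_op` and `replay` are invented for illustration, and the meaning of `to` in MOV (an index into the list after the moved block has been taken out) is an assumption that happens to reproduce the states shown on the slide.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Add:
    ix: int
    tracks: Tuple[str, ...]


@dataclass(frozen=True)
class Mov:
    from_: int
    to: int
    length: int


@dataclass(frozen=True)
class Rem:
    from_: int
    length: int


Op = Union[Add, Mov, Rem]


def apply_op(playlist: List[str], op: Op) -> List[str]:
    """Return a new playlist with one change applied."""
    items = list(playlist)
    if isinstance(op, Add):
        items[op.ix:op.ix] = op.tracks
    elif isinstance(op, Mov):
        moved = items[op.from_:op.from_ + op.length]
        del items[op.from_:op.from_ + op.length]
        # Assumption: `to` indexes the list after the moved block is removed.
        items[op.to:op.to] = moved
    elif isinstance(op, Rem):
        del items[op.from_:op.from_ + op.length]
    return items


def replay(ops: List[Op]) -> List[str]:
    """Rebuild a playlist by applying every change, starting from ROOT."""
    playlist: List[str] = []
    for op in ops:
        playlist = apply_op(playlist, op)
    return playlist


changes = [Add(0, ("A", "B", "C")), Mov(2, 1, 1), Rem(2, 1)]
print(replay(changes))
```

Replaying all three changes passes through the intermediate states ABC and ACB before ending at AC.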
![Page 13: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/13.jpg)
Playlist branching
• Concurrent changes
  – Offline
[Figure: branches A and B]
![Page 14: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/14.jpg)
Playlist branching
• Concurrent changes
  – Offline
• Conflict resolution
  – Operational Transformation
• Clients oblivious of branches
[Figure: branches A and B, transformed changes A’ and B’, merge]
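To illustrate the idea behind Operational Transformation, here is a textbook-style sketch for two concurrent insertions only. It is not the algorithm from the talk: the `Add` class, the `transform` function and the tie-breaking flag are assumptions made for the example. The point is that applying A then B’ gives the same playlist as applying B then A’.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Add:
    ix: int
    tracks: Tuple[str, ...]


def apply_add(playlist: List[str], op: Add) -> List[str]:
    return playlist[:op.ix] + list(op.tracks) + playlist[op.ix:]


def transform(op: Add, against: Add, op_wins_ties: bool) -> Add:
    """Rewrite `op` so that it can be applied after `against`.

    If `against` inserted at or before the position `op` targets, shift `op`
    to the right by the number of inserted tracks. Ties on the same index are
    broken by `op_wins_ties` so that both sides agree on the final order.
    """
    shift = op.ix > against.ix or (op.ix == against.ix and not op_wins_ties)
    return Add(op.ix + len(against.tracks), op.tracks) if shift else op


base = ["A", "C"]
a = Add(1, ("X",))  # change made on one branch
b = Add(2, ("Y",))  # concurrent change made on another branch

a_prime = transform(a, b, op_wins_ties=True)
b_prime = transform(b, a, op_wins_ties=False)

via_a = apply_add(apply_add(base, a), b_prime)
via_b = apply_add(apply_add(base, b), a_prime)
print(via_a, via_b)
```

A full implementation would also need transforms for every pair of ADD, MOV and REM, which is where most of the complexity lies.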
![Page 15: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/15.jpg)
Cassandra data model
![Page 16: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/16.jpg)
Cassandra at Spotify
• Playlist was the first system to use Cassandra
  – Now we use it a lot...
• Started with Cassandra 0.7
• Using a limited set of Cassandra features
  – No super columns
  – No CQL
![Page 17: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/17.jpg)
Planning a data model
• Start with the queries!
• Three common playlist queries
  – SYNC: Get all changes since a particular revision
  – GET: Get the most recent snapshot
  – APPEND: Add/move/delete tracks
![Page 18: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/18.jpg)
Playlist data model
CF playlist_change
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 1,4ed2...: parent=0,ROOT, op=ADD(ix=0, track=A,B,C)
• 2,19ca...: parent=1,4ed2..., op=MOV(from=2, to=1, len=1)
• 3,038f...: parent=2,19ca..., op=REM(from=2, len=1)
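One way to picture this wide row is as a map from column name (the revision) to its parent and operation. The sketch below is an in-memory stand-in, not a Cassandra client call: `playlist_change` is a plain dict, `sync` is a hypothetical helper, and it assumes columns sort by sequence number and ignores branches for simplicity. Under those assumptions, a SYNC query becomes a slice of the row after the client's revision.

```python
from typing import Dict, List, Tuple

Revision = Tuple[int, str]  # (sequence number, hash prefix)

# In-memory stand-in for the playlist_change column family.
playlist_change: Dict[str, Dict[Revision, dict]] = {
    "spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA": {
        (1, "4ed2"): {"parent": (0, "ROOT"), "op": "ADD(ix=0, track=A,B,C)"},
        (2, "19ca"): {"parent": (1, "4ed2"), "op": "MOV(from=2, to=1, len=1)"},
        (3, "038f"): {"parent": (2, "19ca"), "op": "REM(from=2, len=1)"},
    }
}


def sync(row_key: str, since: Revision) -> List[Tuple[Revision, str]]:
    """Return the changes stored after `since`, in column order."""
    row = playlist_change.get(row_key, {})
    return [(rev, row[rev]["op"]) for rev in sorted(row) if rev[0] > since[0]]


key = "spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA"
for revision, op in sync(key, since=(1, "4ed2")):
    print(revision, op)
```

Because all changes for a playlist live in one row, a client that is only a few revisions behind reads a short contiguous range.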
![Page 19: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/19.jpg)
Playlist data model
CF playlist_change
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 1,4ed2...: parent=0,ROOT, op=ADD(ix=0, track=A,B,C)
• 2,19ca...: parent=1,4ed2..., op=MOV(from=2, to=1, len=1)
• 3,038f...: parent=2,19ca..., op=REM(from=2, len=1)
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• 1,8a20...: prnt=0,ROOT, op=...
• 2,b783...: prnt=1,8a20..., op=...
• 2,dd07...: prnt=1,8a20..., op=...
• 3,39ef...: prnt=2,dd07..., op=...
• 3,5a9c...: prnt=2,b783..., op=...
• 4,03fc...: prnt=3,39ef..., prnt=3,5a9c...
![Page 20: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/20.jpg)
Playlists in Cassandra
• Which revision is the latest?
  – Changes with no children
• Multiple heads possible!
  – Heads may appear anywhere within the row
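Finding the heads from playlist_change alone means scanning the whole row for revisions that no other change names as a parent. A minimal sketch, using the second example row from the data model slide; `find_heads` is a hypothetical helper, and the parent lists (in particular that the merge change 4,03fc has parents 3,39ef and 3,5a9c) are my reading of the slide:

```python
from typing import Dict, List


def find_heads(row: Dict[str, List[str]]) -> List[str]:
    """Return the revisions that are not the parent of any other revision."""
    parents = {parent for parent_list in row.values() for parent in parent_list}
    return sorted(revision for revision in row if revision not in parents)


# Two clients changed the playlist concurrently: two heads.
branched = {
    "1,8a20": ["0,ROOT"],
    "2,b783": ["1,8a20"],
    "2,dd07": ["1,8a20"],
}

# Both branches continued and were then merged by 4,03fc: one head.
merged = dict(branched)
merged.update({
    "3,39ef": ["2,dd07"],
    "3,5a9c": ["2,b783"],
    "4,03fc": ["3,39ef", "3,5a9c"],
})

print(find_heads(branched))
print(find_heads(merged))
```

The scan touches every column in the row, which suggests why keeping the heads in a separate, much smaller structure is attractive.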
![Page 21: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/21.jpg)
Playlist data model
CF playlist_change
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 1,4ed2...: prnt=0,ROOT, op=...
• 2,19ca...: prnt=1,4ed2..., op=...
• 3,038f...: prnt=2,19ca..., op=...
CF playlist_head
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 3,038f...
![Page 22: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/22.jpg)
Playlist data model
CF playlist_change
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 1,4ed2...: prnt=0,ROOT, op=...
• 2,19ca...: prnt=1,4ed2..., op=...
• 3,038f...: prnt=2,19ca..., op=...
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• 1,8a20...: prnt=0,ROOT, op=...
• 2,b783...: prnt=1,8a20..., op=...
• 2,dd07...: prnt=1,8a20..., op=...
CF playlist_head
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 3,038f...
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• 2,b783...
• 2,dd07...
![Page 23: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/23.jpg)
Playlist data model
CF playlist_change
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 1,4ed2...: prnt=0,ROOT, op=...
• 2,19ca...: prnt=1,4ed2..., op=...
• 3,038f...: prnt=2,19ca..., op=...
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• 1,8a20...: prnt=0,ROOT, op=...
• 2,b783...: prnt=1,8a20..., op=...
• 2,dd07...: prnt=1,8a20..., op=...
• 3,39ef...: prnt=2,dd07..., op=...
• 3,5a9c...: prnt=2,b783..., op=...
• 4,03fc...: prnt=3,39ef..., prnt=3,5a9c...
CF playlist_head
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 3,038f...
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• 4,03fc...
![Page 24: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/24.jpg)
Playlist heads
• playlist_head is a small CF
  – Fits in RAM
• 95% of playlist requests only read from playlist_head
  – Most playlists are already up-to-date
![Page 25: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/25.jpg)
Playlist snapshots
• playlist_change works well when syncing playlists
• Not so well for fetching new playlists
  – Snapshot cache
![Page 26: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/26.jpg)
Playlist data model
CF playlist_change
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• 1,4ed2...: prnt=0,ROOT, op=...
• 2,19ca...: prnt=1,4ed2..., op=...
• 3,038f...: prnt=2,19ca..., op=...
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• 1,8a20...: prnt=0,ROOT, op=...
• 2,b783...: prnt=1,8a20..., op=...
• 2,dd07...: prnt=1,8a20..., op=...
CF playlist_snapshot
Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• cache: version=3,038f..., contents=A,C
Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
• cache: version=2,b783..., contents=...
![Page 27: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/27.jpg)
Updating playlists
• Validate change
  – Locate snapshot
  – Client may append to old version
• Update all tables
  – playlist_head last
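The write ordering can be sketched with three in-memory dicts standing in for the column families. Everything here is hypothetical (the `PlaylistStore` class, its `append` signature, the made-up revision `2,ffff`), and merging of branches is left out. The presumed reason for writing playlist_head last is that a reader who sees a head can then always find the change it points to.

```python
from typing import Dict, List, Set


class PlaylistStore:
    """In-memory stand-in for the three column families."""

    def __init__(self) -> None:
        self.playlist_change: Dict[str, Dict[str, dict]] = {}
        self.playlist_snapshot: Dict[str, dict] = {}
        self.playlist_head: Dict[str, Set[str]] = {}

    def append(self, row_key: str, parent: str, revision: str,
               op: str, new_contents: List[str]) -> None:
        # Validate: the parent must be ROOT or a revision we already store.
        # It does not have to be a current head (client may be on an old version).
        known = self.playlist_change.get(row_key, {})
        if parent != "0,ROOT" and parent not in known:
            raise ValueError(f"unknown parent revision {parent}")

        # 1. Record the change itself.
        self.playlist_change.setdefault(row_key, {})[revision] = {
            "parent": parent, "op": op}
        # 2. Refresh the snapshot cache (branch merging not modelled here).
        self.playlist_snapshot[row_key] = {
            "version": revision, "contents": list(new_contents)}
        # 3. Update the heads last: the parent stops being a head, if it was one.
        heads = self.playlist_head.get(row_key, set())
        self.playlist_head[row_key] = (heads - {parent}) | {revision}


store = PlaylistStore()
key = "spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA"
store.append(key, "0,ROOT", "1,4ed2", "ADD(ix=0, track=A,B,C)", ["A", "B", "C"])
store.append(key, "1,4ed2", "2,19ca", "MOV(from=2, to=1, len=1)", ["A", "C", "B"])
# A client that is still on revision 1 appends: this creates a second head.
store.append(key, "1,4ed2", "2,ffff", "REM(from=0, len=1)", ["B", "C"])
print(sorted(store.playlist_head[key]))
```

Appending to a revision that is no longer a head is accepted and simply leaves the playlist with two heads until the branches are merged.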
![Page 28: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/28.jpg)
Cassandra consistency levels
• Replication factor 3
• All writes using CL_QUORUM
• Reads from playlist_head
  – CL_QUORUM
• Reads from playlist_change and playlist_snapshot
  – CL_ONE but may fall back to CL_QUORUM
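The arithmetic behind these choices: with replication factor 3 a quorum is 2 replicas, and a read is guaranteed to overlap the latest write only when the number of replicas written plus the number read exceeds the replication factor. The sketch below checks this and shows one possible shape of a fallback read. The callables `read` and `is_complete` are hypothetical, and what exactly triggers the fallback is not stated in the talk.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority."""
    return replication_factor // 2 + 1


def overlaps(replication_factor: int, written: int, read: int) -> bool:
    """True if every read set must share a replica with every write set."""
    return written + read > replication_factor


RF = 3
Q = quorum(RF)

print("QUORUM write + QUORUM read overlap:", overlaps(RF, Q, Q))
print("QUORUM write + ONE read overlap:", overlaps(RF, Q, 1))


def read_with_fallback(read: Callable[[str], T],
                       is_complete: Callable[[T], bool]) -> T:
    """Try a cheap CL_ONE read first, retry at CL_QUORUM if data is missing."""
    result = read("ONE")
    if not is_complete(result):
        result = read("QUORUM")
    return result
```

A CL_ONE read may hit the one replica that has not yet received a write, so the cheap read can return stale or missing data and the quorum retry covers that case.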
![Page 29: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/29.jpg)
Lessons learned
![Page 30: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/30.jpg)
Optimizations
• Leveled compaction
  – Improved performance a lot
• Compression
  – Not as impressive
  – CRC checks
![Page 31: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/31.jpg)
Optimizations
• Trusted the Linux page cache to keep playlist_head in RAM
  – Didn’t work
• Tried Cassandra row cache
  – NO!
• mlock to the rescue
![Page 32: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/32.jpg)
An enterprise ready solution
bash# while true; do vmtouch -m 10000000000 -l *head* & sleep 10m; kill %vmtouch; done
![Page 33: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/33.jpg)
No moving parts
• Flash disks are awesome
• Reduced size of cluster from 60 to 30 nodes
  – Thanks FusionIO!
• IOPS no longer the bottleneck
![Page 34: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/34.jpg)
Tombstone hell
• Noticed requests to playlist_head took several seconds
  – Huh?
• Every change causes a value to be deleted in playlist_head
• playlist_head is essentially a queue
  – Well-known anti-pattern
![Page 35: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/35.jpg)
Tombstone hell
• We had rows with >500,000 tombstones
• Solution: major compaction
  – Relatively fast since playlist_head is in RAM
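Why a queue-like row gets slow can be shown with a toy model. This is a deliberate simplification and not how Cassandra stores data internally: the `HeadRow` class and its methods are invented, and real tombstones are only purged once `gc_grace_seconds` has passed. The model only captures that a delete leaves a marker behind and that a read has to step over every marker until compaction removes it.

```python
from typing import Dict, List, Optional, Tuple

TOMBSTONE = object()  # marker left behind by a delete


class HeadRow:
    """Toy model of one playlist_head row."""

    def __init__(self) -> None:
        self.cells: Dict[str, object] = {}

    def replace_head(self, old: Optional[str], new: str) -> None:
        """Each playlist change deletes the old head and inserts the new one."""
        if old is not None:
            self.cells[old] = TOMBSTONE
        self.cells[new] = b""

    def read_heads(self) -> Tuple[List[str], int]:
        """Return the live heads and how many cells had to be scanned."""
        live = [name for name, value in self.cells.items()
                if value is not TOMBSTONE]
        return live, len(self.cells)

    def major_compaction(self) -> None:
        """Drop the tombstones (the toy model ignores the grace period)."""
        self.cells = {name: value for name, value in self.cells.items()
                      if value is not TOMBSTONE}


row = HeadRow()
previous = None
for seq in range(1, 1001):
    current = f"{seq},rev"
    row.replace_head(previous, current)
    previous = current

heads, scanned_before = row.read_heads()
row.major_compaction()
_, scanned_after = row.read_heads()
print(heads, scanned_before, scanned_after)
```

After 1,000 changes the row holds one live head, yet a read scans 1,000 cells. After compaction it scans one.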
![Page 36: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/36.jpg)
And more...
• Large rows in playlist_change
  – Modify version graph
• Reduce number of requests
  – Group playlists by owner
Sounds interesting? We’re hiring!
![Page 37: Playlists at Spotify - Using Cassandra to store version controlled objects](https://reader036.vdocument.in/reader036/viewer/2022081800/587138ee1a28abf0568b64c5/html5/thumbnails/37.jpg)
Questions?