Cassandra Materialized Views

Post on 17-Feb-2017


Materialized Views

Carl Yeksigian

What are Materialized Views?

• Two copies of the data using different partitioning and placed on different replicas

• Automated, server-side denormalization of data
• Native Cassandra read performance
• Write penalty, but acceptable performance

Basic Rules of Data Modeling, Refresher

• Best practice: Denormalization
• Start by understanding the queries you need
• Create a table for each query

Why is Denormalization Hard?

• Implemented by every application
• No guarantees on performance or consistency
• Updates to existing rows require cleanup, read-before-write

Denormalization Example: User Playlists

Queries
• All songs for a given playlist
• Track users who like the same artist
• Find the most recently played song

Denormalization in Practice

CREATE TABLE playlists(user_name text, playlist_name text, song_id text, artist_name text, last_played timestamp)

SELECT song_id FROM playlists
WHERE user_name='carl' AND playlist_name='jams'

SELECT COUNT(song_id) FROM playlists
WHERE artist_name='Weezer'

SELECT last_played, song_id FROM playlists
WHERE user_name='carl' AND playlist_name='jams'
ORDER BY last_played DESC

Denormalization in Practice

CREATE TABLE playlists(user_name text, playlist_name text, song_id text, artist_name text, last_played timestamp, PRIMARY KEY (user_name, playlist_name, song_id))

CREATE TABLE artists_to_playlists(artist_name text, user_name text, playlist_name text, song_id text, PRIMARY KEY (artist_name, user_name, playlist_name, song_id))

Denormalization in Practice

CREATE TABLE last_played(user_name text, playlist_name text, last_played timestamp, song_id text, PRIMARY KEY (user_name, playlist_name, last_played, song_id))
WITH CLUSTERING ORDER BY (last_played DESC)

Denormalization in Practice: Inserts

BEGIN BATCH

INSERT INTO playlists (user_name, playlist_name, song_id, artist_name, last_played) VALUES ('carl', 'jams', 'Undone', 'Weezer', '2015-09-24 09:00');

INSERT INTO artists_to_playlists (artist_name, user_name, playlist_name, song_id) VALUES ('Weezer', 'carl', 'jams', 'Undone');

INSERT INTO last_played (user_name, playlist_name, last_played, song_id) VALUES ('carl', 'jams', '2015-09-24 09:00', 'Undone');

APPLY BATCH;

Denormalization in Concept: Updates

UPDATE playlists SET last_played=toTimestamp(now())
WHERE user_name='carl' AND playlist_name='jams' AND song_id='Undone'

DELETE FROM playlists
WHERE user_name='carl'

Manual Denormalization with updates

[Diagram, built up over several slides. Components: Client, Coordinator, Batchlog, Base Table, View Table. Steps in order: Query Existing Data, Return Existing Data, Write New Values.]

Manual Denormalization Limitations

• Updates need to keep view in sync, including tombstoning previous values

• How to keep the view and base in sync on failure?
• Contentious updates can potentially cause extra values
• Your application doesn't always know whether a write is an update or an insert (i.e. upsert)

Manual Denormalization: Contentious Updates

[Diagram: Client 1 and Client 2 both talk to Cassandra]

Initial state
playlists: ('carl', 'jams', 'Undone', 2015-09-24 9:00)
last_played: ('carl', 'jams', 2015-09-24 9:00, 'Undone')

• Both clients query the existing last_played
• Both clients receive last_played: 2015-09-24 9:00
• One client updates the last played time to 9:01; the other updates it to 9:02

Resulting state
playlists: ('carl', 'jams', 'Undone', 2015-09-24 9:02)
last_played: ('carl', 'jams', 2015-09-24 9:01, 'Undone')
last_played: ('carl', 'jams', 2015-09-24 9:02, 'Undone')
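The interleaving on these slides can be replayed with a small in-memory sketch. This is only an illustration: the dictionaries stand in for the two tables, the helper names are hypothetical, and which client writes 9:01 versus 9:02 is an arbitrary choice. No Cassandra driver is involved.

```python
# Hypothetical in-memory stand-ins for the two tables (not a Cassandra client).
playlists = {("carl", "jams", "Undone"): "2015-09-24 09:00"}
last_played = {("carl", "jams", "2015-09-24 09:00", "Undone")}


def read_last_played(user, playlist, song):
    """Read-before-write: fetch the value the client will later clean up."""
    return playlists[(user, playlist, song)]


def apply_update(user, playlist, song, old_time, new_time):
    """Write the new value and remove the row for the value read earlier."""
    playlists[(user, playlist, song)] = new_time
    last_played.discard((user, playlist, old_time, song))
    last_played.add((user, playlist, new_time, song))


# Both clients read before either of them writes.
seen_by_client_1 = read_last_played("carl", "jams", "Undone")
seen_by_client_2 = read_last_played("carl", "jams", "Undone")

# Each client cleans up only the value it saw (09:00), so the row written
# by client 1 is never removed by client 2.
apply_update("carl", "jams", "Undone", seen_by_client_1, "2015-09-24 09:01")
apply_update("carl", "jams", "Undone", seen_by_client_2, "2015-09-24 09:02")

leftover_times = sorted(row[2] for row in last_played)
```

After the run, the base table holds a single value while the manually maintained table holds two rows, which is the "extra values" problem listed under the limitations.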


Materialized Views

• Provide automated server-side denormalization
• No need for read-before-write on the application side
• Simplify application code
• Provide safer guarantees

Materialized Views: Guarantees

• If a write is acknowledged, at least CL number of base and view replicas will receive the write

• If a write is actually an update, the previous value will be cleaned up in the view

• Even with contentious updates, the view is synchronized with the base for each update

• Takes care of deletions properly
• When a base table is repaired, the data will also be inserted into the view
• TTL'd base data will remain TTL'd in the view

Why Not Just Use Secondary Indexes?

• We can get most of the same functionality by using secondary indexes

• Secondary index queries must contact every node; they cannot use the ring to locate the data

• On each node, the lookup is not a single access
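The difference in query routing can be sketched with a toy cluster. Everything here is an assumption made for illustration: four nodes, replication factor 1, a modulo hash in place of a real token ring, and made-up sample rows. It only counts how many nodes each approach has to contact.

```python
import hashlib

NODES = 4  # assumed cluster size; replication is ignored in this sketch


def owner(partition_key):
    """Toy placement function: hash the partition key onto one node."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % NODES


# Sample rows: (user_name, playlist_name, song_id, artist_name)
rows = [
    ("carl", "jams", "Undone", "Weezer"),
    ("ann", "mix", "Buddy Holly", "Weezer"),
    ("bob", "rock", "Creep", "Radiohead"),
]

base = [[] for _ in range(NODES)]  # partitioned by user_name
view = [[] for _ in range(NODES)]  # partitioned by artist_name
for row in rows:
    base[owner(row[0])].append(row)
    view[owner(row[3])].append(row)


def query_by_index(artist):
    """Index-style lookup: the artist is not the partition key, so ask every node."""
    contacted = 0
    found = []
    for node in base:
        contacted += 1
        found.extend(r for r in node if r[3] == artist)
    return sorted(found), contacted


def query_by_view(artist):
    """View-style lookup: the artist is the partition key, so go to its owner."""
    node = view[owner(artist)]
    return sorted(r for r in node if r[3] == artist), 1


index_rows, index_nodes = query_by_index("Weezer")
view_rows, view_nodes = query_by_view("Weezer")
```

Both lookups return the same rows, but the index-style query touches all four nodes while the view-style query touches one.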

Secondary Indexes: Query Pattern

[Diagram: Client query against the ring, shown over three slides]

Materialized Views: Query Pattern

[Diagram: Client query against the ring]

Materialized Views in Practice

CREATE TABLE playlists(user_name text, playlist_name text, song_id text, artist_name text, last_played timestamp, PRIMARY KEY (user_name, playlist_name, song_id))

Materialized Views in Practice

CREATE MATERIALIZED VIEW artist_to_user AS
SELECT song_id, user_name
FROM playlists
WHERE song_id IS NOT NULL
AND playlist_name IS NOT NULL
AND user_name IS NOT NULL
AND artist_name IS NOT NULL
PRIMARY KEY (artist_name, user_name, playlist_name, song_id)

Replica Placement

[Diagram: ring placement of the base partition (user_name: carl) and the view partition (artist_name: Weezer)]

Materialized Views in Practice

• On creation, a new materialized view will be populated with existing base data

• Each node tracks the completion of the build independently

SELECT * FROM system.built_views
WHERE keyspace_name='ks' AND view_name='view'

Materialized Views in Practice

CREATE MATERIALIZED VIEW last_played AS
SELECT last_played, song_id
FROM playlists
WHERE user_name IS NOT NULL
AND playlist_name IS NOT NULL
AND last_played IS NOT NULL
AND song_id IS NOT NULL
PRIMARY KEY (user_name, playlist_name, last_played, song_id)
WITH CLUSTERING ORDER BY (last_played DESC)

Materialized Views: Performance

• https://github.com/tjake/mvbench
• Uses java-driver to simulate MV and manual denormalization

Materialized Views: Performance (ops/s)

[Chart: Manual Denormalization vs. Materialized Views]

Materialized Views: Performance (p95 latency)

[Chart: Manual Denormalization vs. Materialized Views]

Materialized Views: Future Features

• Adding WHERE clause support (#9664)

CREATE MATERIALIZED VIEW carls_last_played AS
SELECT last_played, song_id
FROM plays
WHERE user_name='carl'
PRIMARY KEY (last_played, song_id)

• Knowing when a view is completely finished building without querying each node (#9967)

• Insert-only tables can skip read-before-write and lock acquisition (#9779)

Materialized Views: Under the Hood

[Diagram: the Client sends write<p1, p2, c1, c2, v1> to the Coordinator. Nodes A, B, C and D hold the Base and View replicas, with a batchlog (BL). Labels on the view path: del<v0, p1, p2, c1, c2> and write<v1, p1, p2, c1, c2>.]

• If update is partial, we will reinsert data from the read when generating a new row

• If no tombstone generated, only new columns are written to view
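The two rules above can be sketched as a small function. This is a simplified illustration, not Cassandra's implementation: the function name, the tuple format of the mutations, and the shortened timestamps are all hypothetical, and the sketch assumes the existing row plus the update together supply every view key column.

```python
def view_mutations(existing, update, view_key):
    """Return the view mutations for one base-table update.

    existing: the base row read locally before the write, or None
    update:   the columns carried by the incoming write
    view_key: column names forming the view's primary key
    """
    merged = dict(existing or {})
    merged.update(update)
    new_key = tuple(merged[c] for c in view_key)

    if existing is not None:
        old_key = tuple(existing[c] for c in view_key)
        if old_key != new_key:
            # The view key changed: tombstone the old view row and
            # reinsert the full row, filling in columns from the read.
            return [("del", old_key, None), ("write", new_key, merged)]

    # No tombstone needed: only the columns in the update are written.
    return [("write", new_key, dict(update))]


VIEW_KEY = ("user_name", "playlist_name", "last_played", "song_id")
existing_row = {
    "user_name": "carl", "playlist_name": "jams", "song_id": "Undone",
    "artist_name": "Weezer", "last_played": "09:00",
}

# Partial update that changes a view key column (last_played).
key_change = view_mutations(
    existing_row,
    {"user_name": "carl", "playlist_name": "jams",
     "song_id": "Undone", "last_played": "09:01"},
    VIEW_KEY,
)

# Partial update that leaves the view key untouched.
same_key = view_mutations(
    existing_row,
    {"user_name": "carl", "playlist_name": "jams",
     "song_id": "Undone", "artist_name": "Weezer (band)"},
    VIEW_KEY,
)
```

In the first case the new view row carries artist_name even though the update did not include it, because the value comes from the local read. In the second case a single write with only the updated columns is produced.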

Materialized Views: The Edge

• Materialized Views have different failure properties than the rest of the system

• Data from a single base replica can be on many view replicas

Materialized Views: The Edge

• When data is lost on all replicas of the base table, it cannot be cleaned up in the view (#10346)

• No read repair between the base and the view table
• Repair of the base table will clean up the view

• Requires local read-before-write
• If you will never update or delete, use manually maintained views (manual denormalization)

Materialized Views: The Edge

• An update from a base table is asynchronously applied to the view, so it is possible there will be a delay

• An MV whose partition key is a low-cardinality column can cause hotspots in the ring, overloading some nodes

Thanks
