C* Summit EU 2013: The State of CQL

The State of CQL Sylvain Lebresne (DataStax)

Upload: planet-cassandra

Post on 20-Jun-2015



DESCRIPTION

Speaker: Sylvain Lebresne, Software Engineer at DataStax

Video: http://www.youtube.com/watch?v=4GSfAS4nFAs&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=18

Since its inception, the Cassandra Query Language (CQL) has grown and matured, resulting in the 3rd version of the language (CQL3) being finalized in Cassandra 1.2 and further improved in Cassandra 2.0. Compared to the legacy Thrift API, CQL3 aims to provide an API that is higher level and more user friendly, while still fully embracing the distributed nature of Cassandra and its storage engine. This talk will present CQL3, describing the reasoning and goals behind the language as well as the language itself. We will also touch on CQL's relationship with Thrift and will present the CQL binary protocol that was introduced in Cassandra 1.2. We will wrap up by discussing the future of CQL.

TRANSCRIPT

Page 1: C* Summit EU 2013: The State of CQL

The State of CQL

Sylvain Lebresne (DataStax)

Page 2: C* Summit EU 2013: The State of CQL

A short CQL primer

New in Cassandra 2.0

Native protocol

What's next?


Page 3: C* Summit EU 2013: The State of CQL

A better API for Cassandra

Thrift is not satisfactory:

· Not user friendly, hard to use.
· Low level, very little abstraction.
· Hard to evolve (in a backward compatible way).
· Unreadable without driver abstraction.

Cassandra has often been regarded as hard to develop against.

It doesn't have to be that way!

Page 4: C* Summit EU 2013: The State of CQL

Quick historical notes

· CQL1 was first introduced in Cassandra 0.8 and became CQL2 in Cassandra 1.0.
· "These aren't the CQL you are looking for"
· CQL3 (CQL for short hereafter) was introduced in Cassandra 1.2.
· Semantically, CQL1/CQL2 are closer to the Thrift API than to CQL3.
· CQL3 is the version that's here to stay: no plan for a CQL4 any time soon.

Page 5: C* Summit EU 2013: The State of CQL

A short CQL primer

Page 6: C* Summit EU 2013: The State of CQL

The Cassandra Query Language

· Syntactically, a subset of SQL (with a few extensions)
· INSERT and UPDATE are both upserts
· No joins, no sub-queries, no aggregation, ...
· Denormalization is the norm: do the work at write time, not read time

CREATE TABLE users (
    user_id uuid,
    name text,
    password text,
    email text,
    picture_profile blob,
    PRIMARY KEY (user_id)
)

CQL
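The statement that INSERT and UPDATE are both upserts can be illustrated without a cluster. What follows is a minimal sketch in plain Java, not Cassandra code: the class `UpsertTable` and its methods are hypothetical names, and a table is modelled as a map from primary key to column values. It only shows the observable semantics, namely that a write creates the row if it is missing and overwrites the supplied column otherwise, with no read beforehand.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a table; illustrates upsert semantics only.
class UpsertTable {
    // primary key -> (column name -> value)
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    // Models both INSERT and UPDATE: the row is created if absent,
    // and only the supplied column is overwritten. No prior read is needed.
    void write(String key, String column, Object value) {
        rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
    }

    // Returns the columns stored for the key, or an empty map if none exist.
    Map<String, Object> read(String key) {
        return rows.getOrDefault(key, Collections.emptyMap());
    }
}
```

In this model, writing `name` twice for the same key leaves a single row holding the latest value, which is the behaviour the slide attributes to both statements.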

Page 7: C* Summit EU 2013: The State of CQL

Denormalization: Cassandra modeling 101

Efficient queries in Cassandra are based on 2 principles:

· the data queried is collocated on one replica set
· the data queried is collocated on disk on those replicas

Denormalization is the technique that makes this achievable in practice.

But this means CQL exposes:

· how to collocate data on the same replica set
· how to collocate data on disk (for a given replica)

Page 8: C* Summit EU 2013: The State of CQL

This is done in CQL through the primary key

CREATE TABLE inboxes (
    user_id uuid,
    email_id timeuuid,
    sender text,
    recipients set<text>,
    subject text,
    is_read boolean,
    PRIMARY KEY (user_id, email_id)
)

CQL

CQL distinguishes 2 sub-parts in the PRIMARY KEY:

· partition key: decides the node on which the data is stored
· clustering columns: within the same partition key, (CQL3) rows are physically ordered following the clustering columns

This is important, because CQL only allows queries for which an explicit index exists:

-- Get last 50 emails in user 51b-23-ab8 inbox
SELECT * FROM inboxes WHERE user_id=51b-23-ab8 ORDER BY email_id DESC LIMIT 50;

CQL
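The two roles of the primary key can be sketched with ordinary Java collections. This is an illustrative model under stated assumptions, not how Cassandra stores data: the class `PartitionedTable` is hypothetical, the partition key is a plain string, and a `long` stands in for the timeuuid clustering column. Each partition is a sorted map, so a "latest N" query is a bounded walk over one partition in reverse clustering order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model: one sorted map per partition.
class PartitionedTable {
    // partition key -> (clustering value -> row payload), sorted by clustering value
    private final Map<String, TreeMap<Long, String>> partitions = new HashMap<>();

    // Rows with the same partition key land in the same sorted map.
    void insert(String partitionKey, long clustering, String row) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>()).put(clustering, row);
    }

    // Models: SELECT ... WHERE partition_key = ? ORDER BY clustering DESC LIMIT n
    List<String> latest(String partitionKey, int limit) {
        List<String> out = new ArrayList<>();
        TreeMap<Long, String> partition = partitions.get(partitionKey);
        if (partition == null) {
            return out;
        }
        for (String row : partition.descendingMap().values()) {
            if (out.size() == limit) {
                break;
            }
            out.add(row);
        }
        return out;
    }
}
```

The query touches a single partition and reads rows in their stored order, which mirrors why the inbox query above is efficient: the partition key locates the data and the clustering column already provides the ordering.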

Page 9: C* Summit EU 2013: The State of CQL

CQL main features

· Collections (set, map and list)
· Secondary indexes
· Convenience functions (timeuuid, type conversions, ...)
· ...

For more details:

· http://cassandra.apache.org/doc/cql3/CQL.html
· http://www.datastax.com/documentation/cql/3.1/webhelp/index.html

Page 10: C* Summit EU 2013: The State of CQL

New in Cassandra 2.0

Page 11: C* Summit EU 2013: The State of CQL

New in Cassandra 2.0

Lightweight transactions:

INSERT INTO test (id, name) VALUES (42, 'Tom') IF NOT EXISTS;
UPDATE test SET password='newpass' WHERE id=42 IF password='oldpass';

CQL

Triggers:

CREATE TRIGGER myTrigger ON test USING 'my.trigger.Class';

CQL

ALTER DROP:

CREATE TABLE test (k int PRIMARY KEY, prop1 int, prop2 text, prop3 float);
ALTER TABLE test DROP prop3;

CQL

Preparing TIMESTAMP, TTL and LIMIT:

SELECT * FROM myTable LIMIT ?;
UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';

CQL
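The lightweight transaction statements above are conditional writes: each one either applies or does not, depending on the current state of the row. A single-process analogue can be sketched with `ConcurrentHashMap`. This is only a model of the applied / not applied outcome. The class `ConditionalStore` is a hypothetical name, and the cluster-wide agreement that Cassandra needs to evaluate the condition across replicas is not represented here at all.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical single-process analogue of conditional writes.
class ConditionalStore {
    private final ConcurrentMap<Integer, String> passwords = new ConcurrentHashMap<>();

    // Analogue of INSERT ... IF NOT EXISTS: true only if the key was absent.
    boolean insertIfNotExists(int id, String password) {
        return passwords.putIfAbsent(id, password) == null;
    }

    // Analogue of UPDATE ... IF password = expected: an atomic compare-and-set.
    boolean updateIf(int id, String expected, String replacement) {
        return passwords.replace(id, expected, replacement);
    }

    String get(int id) {
        return passwords.get(id);
    }
}
```

The boolean result plays the role of the "applied" flag a client would inspect: a second `insertIfNotExists` for the same id returns false and leaves the stored value untouched.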

Page 12: C* Summit EU 2013: The State of CQL

New in Cassandra 2.0

Conditional DDL:

CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);
DROP KEYSPACE IF EXISTS ks;

CQL

Secondary indexes everywhere (almost):

CREATE TABLE timeline (
    event_id uuid,
    created_at timeuuid,
    content blob,
    PRIMARY KEY (event_id, created_at)
);
CREATE INDEX ON timeline (created_at);

CQL

SELECT aliases:

SELECT event_id, dateOf(created_at) AS creation_date FROM timeline;

CQL

Page 13: C* Summit EU 2013: The State of CQL

Coming in Cassandra 2.0.2

Named bind variables:

SELECT * FROM timeline WHERE created_at > :tlow AND created_at <= :thigh AND key = :k;

CQL

Prepared IN:

SELECT * FROM users WHERE user_id IN ?;

CQL

Limited SELECT DISTINCT:

CREATE TABLE test (
    event_id int,
    created_at timestamp,
    content blob,
    PRIMARY KEY (event_id, created_at)
);
SELECT DISTINCT event_id FROM test;

CQL

Page 14: C* Summit EU 2013: The State of CQL

The native protocol

A binary transport protocol for CQL

Page 15: C* Summit EU 2013: The State of CQL

Native protocol

· Binary transport protocol for CQL
· Query execution, prepared statements, authentication, compression, ...
· Asynchronous (allows multiple concurrent queries per connection)
· Server notifications (only generic cluster events currently)
· Existing drivers for Java, C#, Python, C++, Golang, ...

Example usage of the Java driver (https://github.com/datastax/java-driver):

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect("myKeyspace");

for (Row row : session.execute("SELECT * FROM myTable"))
    // Do something ...

JAVA
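The "multiple concurrent queries per connection" point relies on each request carrying an identifier that its response echoes back, so responses can be matched to requests even when they arrive out of order. The following is a minimal sketch of that bookkeeping in plain Java. It does not use the driver or touch a socket, and `StreamMultiplexer`, `Pending`, `send` and `onResponse` are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of request/response matching on one connection.
class StreamMultiplexer {
    // A request in flight: its id and the future that will hold the response.
    static final class Pending {
        final int streamId;
        final CompletableFuture<String> result;

        Pending(int streamId, CompletableFuture<String> result) {
            this.streamId = streamId;
            this.result = result;
        }
    }

    private final Map<Integer, CompletableFuture<String>> inFlight = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Allocates an id and registers the request. A real client would also
    // write the query to the socket tagged with this id; that part is omitted.
    Pending send(String query) {
        int id = nextId.getAndIncrement();
        CompletableFuture<String> future = new CompletableFuture<>();
        inFlight.put(id, future);
        return new Pending(id, future);
    }

    // Called when a response arrives; completes whichever request it belongs to.
    void onResponse(int streamId, String body) {
        CompletableFuture<String> future = inFlight.remove(streamId);
        if (future != null) {
            future.complete(body);
        }
    }
}
```

Because completion is keyed by id rather than by arrival order, a slow query does not block the responses of faster queries sent after it on the same connection.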

Page 16: C* Summit EU 2013: The State of CQL

New in Cassandra 2.0: native protocol 2

Cursors:

for (Row row : session.execute("SELECT * FROM myTable"))
    // Do something ...

JAVA

Batching prepared statements:

PreparedStatement ps = session.prepare("INSERT INTO myTable (p1, p2) VALUES (?, ?)");

BatchStatement bs = new BatchStatement();
bs.add(ps.bind(0, "v1"));
bs.add(ps.bind(1, "v2"));
bs.add(ps.bind(2, "v3"));
session.execute(bs);

JAVA

One-shot prepare and execute:

session.execute("INSERT INTO users (id, photo) VALUES (?, ?)", someId, photoBytes);

JAVA

SASL for authentication
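The cursor example looks identical to the plain query loop because paging is meant to be hidden behind the iterator: rows are fetched a page at a time as the loop consumes them. A sketch of that idea in plain Java follows. It is an assumption-laden model, not driver code: `PagedRows` is a hypothetical class, an in-memory list stands in for the server-side result, and an integer offset plays the role of whatever paging state the server would return.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of cursor-style paging over a large result.
class PagedRows implements Iterator<String> {
    private final List<String> source;  // stands in for the server-side result
    private final int pageSize;
    private int nextOffset = 0;         // plays the role of the paging state
    private Iterator<String> page = Collections.emptyIterator();
    int pagesFetched = 0;               // counts simulated round trips

    PagedRows(List<String> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    // Stands in for one round trip returning at most pageSize rows.
    private void fetchNextPage() {
        int end = Math.min(nextOffset + pageSize, source.size());
        page = new ArrayList<>(source.subList(nextOffset, end)).iterator();
        nextOffset = end;
        pagesFetched++;
    }

    @Override
    public boolean hasNext() {
        // Fetch lazily: only when the current page is exhausted and more rows remain.
        if (!page.hasNext() && nextOffset < source.size()) {
            fetchNextPage();
        }
        return page.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.next();
    }
}
```

With five rows and a page size of two, iterating to the end triggers three simulated fetches, and none happens before the first row is requested.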

Page 17: C* Summit EU 2013: The State of CQL

What's next?

Cassandra 2.1 and beyond

Page 18: C* Summit EU 2013: The State of CQL

CQL: some ideas

· Storage engine optimizations for CQL
· Secondary index for collections
· Server side functions
· User defined types
· ...

Page 19: C* Summit EU 2013: The State of CQL

User defined types

CREATE TYPE address (
    street text,
    zip_code int,
    state text,
    phones set<text>
);

CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    addresses map<text, address>
);

INSERT INTO users (id, name) VALUES (234-4a-761, 'Sylvain Lebresne');
UPDATE users SET addresses['work'] = {
    street: '777 Mariners Island Blvd #510',
    zip_code: 94404,
    state: 'CA',
    phones: { '650-389-6000' }
} WHERE id = 234-4a-761;

CQL

Page 20: C* Summit EU 2013: The State of CQL

Thank You! (Questions?)