cassandra prophecy

Post on 26-Jan-2015

DESCRIPTION

Introduction to the Apache Cassandra distributed database

TRANSCRIPT

Cassandra Prophecy

Igor Khotin
E-mail: khotin@gmx.com

Background

● 11+ years in the IT industry
● 6+ years with Java
● Flexible design promoter
● Agile-junkie

highly scalable, eventually consistent, distributed, structured key-value store

Decentralized

● P2P
● No SPOF
● No network bottlenecks

Fault Tolerant

● High Availability
● Replication and redundancy
● Node replacement & no downtime
● Multiple racks & datacenters

Elastic Scalability

● Scales up and down
● Just add or remove nodes
● Linear scalability
● Low maintenance cost

Tunable consistency

● Different consistency levels
● Consistency vs. latency
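
The consistency vs. latency trade-off is often summarised with a rule of thumb: with N replicas, a read that contacts R replicas must overlap a write acknowledged by W replicas whenever R + W > N. The sketch below only illustrates that arithmetic; the function names are hypothetical and are not part of any Cassandra API.

```python
def quorum(replication_factor):
    """Smallest majority of replicas for the given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n, w, r):
    """True when every read set of size r must overlap every write set of size w."""
    return r + w > n


# With 3 replicas, QUORUM writes plus QUORUM reads overlap,
# while writing to ONE and reading from ONE trades that guarantee for latency.
n = 3
strong = reads_see_latest_write(n, quorum(n), quorum(n))
weak = reads_see_latest_write(n, 1, 1)
```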

Rich Data Model

● Goes beyond simple key-value
● Values could be indexed
● Flexible schema

Scale up problem

Sharding doesn't solve it

Google File System & Google BigTable

Amazon Dynamo

Cassandra, by Avinash Lakshman and Prashant Malik

Cassandra, used in Inbox Search

Open sourced in July 2008

March 2009: Accepted to Apache Incubator

February 2010: Top-Level Apache Project

Late 2010... Cassandra abandoned

Messaging moved to HBase

October 2011: Release 1.0

November 30, 2011: Release 1.0.5 (current stable)

Moving forward fast...

Brewer's CAP Theorem

Data Model

Column Family

Column sorting

● ASCII
● UTF8
● Bytes
● Long
● LexicalUUID
● TimeUUID
● Custom
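
Columns inside a row are kept sorted by name, and the chosen comparator decides what "sorted" means. As a rough illustration in plain Python (the helper name and the 8-byte big-endian layout are assumptions made for this sketch), the same column names come out in a different order under a Bytes-style comparison than under a Long-style one:

```python
import struct


def as_long_name(n):
    """Encode an integer as an 8-byte big-endian signed value (assumed layout)."""
    return struct.pack(">q", n)


names = [as_long_name(n) for n in (2, -1, 10)]

# Bytes-style ordering: compare the raw bytes, so -1 (all 0xFF bytes) sorts last.
bytes_order = [struct.unpack(">q", b)[0] for b in sorted(names)]

# Long-style ordering: decode first, then compare numerically.
long_order = sorted(struct.unpack(">q", b)[0] for b in names)
```

Because slices are read in comparator order, the comparator is effectively part of the schema design.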

Design decision

Denormalization

Design for queries
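
One way to read "design for queries": instead of normalizing and joining at read time, the same data is written once per query it has to serve. The sketch below uses plain dictionaries as stand-ins for two column families; all names are hypothetical.

```python
from collections import defaultdict

# Stand-in for a column family keyed by user id (answers "get user by id").
users_by_id = {}

# Stand-in for a column family keyed by city (answers "list users in a city").
users_by_city = defaultdict(dict)


def save_user(user_id, name, city):
    """Write the same fact twice, once per query it must answer."""
    users_by_id[user_id] = {"name": name, "city": city}
    users_by_city[city][user_id] = name


save_user("u1", "Alice", "Paris")
save_user("u2", "Bob", "Paris")
```

The price is duplicated data and extra writes, which fits a store that is optimized for writes.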

Keyring
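
A minimal sketch of the ring idea, under simplifying assumptions: each node owns one token, a key is hashed to a token with MD5, the first node at or after that token (wrapping around) is the primary, and the following nodes on the ring hold the other replicas. The class and method names are hypothetical, and rack or datacenter awareness is ignored.

```python
import bisect
import hashlib


class Ring:
    def __init__(self, node_tokens):
        # node_tokens maps a node name to the token it owns; sort by token.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(key, space=2**127):
        """Hash a row key onto the token space (MD5-based, as an assumption)."""
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % space

    def replicas_for_token(self, token, replication_factor=3):
        """Walk the ring clockwise from the first node whose token >= token."""
        start = bisect.bisect_left(self._tokens, token)
        count = min(replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]

    def replicas(self, key, replication_factor=3):
        return self.replicas_for_token(self.token_for(key), replication_factor)


ring = Ring({"a": 0, "b": 100, "c": 200})
```

Adding or removing a node only changes ownership of the token range next to it, which is what makes "just add or remove nodes" workable.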

Gossip
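
The general shape of a gossip exchange can be sketched as follows: every node keeps a versioned view of the cluster, periodically picks a random peer, and both sides keep the newer entry for each node. This is a heavily simplified illustration with hypothetical names; it leaves out generations, seeds, and failure detection.

```python
import random


def merge(local, remote):
    """Combine two views, keeping the higher heartbeat version per node."""
    merged = dict(local)
    for node, version in remote.items():
        if version > merged.get(node, -1):
            merged[node] = version
    return merged


def gossip_round(views, rng):
    """Each node bumps its own heartbeat and exchanges state with one random peer."""
    for node in list(views):
        views[node][node] += 1
        peer = rng.choice([n for n in views if n != node])
        combined = merge(views[node], views[peer])
        views[node] = dict(combined)
        views[peer] = dict(combined)


# Every node starts out knowing only about itself.
views = {n: {n: 0} for n in ("a", "b", "c")}
rng = random.Random(0)
for _ in range(5):
    gossip_round(views, rng)
```

After a few rounds every node has heard about every other node without any central coordinator.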

Optimized for writes

● No reads on the write path
● No disk seeks (sequential appends)
● No B-trees
● Fast
● Atomic per row
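
A toy model of a log-structured write path shows why writes avoid reads and seeks: a write is appended to a commit log and applied to an in-memory table, and the in-memory table is later flushed as one sorted, immutable batch. The class, its fields, and the flush threshold are hypothetical and far simpler than a real storage engine.

```python
class WritePath:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory rows: row key -> {column: value}
        self.sstables = []     # flushed tables, each a list sorted by row key
        self.memtable_limit = memtable_limit

    def write(self, row_key, column, value):
        # Append first for durability, then update memory; nothing is read back.
        self.commit_log.append((row_key, column, value))
        self.memtable.setdefault(row_key, {})[column] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the whole memtable out in key order as one immutable table.
        self.sstables.append(sorted(self.memtable.items(), key=lambda item: item[0]))
        self.memtable = {}


store = WritePath(memtable_limit=2)
store.write("paris", "population", "2.2M")
store.write("berlin", "population", "3.5M")
```

Reads then have to merge the memtable with the flushed tables, which is the cost paid for cheap writes.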

Tunable Consistency

Tombstone
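
A tombstone can be pictured as a timestamped "deleted" marker that is written like any other value, so that replicas holding an older copy do not bring the data back. The sketch below assumes last-write-wins reconciliation by timestamp and lets a tombstone win a tie; the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]  # None marks a tombstone
    timestamp: int


def reconcile(a, b):
    """Keep the newer cell; on equal timestamps prefer the tombstone (assumption)."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value is None else b


def read(replica_cells):
    """Merge the cells returned by several replicas into one visible value."""
    winner = replica_cells[0]
    for cell in replica_cells[1:]:
        winner = reconcile(winner, cell)
    return winner.value  # None means the column is deleted


# One replica still has the old value, another has the later delete.
deleted = read([Cell("2.2M", 1), Cell(None, 2)])
# A write that happens after the delete becomes visible again.
rewritten = read([Cell(None, 2), Cell("2.3M", 3)])
```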

Low Level Clients

● Thrift
  ● IDL and binary communication protocol
  ● Multi-language support
  ● Really sucks
● Avro
  ● Better than Thrift, but sucks anyway

High Level Clients

● Feature-rich
  ● Connection pool
  ● Load-balancing
  ● Fail-over
● Hector, Pelops... (Java)
● Pycassa... (Python)
● Fauna (Ruby)
● ...

CQL

● SQL for NoSQL
● CREATE KEYSPACE, CREATE COLUMNFAMILY, CREATE INDEX
● USE, SELECT, UPDATE, DELETE...

SELECT population FROM city
USING CONSISTENCY QUORUM
WHERE KEY = 'Paris'

Understand your problem

Find appropriate solution

Don't let default solutions be imposed on you

Hard to choose?

Leaders will emerge

Resources

● http://cassandra.apache.org
● Dynamo: Amazon’s Highly Available Key-value Store
● Cassandra - A Decentralized Structured Storage System
● Bigtable: A Distributed Storage System for Structured Data

Contacts

E-mail: khotin@gmx.com
Blog: www.ikhotin.com
Twitter: chaostarter

Questions?
