cassandra 1.1
©2012 DataStax
Apache Cassandra 1.1
Jonathan Ellis / @spyced
New features in 1.1
• CQL3
• Global row + key caches
• Fine-grained data storage control
• Row-level isolation
• Concurrent schema changes
• Off-heap cache works on Windows
• "Write survey mode"
• Hadoop improvements
• Stress tool
Modern Cassandra, briefly
• 0.7
  • CREATE COLUMN FAMILY
  • TTL
  • Secondary (column) indexes
• 0.8
  • Counters
  • Automatic memtable tuning
• 1.0
  • Compression
  • Leveled compaction
Global row + key caches
• cassandra.yaml
  • key_cache_size_in_mb (default 2)
  • row_cache_size_in_mb (default 0)
  • Also save periods
• Per-CF: caching=ALL|KEYS_ONLY*|ROWS_ONLY|NONE
• Old CF-level options are ignored
  • row_cache_size, key_cache_size
  • save periods
Data storage
• Old:
  • /var/lib/cassandra/data/Keyspace1/Standard1-hc-1-Data.db
• New:
  • /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-1-Data.db
• (Includes KS in filename for easier bulk loading)
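The two layouts above differ only in how the path is assembled. A minimal Java sketch that reproduces both example paths follows. The class, method, and parameter names (`version`, `generation`, `component`) are labels chosen for this illustration, not Cassandra APIs.

```java
public class SSTablePathSketch {
    // Pre-1.1 layout: <dataDir>/<keyspace>/<cf>-<version>-<generation>-<component>
    public static String oldLayout(String dataDir, String keyspace, String cf,
                                   String version, int generation, String component) {
        String file = String.join("-", cf, version, Integer.toString(generation), component);
        return String.join("/", dataDir, keyspace, file);
    }

    // 1.1 layout: one directory per column family, and the keyspace name is
    // repeated in the file name:
    // <dataDir>/<keyspace>/<cf>/<keyspace>-<cf>-<version>-<generation>-<component>
    public static String newLayout(String dataDir, String keyspace, String cf,
                                   String version, int generation, String component) {
        String file = String.join("-", keyspace, cf, version,
                                  Integer.toString(generation), component);
        return String.join("/", dataDir, keyspace, cf, file);
    }

    public static void main(String[] args) {
        String dataDir = "/var/lib/cassandra/data";
        System.out.println(oldLayout(dataDir, "Keyspace1", "Standard1", "hc", 1, "Data.db"));
        System.out.println(newLayout(dataDir, "Keyspace1", "Standard1", "hc", 1, "Data.db"));
    }
}
```

Because the new file name carries the keyspace, a file copied out of its directory still identifies where it belongs, which is what helps bulk loading.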
Row-level isolation
• Never see partial updates to a row
• We now have AID from ACID
  • C in ACID != C in CAP
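To illustrate what "never see partial updates" means, here is a minimal in-memory sketch in Java: a writer copies the row, applies every column of the update to the copy, and publishes the copy with a single compare-and-set. This is one common way to get the guarantee; it is an assumption for illustration, not a description of Cassandra's internals, and `IsolatedRowSketch` is a hypothetical name.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

public class IsolatedRowSketch {
    // The current immutable snapshot of the row (column name -> value).
    private final AtomicReference<Map<String, String>> columns =
        new AtomicReference<>(Collections.unmodifiableMap(new TreeMap<String, String>()));

    // Applies all columns of one update as a unit.
    public void update(Map<String, String> changes) {
        while (true) {
            Map<String, String> current = columns.get();
            Map<String, String> copy = new TreeMap<>(current);
            copy.putAll(changes);
            if (columns.compareAndSet(current, Collections.unmodifiableMap(copy))) {
                return;
            }
            // Another writer published first; retry against the newer snapshot.
        }
    }

    // Readers get a whole snapshot: either before or after an update, never in between.
    public Map<String, String> read() {
        return columns.get();
    }

    public static void main(String[] args) {
        IsolatedRowSketch row = new IsolatedRowSketch();
        Map<String, String> before = row.read();

        Map<String, String> changes = new TreeMap<>();
        changes.put("author", "jmadison");
        changes.put("body", "All men ...");
        row.update(changes);

        System.out.println("old snapshot: " + before);
        System.out.println("new snapshot: " + row.read());
    }
}
```

A reader that obtained its snapshot before the update keeps seeing neither column; a reader that asks afterwards sees both.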
Concurrent schema changes
• Fixes http://wiki.apache.org/cassandra/FAQ#schema_disagreement
• Can still have temporary disagreements if you use a new CF before all nodes have it
• Also speeds up adding new nodes
Off-heap cache on Windows
• SerializingCacheProvider (SCP) no longer requires JNA
  • SCP has been the default since 1.0, but before 1.1 it falls back to ConcurrentLinkedHashCacheProvider (CLHCP) if JNA is not present
Write survey mode
• bin/cassandra -Dcassandra.write_survey=true
• Allows experimenting w/ compaction, compression, new versions*
  • isolate node to test reads
Abortable compactions
• nodetool stop <type>
CQL3
• (CQL2 is still default)
• Composite PK support
  • .. slice syntax removed
• ORDER BY syntax conforms to SQL
A simple example
CREATE TABLE tweets (
  tweet_id uuid PRIMARY KEY,
  author varchar,
  body varchar
);
Tweets
tweet_id | author      | body
1790     | gwashington | To be prepared for war is one of the most effectual means of preserving peace
1787     | jmadison    | All men having power ought to be distrusted to a certain degree
1778     | gmason      | Those gentlemen, who will be elected senators, will fix themselves in the federal town, and become citizens of that town more than of your state
With clustering
CREATE TABLE timeline (
  user_id varchar,
  tweet_id uuid,
  author varchar,
  body varchar,
  PRIMARY KEY (user_id, tweet_id)
);
• user_id: partition key
• tweet_id: clustered
Timeline
user_id   | tweet_id | author      | body
jadams    | 1787     | jmadison    | All men ...
jadams    | 1790     | gwashington | To be prepared ...
ahamilton | 1778     | gmason      | Those gentlemen ...
ahamilton | 1790     | gwashington | To be prepared ...
• user_id: not clustered
• tweet_id: clustered (within partition key)
Timeline, physical layout
jadams:
  (1787, author): jmadison
  (1787, body): All men ...
  (1790, author): gwashington
  (1790, body): To be prepared ...
ahamilton:
  (1778, author): gmason
  (1778, body): Those gentlemen ...
  (1790, author): gwashington
  (1790, body): To be prepared ...
• Cell names for non-PK columns contain the string literal of the column name
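The mapping from CQL3 rows to this physical layout can be sketched in a few lines of Java. This is an illustration under stated assumptions: cell names are modelled as plain strings, tweet_id is an int stand-in as on the slides (the schema declares uuid), and `TimelineLayoutSketch` is a hypothetical class, not Cassandra code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class TimelineLayoutSketch {
    // Partition key (user_id) -> cells of that partition.
    // Partitions are unordered; cells inside a partition are sorted by name.
    private final Map<String, Map<String, String>> partitions = new HashMap<>();

    // One CQL3 row becomes one cell per non-PK column,
    // each named (tweet_id, <column name literal>).
    public void insert(String userId, int tweetId, String author, String body) {
        Map<String, String> cells = partitions.computeIfAbsent(userId, k -> new TreeMap<>());
        cells.put(cellName(tweetId, "author"), author);
        cells.put(cellName(tweetId, "body"), body);
    }

    static String cellName(int tweetId, String column) {
        return "(" + tweetId + ", " + column + ")";
    }

    public Map<String, String> partition(String userId) {
        return partitions.get(userId);
    }

    public static void main(String[] args) {
        TimelineLayoutSketch timeline = new TimelineLayoutSketch();
        timeline.insert("jadams", 1790, "gwashington", "To be prepared ...");
        timeline.insert("jadams", 1787, "jmadison", "All men ...");
        timeline.partition("jadams")
                .forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

String ordering stands in for the real composite comparator here; it gives the same order only because the example ids all have four digits.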
WITH COMPACT STORAGE
CREATE TABLE timeline (
  user_id varchar,
  tweet_id uuid,
  author varchar,
  body varchar,
  PRIMARY KEY (user_id, tweet_id, author)
) WITH COMPACT STORAGE;
• For backwards compatibility
• All but one column are part of the PRIMARY KEY
jadams:
  (1787, jmadison): All men ...
  (1790, gwashington): To be prepared ...
ahamilton:
  (1778, gmason): Those gentlemen ...
  (1790, gwashington): To be prepared ...
• no “body” literal
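With compact storage, each CQL3 row collapses to a single cell: the clustered PK columns form the cell name and the one remaining column is the value. A minimal Java sketch under the same assumptions as the slides (int stand-ins for tweet_id, string cell names) follows. `CompactLayoutSketch` is a hypothetical class for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class CompactLayoutSketch {
    // Partition key (user_id) -> cells of that partition, sorted by cell name.
    private final Map<String, Map<String, String>> partitions = new HashMap<>();

    // One CQL3 row becomes exactly one cell: (tweet_id, author) -> body.
    // The column name "body" is never written into the cell name.
    public void insert(String userId, int tweetId, String author, String body) {
        partitions.computeIfAbsent(userId, k -> new TreeMap<>())
                  .put("(" + tweetId + ", " + author + ")", body);
    }

    public Map<String, String> partition(String userId) {
        return partitions.get(userId);
    }

    public static void main(String[] args) {
        CompactLayoutSketch timeline = new CompactLayoutSketch();
        timeline.insert("jadams", 1787, "jmadison", "All men ...");
        timeline.insert("jadams", 1790, "gwashington", "To be prepared ...");
        timeline.partition("jadams")
                .forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Two rows produce two cells here, compared with four in the non-compact layout.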
Earlier changes
• (1.0.6) Allow CF names to be qualified by keyspace for INSERT, ALTER, DELETE, TRUNCATE
  • INSERT INTO ks.cf (...) VALUES (...)
  • (SELECT was done in 1.0.1)
• (1.0.4) ALTER CF attributes
cqlsh
• SOURCE and CAPTURE commands
• (1.0.8) DESCRIBE COLUMNFAMILIES
The future is CQL (based)
• cqlsh
• performance
  • prepared statements
  • netty-based transport (CASSANDRA-2478)
• What does this mean for pycassa, Hector, et al?
Hadoop Integration
• Secondary index (2I) support*
• Wide row support*
• BulkOutputFormat
• (*Covered in updated WordCount)
Secondary Index support
IndexExpression expr = new IndexExpression(
    ByteBufferUtil.bytes("int4"),
    IndexOperator.EQ,
    ByteBufferUtil.bytes(0));
ConfigHelper.setInputRange(
    job.getConfiguration(),
Wide row support
ConfigHelper.setInputColumnFamily(
    job.getConfiguration(),
    KEYSPACE,
    COLUMN_FAMILY,
    true);
• Also: PIG_WIDEROW_INPUT
BulkOutputFormat
job.setOutputFormatClass(BulkOutputFormat.class);
• Compatible w/ CFOF + extra options
  • OUTPUT_LOCATION
  • BUFFER_SIZE_IN_MB
  • STREAM_THROTTLE_MBITS
  • (defaults, respectively: system default, 64, unlimited)
• Limitation: can’t stream to dead nodes (fix in 1.1.1?)
Stress tool
• tools/bin/stress*
• Insert, read, seq scan, indexed scan, multiget, counter add/get
• CQL
Bonus: What’s new in C* 1.1.1
• Incremental repair by token range
• Support for commitlog archiving and PITR
• Identify and blacklist corrupted SSTables from future compactions
• Open 1 sstableScanner per level for leveled compaction
• More CQL3 improvements (e.g. reversed clustering)
• Fix re-creating Keyspaces/ColumnFamilies with the same name as dropped ones
DataStax Community, with OpsCenter