Experience Cassandra Wenjing wu 2011-5-17

Outline

• About Cassandra
• Data Model
• Deployment
• Client Programming
• An example: implementing a namespace
• Stress tests

What is Cassandra(1)

• Decentralized, fault-tolerant, scalable and durable distributed hash storage
• Originally developed by Facebook, now maintained by Apache
• Some big users: Cloudkick, Digg, Facebook, Twitter, Rackspace, Cisco, etc.
• A combination of Bigtable and Dynamo
• Like a big hash table (both 2- and 3-dimensional)

What is Cassandra(2)

• Eventual consistency
• CAP theorem: AP by default, but the trade-off between availability (A) and consistency (C) is configurable
• Easy to deploy
• Rich client APIs for your own application, easy to install and use
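As a rough illustration of the configurable A/C trade-off, the sketch below uses the usual quorum arithmetic: with N replicas, a read that waits for R replicas and a write that waits for W replicas are guaranteed to overlap when R + W > N. The function names and the replication factor of 3 are chosen here for illustration only.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority (the QUORUM level)."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set overlaps every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


N = 3  # replication factor, assumed for this example
q = quorum(N)

# QUORUM reads with QUORUM writes: favors consistency.
print(q, reads_see_latest_write(q, q, N))
# ONE read with ONE write: favors availability and latency.
print(1, reads_see_latest_write(1, 1, N))
```

Lower levels keep the cluster answering while more replicas are down, at the cost of possibly reading stale data until the replicas converge.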

Data model(1)

• NoSQL (non-relational)
• Supports queries on a single index only
– select username from user where city='beijing' (Yes)
– select username from user where city='beijing' and age='28' (No!)
• No joins, no complicated queries
• Useful when the application fits this model

Data model(2)

• Keyspace: one per application, equivalent to a database
• Column: an attribute of the structured data. It has a name, a value and a timestamp, and is equivalent to a column of a table, e.g. (column=username, value=tom, timestamp=1299137043078874)
• Column family: a set of columns like the one above. For example, a column family User:
– (column=username, value=tom, timestamp=1299137043078874)
– (column=email, [email protected], timestamp=1299137043078133)
– (column=city, value=beijing, timestamp=1299137043078141)

Data Model(3)

• A row: identified by a key, it instantiates one or more of the columns in the column family:
– RowKey: userkey1
– (column=username, value=tom, timestamp=1299137043078874)
– (column=email, [email protected], timestamp=1299137043078133)
• The application creates the key for each row (it must be unique; a UUID is usually used to avoid collisions). Each row can have a different number of columns within the column family.
• Analogous to a 2-dimensional hash table:
– User{row_key1}{username}=tom
– User{row_key1}{email}[email protected]
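The 2-dimensional hash table analogy can be sketched with plain Python dicts. This is only a model of the data layout, not Cassandra or pycassa code; the helper names `insert` and `now_micros` and the `userkey_` prefix are made up for the example.

```python
import time
import uuid


def now_micros():
    """Current time in microseconds since the epoch."""
    return int(time.time() * 1000000)


def insert(column_family, row_key, columns):
    """Store each column as name -> (value, timestamp) under row_key."""
    row = column_family.setdefault(row_key, {})
    ts = now_micros()
    for name, value in columns.items():
        row[name] = (value, ts)


# One column family: row_key -> {column_name: (value, timestamp)}
User = {}

# The application generates the row keys; UUIDs avoid collisions.
key1 = "userkey_" + str(uuid.uuid1())
key2 = "userkey_" + str(uuid.uuid1())

insert(User, key1, {"username": "tom", "city": "beijing"})
insert(User, key2, {"username": "alice"})  # rows may hold different columns

value, timestamp = User[key1]["username"]
print(value, timestamp)
```

The lookup `User[key1]["username"]` mirrors the `User{row_key1}{username}` notation: the first index selects the row, the second the column.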

Data Model(4)

• Super column family
– Each column of a super column family is itself a set of columns (a super column)
• 3-dimensional hash table
– Person{row_key1}{user}{user_name}=tom
– Person{row_key1}{user}{email}[email protected]
– Person{row_key2}{manager}{user_name}=Alice
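The extra level added by a super column family can be modeled the same way with nested dicts. This is a data-layout sketch only: timestamps are left out for brevity, and the helper `super_column_family` is a name invented for the example.

```python
from collections import defaultdict


def super_column_family():
    """row_key -> super_column -> column -> value, created on demand."""
    return defaultdict(lambda: defaultdict(dict))


Person = super_column_family()
Person["row_key1"]["user"]["user_name"] = "tom"
Person["row_key2"]["manager"]["user_name"] = "Alice"

# Three indexes are needed to reach a value: row, super column, column.
print(Person["row_key1"]["user"]["user_name"])
print(sorted(Person["row_key2"].keys()))
```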

Deployment(1)

• Pretty easy!
– wget http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.7.5/apache-cassandra-0.7.5-bin.tar.gz
– tar zxvf apache-cassandra-0.7.5-bin.tar.gz
– cd apache-cassandra-0.7.5
– sudo mkdir -p /var/log/cassandra
– sudo chown -R `whoami` /var/log/cassandra
– sudo mkdir -p /var/lib/cassandra
– sudo chown -R `whoami` /var/lib/cassandra

Deployment(2)

• Start the service
– bin/cassandra -f
• Try to connect with the client:
– bin/cassandra-cli --host localhost --port 9160
• Getting started:
– create keyspace Keyspace1;
– use Keyspace1;
– create column family Users with comparator=UTF8Type and default_validation_class=UTF8Type;
– set Users[jsmith][first] = 'John';
– set Users[jsmith][last] = 'Smith';
• What do you see?
– [default@KS1] get Users[jsmith];
– => (column=last, value=Smith, timestamp=1287604215498000)
– => (column=first, value=John, timestamp=1287604214111000)

Run over a cluster

• Configuration file: conf/cassandra.yaml
– listen_address: fst01.ihep.ac.cn (for gossip)
– rpc_address: fst01.ihep.ac.cn (for clients)
– seeds:
    - fst02.ihep.ac.cn
    - fst03.ihep.ac.cn
    - fst04.ihep.ac.cn
• Test the cluster
– bin/nodetool --host fst01.ihep.ac.cn ring

Client Programming

• Rich client options (C/Java/PHP/Perl/Python)
• Python client driver: pycassa
• Easy to install
– Install with easy_install
  • Have easy_install installed
  • $ easy_install pycassa
– Manual install
  • $ easy_install thrift05
  • $ git clone git://github.com/pycassa/pycassa.git
  • $ cd pycassa/
  • $ sudo python setup.py install

API examples

• >>> import pycassa
• >>> pool = pycassa.connect('Keyspace1', ['localhost:9160'])
• >>> col_fam = pycassa.ColumnFamily(pool, 'User')
• >>> col_fam.insert('user_key1', {'username': 'tom'})
• >>> col_fam.get('user_key1')
• >>> col_fam.remove('user_key1')

An example: implement a namespace

• Use pycassa to implement a namespace.
• Similar to the ext3 file system, which uses inodes to represent metadata
• Two column families (Directory, FFile) describe the metadata
• CF Directory columns include:
– Metadata: create/modify/access time, owner, group
– Contents of the directory: subdirectory names, file names

Directory(1)

Row key: dir_key1
– Owner: filestore
– Group: filestore
– testdir1: dir_keyxxxxx1
– testdir2: dir_keyxxxxx2
– testfile1: file_keyyyyyy1

Directory(2)

• A row:
– RowKey: dirkey_372c5d87-4567-11e0-bc71-001a64631cb0
– => (column=dir3, value=3e180f00-459b-11e0-8846-001a64631cb0, timestamp=1299159388519845)
– => (column=f2, value=c69f2ac2-45a6-11e0-9c79-001a64631cb0, timestamp=1299329058698329)
– => (column=f3, value=ddd77c2e-45a5-11e0-934f-001a64631cb0, timestamp=1299328989534849)
– => (column=group, value=root, timestamp=1299137043078874)
– => (column=owner, value=root, timestamp=1299137043078874)
– => (column=p3, value=edf0ed73-45a6-11e0-bf90-001a64631cb0, timestamp=1299164408007020)

FFile(1)

• CF FFile is used to store the metadata and contents of a specific file

• FFile columns include:
– Metadata: create/modify/access time, owner, group, size, checksum
– Contents of the file

FFile(2)

Row key: file_key_yyyy1
– owner: filestore
– group: filestore
– size: 1023
– content: Bla bla….

FFile(3)

• A row
– RowKey: filekey_edf0ed73-45a6-11e0-bf90-001a64631cb0
– => (column=content, value=
    127.0.0.1 localhost.localdomain localhost
    202.122.33.12 lcg002.ihep.ac.cn lcg002
    192.168.56.11 lwn011.ihep.ac.cn lwn011
    ...., timestamp=1299164408007882)
– => (column=group, value=root, timestamp=1299164408007882)
– => (column=owner, value=root, timestamp=1299164408007882)
– => (column=size, value=11281, timestamp=1299164408007882)
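A minimal sketch of how one FFile row could be assembled before it is written. The slides do not say which checksum algorithm is used or how the size is encoded, so MD5 and a text-encoded byte count are assumptions here, and `make_file_row` is a hypothetical helper, not part of pycassa.

```python
import hashlib
import uuid


def make_file_row(content, owner, group):
    """Build a (row_key, columns) pair for one file in the FFile column family."""
    row_key = "filekey_" + str(uuid.uuid1())
    data = content.encode("utf-8")
    columns = {
        "owner": owner,
        "group": group,
        "size": str(len(data)),                     # size in bytes, stored as text
        "checksum": hashlib.md5(data).hexdigest(),  # MD5 is an assumption
        "content": content,
    }
    return row_key, columns


row_key, columns = make_file_row("127.0.0.1 localhost\n", "root", "root")
print(row_key)
print(sorted(columns))
```

With pycassa, a pair like this could then be passed to the `insert` call of a `ColumnFamily` object opened on FFile.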

Namespace operations

• fs_ls (list a dir/file)
• fs_mkdir (make a dir)
• fs_rename (rename a file/dir)
• fs_mv (move a file/dir to another file/dir)
• fs_rm (remove a file/dir)
• fs_cpw (write a file to the storage)
• fs_cpr (read a file from the storage)

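How does it work

Looking up /testdir1/testfile11 follows the keys from row to row:

Row key: dir_key1 (the root directory)
– owner: filestore
– group: filestore
– testdir1: dir_keyxxx1
– testdir2: dir_keyxxx2

Row key: dir_keyxxx1 (testdir1)
– owner: filestore
– testdir12: dir_keyxxx4
– testfile11: file_keyyyy1
– testfile12: file_keyyyy2

Row key: file_keyyyy1 (testfile11)
– owner: filestore
– group: filestore
– size: 1023
– content: This is a test file….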
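The lookup of /testdir1/testfile11 can be sketched with the three rows modeled as plain dicts. In a real deployment each `rows[key]` access would be a read of one row through the client; the names `resolve`, `list_dir`, `ROOT_KEY` and `METADATA` are invented for this example.

```python
ROOT_KEY = "dir_key1"          # key of "/" in this sketch
METADATA = {"owner", "group"}  # directory columns that are not entries

# The rows from the diagram, modeled as plain dicts.
rows = {
    "dir_key1": {"owner": "filestore", "group": "filestore",
                 "testdir1": "dir_keyxxx1", "testdir2": "dir_keyxxx2"},
    "dir_keyxxx1": {"owner": "filestore", "testdir12": "dir_keyxxx4",
                    "testfile11": "file_keyyyy1", "testfile12": "file_keyyyy2"},
    "file_keyyyy1": {"owner": "filestore", "group": "filestore",
                     "size": "1023", "content": "This is a test file...."},
}


def resolve(path):
    """Walk from the root row, doing one column lookup per path component."""
    key = ROOT_KEY
    for name in [part for part in path.split("/") if part]:
        row = rows[key]
        if name in METADATA or name not in row:
            raise KeyError("no such entry: " + name)
        key = row[name]
    return key


def list_dir(path):
    """Return the entry names stored in a directory row."""
    row = rows[resolve(path)]
    return sorted(name for name in row if name not in METADATA)


file_key = resolve("/testdir1/testfile11")
print(file_key, rows[file_key]["size"])
print(list_dir("/testdir1"))
```

One consequence of keeping metadata and entries in the same row is that an entry cannot be named `owner` or `group` in this sketch; a name prefix for entries would be one way around that.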

How to implement?

• mkdir: fs_mkdir /testdir1/testdir2/testdir3 (/testdir1/testdir2 already exists)
– 1. Generate a key for this entry: new_key=dirkey_`uuid`
– 2. Walk from the root directory (/, whose key is dirkey_1) to get the key of the parent directory (testdir2); assume that key is dirkey_XXX
– 3. Insert a column into the parent directory entry (testdir2, with key dirkey_XXX). The column name is the name of the directory being inserted (testdir3), and its value is new_key
– 4. Create a new entry for the new directory, with all the metadata columns (owner, group)
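The four steps can be sketched against an in-memory stand-in for the Directory column family. Only the root key dirkey_1 comes from the slides; the other keys, the default owner and group, and the duplicate-name check are assumptions made for the example.

```python
import uuid

ROOT_KEY = "dirkey_1"  # key of "/" as given in step 2

# Existing rows for "/", /testdir1 and /testdir1/testdir2.
# Keys other than the root are made up for this sketch.
directory = {
    "dirkey_1": {"owner": "root", "group": "root", "testdir1": "dirkey_a"},
    "dirkey_a": {"owner": "root", "group": "root", "testdir2": "dirkey_b"},
    "dirkey_b": {"owner": "root", "group": "root"},
}


def fs_mkdir(path, owner="root", group="root"):
    """Create the last component of path; its parent must already exist."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("cannot create the root directory")
    # Step 1: generate a key for the new entry.
    new_key = "dirkey_" + str(uuid.uuid1())
    # Step 2: walk from the root to find the parent's key.
    parent_key = ROOT_KEY
    for name in parts[:-1]:
        parent_key = directory[parent_key][name]
    if parts[-1] in directory[parent_key]:
        raise ValueError("entry already exists: " + parts[-1])
    # Step 3: add a column to the parent row: name -> new_key.
    directory[parent_key][parts[-1]] = new_key
    # Step 4: create the row for the new directory with its metadata.
    directory[new_key] = {"owner": owner, "group": group}
    return new_key


new_key = fs_mkdir("/testdir1/testdir2/testdir3")
print(directory["dirkey_b"]["testdir3"] == new_key)
```

Against a real cluster, steps 3 and 4 would be two separate writes to different rows, so a failure between them could leave a directory entry pointing at a row that does not exist yet.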

Stress test

• Testbed: a small cluster
– 4-node cluster
– Replication factor: 3
– One client
• Test methodology:
– Operation sequence: mkdir / touch a file / list dir and file
– Directory depth: 4 (/dir1/dir2/dir3/dir4)
• Test result: finished 255102 operations (mkdir, create file, list dir, list file) in 111397.302446 seconds, about 0.44 seconds per operation on average
• Another test (more than 10 million operations) failed due to a memory crash.
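The average latency follows directly from the two figures reported for the successful run. The short calculation below also derives the throughput and the total run time in hours; it uses only the numbers given above.

```python
operations = 255102
elapsed_seconds = 111397.302446

seconds_per_operation = elapsed_seconds / operations
operations_per_second = operations / elapsed_seconds
elapsed_hours = elapsed_seconds / 3600

print(round(seconds_per_operation, 3))
print(round(operations_per_second, 2))
print(round(elapsed_hours, 1))
```

These figures are for a single client issuing operations one after another, so they reflect per-request latency rather than the peak throughput of the cluster.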