cs 245notes 131 cs 245: database system principles notes 13: bigtable, hbase, cassandra hector...

24
CS 245 Notes 13 1 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Upload: irma-jennings

Post on 18-Jan-2018

220 views

Category:

Documents


0 download

DESCRIPTION

Lots of Buzz Words! “Apache Cassandra is an open-source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tunably consistent, column-oriented database that bases its distribution design on Amazon’s dynamo and its data model on Google’s Big Table.” Clearly, it is buzz-word compliant!! CS 245Notes 133

TRANSCRIPT

Page 1: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

CS 245 Notes 13 1

CS 245: Database System Principles

Notes 13:BigTable, HBASE, Cassandra

Hector Garcia-Molina

Page 2: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Sources• HBASE: The Definitive Guide, Lars George,

O’Reilly Publishers, 2011.• Cassandra: The Definitive Guide, Eben

Hewitt, O’Reilly Publishers, 2011.• BigTable: A Distributed Storage System for

Structured Data, F. Chang et al, ACM Transactions on Computer Systems, Vol. 26, No. 2, June 2008.

CS 245 Notes 13 2

Page 3: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Lots of Buzz Words!• “Apache Cassandra is an open-source,

distributed, decentralized, elastically scalable, highly available, fault-tolerant, tunably consistent, column-oriented database that bases its distribution design on Amazon’s dynamo and its data model on Google’s Big Table.”

• Clearly, it is buzz-word compliant!!

CS 245 Notes 13 3

Page 4: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Basic Idea: Key-Value Store

CS 245 Notes 13 4

key valuek1 v1k2 v2k3 v3k4 v4

Table T:

Page 5: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Basic Idea: Key-Value Store

CS 245 Notes 13 5

key valuek1 v1k2 v2k3 v3k4 v4

Table T:

keys are sorted

• API:– lookup(key) value– lookup(key range) values– getNext value– insert(key, value)– delete(key)

• Each row has timestemp• Single row actions atomic

(but not persistent in some systems?)• No multi-key transactions• No query language!

Page 6: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Fragmentation (Sharding)

CS 245 Notes 13 6

key valuek1 v1k2 v2k3 v3k4 v4k5 v5k6 v6k7 v7k8 v8k9 v9k10 v10

key valuek1 v1k2 v2k3 v3k4 v4

key valuek5 v5k6 v6

key valuek7 v7k8 v8k9 v9k10 v10

server 1 server 2 server 3

• use a partition vector• “auto-sharding”: vector selected automatically

tablet

Page 7: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Tablet Replication

CS 245 Notes 13 7

key valuek7 v7k8 v8k9 v9k10 v10

server 3 server 4 server 5

key valuek7 v7k8 v8k9 v9k10 v10

key valuek7 v7k8 v8k9 v9k10 v10

primary backup backup

• Cassandra:Replication Factor (# copies)R/W Rule: One, Quorum, AllPolicy (e.g., Rack Unaware, Rack Aware, ...)Read all copies (return fastest reply, do repairs if necessary)

• HBase: Does not manage replication, relies on HDFS

Page 8: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Need a “directory”• Table Name: Key Server that stores key

Backup servers• Can be implemented as a special table.

CS 245 Notes 13 8

Page 9: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Tablet Internals

CS 245 Notes 13 9

key valuek3 v3k8 v8k9 deletek15 v15

key valuek2 v2k6 v6k9 v9k12 v12

key valuek4 v4k5 deletek10 v10k20 v20k22 v22

memory

disk

Design Philosophy (?): Primary scenario is where all data is in memory.Disk storage added as an afterthought

Page 10: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Tablet Internals

CS 245 Notes 13 10

key valuek3 v3k8 v8k9 deletek15 v15

key valuek2 v2k6 v6k9 v9k12 v12

key valuek4 v4k5 deletek10 v10k20 v20k22 v22

memory

disk

flush periodically

• tablet is merge of all segments (files)• disk segments imutable• writes efficient; reads only efficient when all data in memory• periodically reorganize into single segment

tombstone

Page 11: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Column Family

CS 245 Notes 13 11

K A B C D Ek1 a1 b1 c1 d1 e1k2 a2 null c2 d2 e2k3 null null null d3 e3k4 a4 b4 c4 e4 e4k5 a5 b5 null null null

Page 12: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Column Family

CS 245 Notes 13 12

K A B C D Ek1 a1 b1 c1 d1 e1k2 a2 null c2 d2 e2k3 null null null d3 e3k4 a4 b4 c4 e4 e4k5 a5 b5 null null null

• for storage, treat each row as a single “super value”• API provides access to sub-values

(use family:qualifier to refer to sub-values e.g., price:euros, price:dollars )

• Cassandra allows “super-column”: two level nesting of columns (e.g., Column A can have sub-columns X & Y )

Page 13: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Vertical Partitions

CS 245 Notes 13 13

K A B C D Ek1 a1 b1 c1 d1 e1k2 a2 null c2 d2 e2k3 null null null d3 e3k4 a4 b4 c4 e4 e4k5 a5 b5 null null null

K Ak1 a1k2 a2k4 a4k5 a5

K Bk1 b1k4 b4k5 b5

K Ck1 c1k2 c2k4 c4

K D Ek1 d1 e1k2 d2 e2k3 d3 e3k4 e4 e4

can be manually implemented as

Page 14: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Vertical Partitions

CS 245 Notes 13 14

K A B C D Ek1 a1 b1 c1 d1 e1k2 a2 null c2 d2 e2k3 null null null d3 e3k4 a4 b4 c4 e4 e4k5 a5 b5 null null null

K Ak1 a1k2 a2k4 a4k5 a5

K Bk1 b1k4 b4k5 b5

K Ck1 c1k2 c2k4 c4

K D Ek1 d1 e1k2 d2 e2k3 d3 e3k4 e4 e4

column family

• good for sparse data;• good for column scans• not so good for tuple reads• are atomic updates to row still supported?• API supports actions on full table; mapped to actions on column tables• API supports column “project”• To decide on vertical partition, need to know access patterns

Page 15: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Failure Recovery (BigTable, HBase)

CS 245 Notes 13 15

tablet servermemory

logGFS or HFS

master node sparetablet server

write ahead logging

ping

Page 16: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Failure recovery (Cassandra)• No master node, all nodes in “cluster” equal

CS 245 Notes 13 16

server 1 server 2 server 3

Page 17: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Failure recovery (Cassandra)• No master node, all nodes in “cluster” equal

CS 245 Notes 13 17

server 1 server 2 server 3

access any table in clusterat any server

that server sends requeststo other servers

Page 18: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Bonus Slides*:Are Traditional Databases Dead?• Heard on Twitter:

– noSQL rules– new DB systems scale better than old ones– DBMS too slow ...

• Therefore, need new, revolutionary technology!!

* WARNING: Author may be biased :-)

Page 19: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Cautionary Tale• Lawrence Richard Walters,

nicknamed "Lawnchair Larry" or the "Lawn Chair Pilot", (April 19, 1949 – October 6, 1993) was an American truck driver who took flight on July 2, 1982 in a homemade aircraft. Dubbed Inspiration I, the "flying machine" consisted of an ordinary patio chair with 45 helium-filled weather balloons attached to it. Walters rose to an altitude of over 15,000 feet (4,600 m) and floated from his point of origin in San Pedro, California into controlled airspace near Los Angeles International Airport.

Page 20: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Parallels• Lawnchair Larry

– Wanna fly– Can’t afford

airplane

• T-Gen– Wanna DB services– Can’t afford real

DBMS

Page 21: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Parallels• Lawnchair Larry

– Wanna fly– Can’t afford

airplane– I can do myself!– I am off!!

• T-Gen– Wanna DB services– Can’t afford real

DBMS– I can do myself!– I am off!!

Page 22: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Parallels• Lawnchair Larry

– Wanna fly– Can’t afford

airplane– I can do myself!– I am off!!– How do I land???

• T-Gen– Wanna DB services– Can’t afford real

DBMS– I can do myself!– I am off!!– I need joins???

Page 23: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Parallels• Lawnchair Larry

– Wanna fly– Can’t afford

airplane– I can do myself!– I am off!!– How do I land???– How talk ATC?– How to navigate?– Need oxygen!!??

• T-Gen– Wanna DB services– Can’t afford real

DBMS– I can do myself!– I am off!!– I need joins???– How to index?– Just had crash! Now

what?– Data inconsistent!– Oh? Need to

maintain???

Page 24: CS 245Notes 131 CS 245: Database System Principles Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina

Keep Lawnchair Larry in Mind• Does DBMS technology

not cut it and we need tostart from scratch??– Or are you just being

cheap? • If you think you need a

subset of DBMS, will needs change over time?