Hypertable: An Open Source, High Performance, Scalable Database
TRANSCRIPT
![Page 1: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/1.jpg)
Hypertable
Doug Judd
Zvents, Inc.
![Page 2: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/2.jpg)
hypertable.org
Background
![Page 3: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/3.jpg)
Web 2.0 = Data Explosion
[Figure: Web 1.0 vs. Web 2.0]
![Page 4: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/4.jpg)
Traditional Tools Don’t Scale Well
- Designed for a single machine
- Typical scaling solutions
  - ad-hoc
  - manual/static resource allocation
![Page 5: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/5.jpg)
The Google Stack
- Google File System (GFS)
- MapReduce
- Bigtable
![Page 6: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/6.jpg)
Architectural Overview
![Page 7: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/7.jpg)
What is Hypertable?
- An open source, high performance, scalable database, modelled after Google's Bigtable
- Not relational
- Does not support transactions
![Page 8: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/8.jpg)
Hypertable Improvements Over Traditional RDBMS
- Scalable
- High random insert, update, and delete rate
![Page 9: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/9.jpg)
Data Model
- Sparse, two-dimensional table with cell versions
- Cells are identified by a 4-part key
  - Row
  - Column Family
  - Column Qualifier
  - Timestamp
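
The 4-part key can be pictured as an ordinary sorted map. The sketch below is illustrative only: `CellKey` and `SparseTable` are hypothetical names, not Hypertable types, and sorting newer versions ahead of older ones is an assumption made so that the latest version of a cell is found first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <tuple>

// Hypothetical 4-part cell key: row, column family, column qualifier, timestamp.
struct CellKey {
    std::string row;
    std::string column_family;
    std::string column_qualifier;
    std::int64_t timestamp;  // cell version
};

// Order by row, family, qualifier, then newest timestamp first (assumed).
inline bool operator<(const CellKey &a, const CellKey &b) {
    return std::tie(a.row, a.column_family, a.column_qualifier, b.timestamp) <
           std::tie(b.row, b.column_family, b.column_qualifier, a.timestamp);
}

// Sparse table: only cells that were actually written occupy space.
class SparseTable {
public:
    void set(const CellKey &key, const std::string &value) { cells_[key] = value; }

    // Fetch the newest version of one cell; returns false if the cell is absent.
    bool latest(const std::string &row, const std::string &family,
                const std::string &qualifier, std::string &value_out) const {
        CellKey probe{row, family, qualifier,
                      std::numeric_limits<std::int64_t>::max()};
        auto it = cells_.lower_bound(probe);
        if (it == cells_.end() || it->first.row != row ||
            it->first.column_family != family ||
            it->first.column_qualifier != qualifier) {
            return false;
        }
        value_out = it->second;
        return true;
    }

    std::size_t size() const { return cells_.size(); }

private:
    std::map<CellKey, std::string> cells_;
};
```

Writing the same row, family, and qualifier twice with different timestamps keeps both versions; `latest` returns the one with the larger timestamp.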
![Page 10: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/10.jpg)
Table: Visual Representation
![Page 11: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/11.jpg)
Table: Actual Representation
![Page 12: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/12.jpg)
Anatomy of a Key
- Row key is \0 terminated
- Column Family is represented with 1 byte
- Column qualifier is \0 terminated
- Timestamp is stored big-endian ones' complement
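
A minimal sketch of this layout, assuming an 8-byte timestamp and the field order listed above (the slide states neither the timestamp width nor the exact byte layout). `encode_key` and `compare_keys` are hypothetical helpers, not Hypertable functions. Storing the inverted timestamp big-endian means a plain byte-wise comparison sorts newer versions of a cell before older ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Serialize a key as: row \0 | family id (1 byte) | qualifier \0 | ~timestamp (big-endian).
std::string encode_key(const std::string &row, std::uint8_t family_id,
                       const std::string &qualifier, std::uint64_t timestamp) {
    std::string out;
    out += row;
    out.push_back('\0');
    out.push_back(static_cast<char>(family_id));
    out += qualifier;
    out.push_back('\0');
    std::uint64_t inverted = ~timestamp;  // ones' complement
    for (int shift = 56; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((inverted >> shift) & 0xFF));
    }
    return out;
}

// Byte-wise comparison of two serialized keys (negative, zero, or positive).
int compare_keys(const std::string &a, const std::string &b) {
    std::size_t n = std::min(a.size(), b.size());
    int c = std::memcmp(a.data(), b.data(), n);
    if (c != 0) return c;
    if (a.size() < b.size()) return -1;
    if (a.size() > b.size()) return 1;
    return 0;
}
```

With this encoding, a key for timestamp 200 compares lower than the same cell's key for timestamp 100, so a forward scan meets the newest version first.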
![Page 13: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/13.jpg)
Concurrency
- Bigtable uses copy-on-write
- Hypertable uses a form of MVCC (multi-version concurrency control)
- Deletes are carried out by inserting “delete” records
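
The idea of deleting by inserting a record can be sketched as follows. This is a generic multi-version store, not Hypertable's implementation; `VersionedStore`, `Record`, and the `as_of` read timestamp are hypothetical names. A delete is just another version, so a reader at an earlier timestamp still sees the old value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// One version of a cell: either a value or a "delete" record.
struct Record {
    bool is_delete;
    std::string value;
};

class VersionedStore {
public:
    void insert(const std::string &key, std::int64_t ts, const std::string &value) {
        cells_[key][ts] = Record{false, value};
    }

    // Nothing is erased in place; a delete record is added as a new version.
    void remove(const std::string &key, std::int64_t ts) {
        cells_[key][ts] = Record{true, ""};
    }

    // Read the cell as of a timestamp: the newest version with ts <= as_of wins.
    bool read(const std::string &key, std::int64_t as_of, std::string &value_out) const {
        auto cell = cells_.find(key);
        if (cell == cells_.end()) return false;
        // Versions are sorted newest first, so lower_bound finds the newest ts <= as_of.
        auto it = cell->second.lower_bound(as_of);
        if (it == cell->second.end() || it->second.is_delete) return false;
        value_out = it->second.value;
        return true;
    }

    std::size_t record_count(const std::string &key) const {
        auto cell = cells_.find(key);
        return cell == cells_.end() ? 0 : cell->second.size();
    }

private:
    typedef std::map<std::int64_t, Record, std::greater<std::int64_t>> Versions;
    std::map<std::string, Versions> cells_;
};
```

In this sketch the delete records and the versions they hide stay in the map indefinitely; reclaiming them is outside its scope.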
![Page 14: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/14.jpg)
CellStore
- Sequence of 65K blocks of compressed key/value pairs
![Page 15: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/15.jpg)
System Overview
![Page 16: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/16.jpg)
Range Server
- Manages ranges of table data
- Caches updates in memory (CellCache)
- Periodically spills (compacts) cached updates to disk (CellStore)
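
The cache-then-spill cycle might look roughly like this. It is a sketch under stated assumptions: `RangeSketch` is a hypothetical class, the spill trigger is a simple byte threshold, and each "CellStore" is an in-memory sorted map standing in for a file on the distributed filesystem.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

class RangeSketch {
public:
    explicit RangeSketch(std::size_t spill_threshold_bytes)
        : spill_threshold_bytes_(spill_threshold_bytes), cache_bytes_(0) {}

    // Updates land in the in-memory cell cache; a full cache is spilled.
    void update(const std::string &key, const std::string &value) {
        cell_cache_[key] = value;
        cache_bytes_ += key.size() + value.size();  // rough size accounting
        if (cache_bytes_ >= spill_threshold_bytes_) spill();
    }

    // Reads consult the cache first, then the stores from newest to oldest.
    bool lookup(const std::string &key, std::string &value_out) const {
        auto hit = cell_cache_.find(key);
        if (hit != cell_cache_.end()) {
            value_out = hit->second;
            return true;
        }
        for (auto store = cell_stores_.rbegin(); store != cell_stores_.rend(); ++store) {
            auto found = store->find(key);
            if (found != store->end()) {
                value_out = found->second;
                return true;
            }
        }
        return false;
    }

    std::size_t store_count() const { return cell_stores_.size(); }
    std::size_t cached_cells() const { return cell_cache_.size(); }

private:
    typedef std::map<std::string, std::string> SortedCells;

    // Freeze the current cache as an immutable sorted store and start a new cache.
    void spill() {
        cell_stores_.push_back(std::move(cell_cache_));
        cell_cache_.clear();
        cache_bytes_ = 0;
    }

    std::size_t spill_threshold_bytes_;
    std::size_t cache_bytes_;
    SortedCells cell_cache_;
    std::vector<SortedCells> cell_stores_;
};
```

Because newer data always sits in the cache or in a later store, a lookup that stops at the first hit returns the most recent value.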
![Page 17: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/17.jpg)
Client API
class Client {
public:
  void create_table(const String &name, const String &schema);
  Table *open_table(const String &name);
  String get_schema(const String &name);
  void get_tables(vector<String> &tables);
  void drop_table(const String &name, bool if_exists);
};
![Page 18: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/18.jpg)
Client API (cont.)
class Table {
public:
  TableMutator *create_mutator();
  TableScanner *create_scanner(ScanSpec &scan_spec);
};
class TableMutator {
public:
  void set(KeySpec &key, const void *value, int value_len);
  void set_delete(KeySpec &key);
  void flush();
};
class TableScanner {
public:
  bool next(CellT &cell);
};
![Page 19: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/19.jpg)
Language Bindings
- Currently C++ only
- Thrift Broker
![Page 20: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/20.jpg)
Write Ahead Commit Log
- Persists all modifications (inserts and deletes)
- Written into underlying DFS
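
The write-ahead discipline can be sketched as: record the modification in the log first, then apply it in memory, and rebuild the in-memory state by replaying the log after a restart. `LogEntry` and `LoggedStore` are hypothetical names, and a `std::vector` stands in for the log file that would live in the underlying DFS.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One logged modification: an insert or a delete.
struct LogEntry {
    bool is_delete;
    std::string key;
    std::string value;
};

class LoggedStore {
public:
    // Recovery: replay every entry already in the log to rebuild the state.
    explicit LoggedStore(std::vector<LogEntry> &log) : log_(log) {
        for (std::size_t i = 0; i < log_.size(); ++i) apply(log_[i]);
    }

    void insert(const std::string &key, const std::string &value) {
        LogEntry entry = {false, key, value};
        log_.push_back(entry);  // persist first
        apply(entry);           // then update memory
    }

    void remove(const std::string &key) {
        LogEntry entry = {true, key, ""};
        log_.push_back(entry);
        apply(entry);
    }

    bool get(const std::string &key, std::string &value_out) const {
        auto it = cells_.find(key);
        if (it == cells_.end()) return false;
        value_out = it->second;
        return true;
    }

private:
    void apply(const LogEntry &entry) {
        if (entry.is_delete) {
            cells_.erase(entry.key);
        } else {
            cells_[entry.key] = entry.value;
        }
    }

    std::vector<LogEntry> &log_;  // stand-in for the durable log (assumption)
    std::map<std::string, std::string> cells_;
};
```

A second `LoggedStore` constructed over the same log ends up with the same contents as the first, which is the property that lets in-memory updates survive a server failure.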
![Page 21: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/21.jpg)
Range Meta-Operation Log
- Facilitates Range meta operations
  - Loads
  - Splits
  - Moves
- Part of Master and RangeServer
- Ensures Range state and location consistency
![Page 22: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/22.jpg)
Compression
- Cell Stores store compressed blocks of key/value pairs
- Commit Log stores compressed blocks of updates
- Supported Compression Schemes
  - zlib (--best and --fast)
  - lzo
  - quicklz
  - bmz
  - none
![Page 23: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/23.jpg)
Caching
- Block Cache
  - Caches CellStore blocks
  - Blocks are cached uncompressed
- Query Cache
  - Caches query results
  - TBD
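
A block cache of this kind can be sketched as a small least-recently-used map from a block identifier to the uncompressed block bytes. The LRU eviction policy, the string block identifier, and the `BlockCache` class itself are assumptions for illustration; the slide does not say how Hypertable evicts blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class BlockCache {
public:
    explicit BlockCache(std::size_t capacity) : capacity_(capacity) {}

    // Look up a block and mark it as most recently used.
    bool get(const std::string &block_id, std::string &block_out) {
        auto it = index_.find(block_id);
        if (it == index_.end()) return false;
        lru_.splice(lru_.begin(), lru_, it->second);
        block_out = it->second->second;
        return true;
    }

    // Insert or refresh a block, evicting the least recently used one if full.
    void put(const std::string &block_id, const std::string &uncompressed_block) {
        auto it = index_.find(block_id);
        if (it != index_.end()) {
            it->second->second = uncompressed_block;
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (lru_.size() >= capacity_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        lru_.emplace_front(block_id, uncompressed_block);
        index_[block_id] = lru_.begin();
    }

private:
    typedef std::list<std::pair<std::string, std::string>> Entries;

    std::size_t capacity_;  // maximum number of cached blocks
    Entries lru_;           // front = most recently used
    std::unordered_map<std::string, Entries::iterator> index_;
};
```

Caching the uncompressed form trades memory for CPU: a cache hit skips both the disk read and the decompression step.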
![Page 24: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/24.jpg)
Bloom Filter
- Negative Cache
- Probabilistic data structure
- Indicates if key is not present
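
A minimal Bloom filter sketch shows why it works as a negative cache: a "no" answer is always correct, while a "yes" only means the key may be present. The class name, the bit-array size, and the double-hashing scheme built on `std::hash` are illustrative assumptions, not Hypertable's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

class BloomFilter {
public:
    // num_bits must be greater than zero.
    BloomFilter(std::size_t num_bits, std::size_t num_hashes)
        : bits_(num_bits, false), num_hashes_(num_hashes) {}

    void insert(const std::string &key) {
        for (std::size_t i = 0; i < num_hashes_; ++i) bits_[index(key, i)] = true;
    }

    // false: the key was definitely never inserted.
    // true:  the key may have been inserted (false positives are possible).
    bool may_contain(const std::string &key) const {
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            if (!bits_[index(key, i)]) return false;
        }
        return true;
    }

private:
    // Derive the i-th bit position from two base hashes (double hashing).
    std::size_t index(const std::string &key, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>()(key);
        std::size_t h2 = std::hash<std::string>()(key + '#') | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t num_hashes_;
};
```

A range server could consult such a filter before touching a CellStore on disk and skip the read entirely when `may_contain` returns false.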
![Page 25: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/25.jpg)
Scaling (part I)
![Page 26: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/26.jpg)
Scaling (part II)
![Page 27: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/27.jpg)
Scaling (part III)
![Page 28: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/28.jpg)
Access Groups
- Provides control of physical data layout -- hybrid row/column oriented
- Improves performance by minimizing I/O
CREATE TABLE crawldb {
  Title MAX_VERSIONS=3,
  Content MAX_VERSIONS=3,
  PageRank MAX_VERSIONS=10,
  ClickRank MAX_VERSIONS=10,
  ACCESS GROUP default (Title, Content),
  ACCESS GROUP ranking (PageRank, ClickRank)
};
![Page 29: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/29.jpg)
Filesystem Broker Architecture
- Hypertable can run on top of any distributed filesystem (e.g. Hadoop, KFS, etc.)
![Page 30: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/30.jpg)
Keys To Performance
- C++
- Asynchronous communication
![Page 31: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/31.jpg)
C++ vs. Java
- Hypertable is CPU intensive
  - Manages large in-memory key/value map
  - Alternate compression codecs (e.g. BMZ)
- Hypertable is memory intensive
  - Java uses 2-3 times the amount of memory to manage large in-memory map (e.g. TreeMap)
  - Poor processor cache performance
![Page 32: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/32.jpg)
Performance Test (AOL Query Logs)
- 75,274,825 inserted cells
- 8 node cluster
  - 1 x 1.8 GHz Dual-core Opteron
  - 4 GB RAM
  - 3 x 7200 RPM SATA drives
- Average row key: 7 bytes
- Average value: 15 bytes
- Replication factor: 3
- 4 simultaneous insert clients
- 500K random inserts/s
- 680K scanned cells/s
![Page 33: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/33.jpg)
Performance Test II
- Simulated AOL query log data
- 1TB data
- 9 node cluster
  - 1 x 2.33 GHz quad-core Intel
  - 16 GB RAM
  - 3 x 7200 RPM SATA drives
- Average row key: 9 bytes
- Average value: 18 bytes
- Replication factor: 3
- 4 simultaneous insert clients
- Over 1M random inserts/s (sustained)
![Page 34: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/34.jpg)
Weaknesses
- Range data managed by a single range server
  - Though no data loss, can cause periods of unavailability
  - Can be mitigated with client-side cache or memcached
![Page 35: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/35.jpg)
Project Status
- Currently in “alpha”
  - Just released version 0.9.0.7
- Will release “beta” version end of August
  - Waiting on Hadoop JIRA 1700
![Page 36: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/36.jpg)
License
- GPL 2.0
- Why not Apache?
![Page 37: Hypertable An Open Source, High Performance, Scalable Database](https://reader036.vdocument.in/reader036/viewer/2022062404/552957004a795972158b46c6/html5/thumbnails/37.jpg)
Questions?
www.hypertable.org