TRANSCRIPT
HBASE
Agenda
• Introduction
• Hbase vs RDBMS
• Hbase vs HDFS
• Hbase Architecture
• Hbase with Hive
• Hbase with Java
• Hbase with Mapreduce
Introduction to HBase
HBase is a NoSQL, non-relational, distributed, column-oriented database built on top of Hadoop.
NoSQL - NoSQL databases are databases that do not use an SQL engine as their query engine.
HBase Daemons
Daemons are services that run on individual machines and communicate with each other.
HMaster - the master server of HBase; holds all metadata.
HRegionServer - the slave server of HBase; holds the actual data.
HQuorumPeer - the ZooKeeper daemon that provides the coordination service.
Advantages of using HBase
Provides a highly scalable database with native integration with Hadoop. Nodes can be added on the fly.
HBase vs RDBMS
Relational Database
• Is based on a fixed schema
• Is a row-oriented datastore
• Is designed to store normalized data
• Contains thin tables
• Has no built-in support for partitioning
HBase
• Is schema-less
• Is a column-oriented datastore
• Is designed to store denormalized data
• Contains wide and sparsely populated tables
• Supports automatic partitioning
HBase vs HDFS
HDFS
• Is suited for high-latency batch-processing operations
• Data is primarily accessed through MapReduce
• Is designed for batch processing and hence has no concept of random reads/writes
HBase
• Is built for low-latency operations
• Provides access to single rows from billions of records
• Data is accessed through shell commands, client APIs in Java, REST, Avro or Thrift
RDBMS (B+ Tree)
• An RDBMS uses a B+ tree to organize its indexes, as shown in the figure.
• These B+ trees are often 3-level n-way balanced trees, and their nodes are blocks on disk. An update therefore typically costs 5 disk operations: 3 to walk the B+ tree to the block of the target row, 1 to read the target block, and 1 to write the updated data.
• In an RDBMS, data is written randomly as a heap file on disk, and random data blocks hurt read performance; that is why the B+ tree index is needed. A B+ tree fits data reads well, but it is not efficient for data updates. For large distributed data, B+ trees are so far no match for the LSM-trees used in HBase.
HBase (LSM Tree)
LSM-trees can be viewed as n-level merge trees. They transform random writes into sequential writes using a log file and an in-memory store.
Data write (insert, update): data is first written sequentially to the log file, then to the in-memory store, where it is organized as a sorted tree, much like a B+ tree. When the in-memory store fills up, the tree in memory is flushed to a store file on disk. The store files on disk are arranged like B+ trees, but are optimized for sequential disk access.
Data read: the in-memory store is searched first, then the store files on disk.
Data delete: the record is given a "delete marker". A background housekeeping process merges store files into larger ones to reduce disk seeks, and a marked record is deleted permanently during this housekeeping.
LSM-tree data updates happen in memory, with no disk access, so they are faster than with a B+ tree. When reads mostly target recently written data, LSM-trees reduce disk seeks and improve performance. When disk I/O is the cost that matters, LSM-trees are more suitable than B+ trees.
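The write/read/delete behaviour above can be sketched as a toy LSM store. This is an illustrative simplification, not HBase's actual implementation: the "memstore" is a plain dict, "store files" are sorted lists, and the log file is omitted.

```python
# Toy LSM-tree style store: writes go to an in-memory store that is
# flushed to immutable "store files" when full; reads check memory
# first, then the files newest-first; deletes are tombstone writes.
TOMBSTONE = "__TOMBSTONE__"

class ToyLSMStore:
    def __init__(self, memstore_limit=3):
        self.memstore = {}        # in-memory store (sorted tree in real LSM)
        self.store_files = []     # immutable, sorted files on "disk"
        self.memstore_limit = memstore_limit

    def put(self, key, value):
        # real LSM: append to the log file first, then update the memstore
        self.memstore[key] = value
        if len(self.memstore) >= self.memstore_limit:
            self.flush()

    def flush(self):
        # flush the in-memory tree as one sorted, immutable store file
        self.store_files.append(sorted(self.memstore.items()))
        self.memstore = {}

    def delete(self, key):
        # a delete is just a write of a "delete marker"
        self.put(key, TOMBSTONE)

    def get(self, key):
        if key in self.memstore:
            val = self.memstore[key]
            return None if val == TOMBSTONE else val
        for sf in reversed(self.store_files):   # newest file first
            for k, v in sf:
                if k == key:
                    return None if v == TOMBSTONE else v
        return None
```

Note how every mutation is a sequential append from the disk's point of view; the cost of ordering the data is paid later, during the background merge (compaction), rather than at write time as in a B+ tree.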
Normalization vs Denormalization
RDBMS Data Model
HBase Data Model
Tables – The HBase Tables are more like logical collection of rows stored in separate partitions called Regions.
Rows – A row is one instance of data in a table and is identified by a rowkey. Rowkeys are unique in a Table and are always treated as a byte[].
Column Families – Data in a row is grouped together into Column Families. Each Column Family has one or more Columns, and the Columns in a family are stored together in a low-level storage file known as an HFile. The table above shows the Customer and Sales Column Families. The Customer Column Family is made up of 2 columns – Name and City – whereas the Sales Column Family is made up of 2 columns – Product and Amount.
Columns – A Column Family is made of one or more columns. A Column is identified by a Column Qualifier that consists of the Column Family name concatenated with the Column name using a colon – example: columnfamily:columnname. There can be multiple Columns within a Column Family and Rows within a table can have varied number of Columns.
Cell – A Cell stores data and is essentially a unique combination of rowkey, Column Family and the Column (Column Qualifier). The data stored in a Cell is called its value and the data type is always treated as byte[].
Version – The data stored in a cell is versioned and versions of data are identified by the timestamp. The number of versions of data retained in a column family is configurable and this value by default is 3.
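The logical data model above can be sketched in a few lines of Python. This is a hypothetical in-memory model, not an HBase API: a value lives at (rowkey, family:qualifier, timestamp), and each column keeps at most N versions, newest first (3 by default, as the slide says).

```python
import time

# Toy model of the HBase logical data model: versioned cells addressed
# by (rowkey, "family:qualifier", timestamp).
class ToyTable:
    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # {rowkey: {"family:qualifier": [(timestamp, value), ...]}}
        self.rows = {}

    def put(self, rowkey, column, value, ts=None):
        ts = ts if ts is not None else time.time()
        versions = self.rows.setdefault(rowkey, {}).setdefault(column, [])
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first
        del versions[self.max_versions:]                   # prune old versions

    def get(self, rowkey, column):
        # like a plain get: return only the newest version
        versions = self.rows.get(rowkey, {}).get(column)
        return versions[0][1] if versions else None
```

A fourth put to the same cell silently pushes the oldest version out, mirroring the configurable retention described above.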
HBase Physical Architecture
HMaster is the master in this master/slave design. It is responsible for RegionServer monitoring, region assignment, metadata operations, RegionServer failover, and so on. In a distributed cluster, the HMaster runs on the HDFS NameNode.
The RegionServer is the slave; it is responsible for serving and managing regions. In a distributed cluster, it runs on an HDFS DataNode.
ZooKeeper tracks the status of each Region Server and knows where the root table is hosted. Since HBase 0.90.x, the integration with ZooKeeper has become even tighter: the heartbeat report from Region Servers to the HMaster has moved to ZooKeeper, so ZooKeeper is now responsible for tracking Region Server status. Moreover, ZooKeeper is the client's entry point: clients can query ZooKeeper for the location of the region hosting the -ROOT- table.
HBase Logical Architecture
Region Server Architecture
A Region Server contains the following components:
1. One BlockCache, an LRU priority cache used for data reads.
2. One WAL (Write-Ahead Log): HBase uses a Log-Structured Merge-Tree (LSM tree) to process data writes. Each update or delete is written to the WAL first, and then to the MemStore. The WAL is persisted on HDFS.
3. Multiple HRegions: each HRegion is a partition of a table, as described above.
4. In an HRegion: multiple HStores, each corresponding to a Column Family.
5. In an HStore: one MemStore, which buffers updates and deletes before they are flushed to disk, and multiple StoreFiles, each corresponding to an HFile.
6. An HFile is immutable; it is flushed from the MemStore and persisted on HDFS.
-ROOT- and .META. Tables
There are two special catalog tables for this: -ROOT- and .META.
1. The .META. table hosts the region location info for specific row key ranges. It is stored on Region Servers and can be split into as many regions as required.
2. The -ROOT- table hosts the .META. table info. Only one Region Server stores the -ROOT- table, and the root region never splits into more than one region.
In the example, RegionServer RS1 hosts the -ROOT- table, and the .META. table is split into three regions – M1, M2, M3 – hosted on RS2, RS3 and RS1. Table T1 contains three regions and T2 contains four. For example, T1R1 is hosted on RS3, and its meta info is hosted on M1.
Region Lookup
1. The client queries ZooKeeper: where is -ROOT-? On RS1.
2. The client asks RS1: which meta region contains row T10006? META1 on RS2.
3. The client asks RS2: which region holds row T10006? A region on RS3.
4. The client gets the row from the region on RS3.
5. The client caches the region info, and refreshes it only when the region location changes.
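The five steps above can be sketched as a toy lookup. The catalog contents (RS1/RS2/RS3, META1, row T10006) come from the slide's example; the data structures and function names are invented for illustration and bear no relation to the real HBase client code.

```python
# Toy region lookup: ZooKeeper -> -ROOT- -> .META. -> region server,
# with the final answer cached on the client.
ZOOKEEPER = {"-ROOT-": "RS1"}                    # step 1: who hosts -ROOT-?
ROOT_TABLE = {"RS1": "META1@RS2"}                # step 2: which .META. region?
META_TABLE = {"META1@RS2": {"T10006": "RS3"}}    # step 3: which region server?

class ToyClient:
    def __init__(self):
        self.region_cache = {}   # row -> region server (step 5)

    def locate(self, row):
        if row in self.region_cache:              # cached: skip steps 1-3
            return self.region_cache[row]
        root_server = ZOOKEEPER["-ROOT-"]          # step 1
        meta_region = ROOT_TABLE[root_server]      # step 2
        region_server = META_TABLE[meta_region][row]  # step 3
        self.region_cache[row] = region_server     # step 5: remember it
        return region_server                       # step 4: read from here
```

The cache is what makes the three-hop lookup affordable: in the steady state a client talks straight to the right Region Server.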
HBase Write Path
The client does not write data directly into an HFile on HDFS. It first writes to the WAL (Write-Ahead Log), and then to the MemStore of the target HStore in memory.
The MemStore is a write buffer (64 MB by default). When the data in the MemStore reaches this threshold, it is flushed to a new HFile on HDFS. Each Column Family can have many HFiles, but each HFile belongs to only one Column Family. The WAL exists for data reliability: it is persisted on HDFS, and each Region Server has only one WAL. If a Region Server goes down before the MemStore is flushed, HBase can replay the WAL to restore the data on a new Region Server. A write completes successfully only after the data has been written to both the WAL and the MemStore.
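The durability guarantee of this write path can be sketched as follows. This is a toy, not the real RegionServer: entry counts stand in for the 64 MB threshold, and "crash recovery" is just replaying the surviving WAL list.

```python
# Toy write path: WAL first, then MemStore; a crash before flush loses
# the MemStore, but replaying the WAL rebuilds it on a new server.
class ToyRegionServer:
    def __init__(self, flush_threshold=64):
        self.wal = []           # write-ahead log (persisted on HDFS in real HBase)
        self.memstore = {}      # in-memory write buffer
        self.hfiles = []        # flushed, immutable files
        self.flush_threshold = flush_threshold

    def write(self, rowkey, value):
        self.wal.append((rowkey, value))    # 1) WAL first
        self.memstore[rowkey] = value       # 2) then MemStore; now the write is "done"
        if len(self.memstore) >= self.flush_threshold:
            self.hfiles.append(dict(self.memstore))
            self.memstore = {}              # a flush empties the MemStore

def recover(wal):
    # a replacement server replays the WAL to rebuild the lost MemStore
    fresh = ToyRegionServer()
    for rowkey, value in wal:
        fresh.memstore[rowkey] = value
    return fresh
```
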
HBase Read Path
1. The client first checks the MemStore in memory for the target row.
2. If the MemStore misses, the client hits the BlockCache.
3. If both the MemStore and the BlockCache miss, HBase loads into memory the HFiles that may contain the target row.
4. The MemStore and BlockCache together are the mechanism for real-time access to distributed large data. The BlockCache is an LRU (Least Recently Used) priority cache, and each RegionServer has a single BlockCache. It keeps frequently accessed HFile data in memory to reduce disk reads. A block (64 KB by default) is the smallest index unit of data, i.e. the smallest unit of data that can be read from disk in one pass. For random access a small block size is preferred, though the block index then consumes more memory; for sequential access a large block size is better, since fewer index entries save memory.
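A minimal LRU cache of the kind the BlockCache implements can be sketched with an OrderedDict. This is purely illustrative of the eviction policy, not HBase's BlockCache code; the capacity here counts blocks rather than bytes.

```python
from collections import OrderedDict

# Toy LRU BlockCache: hits move a block to the "most recently used" end;
# when the cache overflows, the least recently used block is evicted.
class ToyBlockCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block_key -> block contents

    def get(self, block_key):
        if block_key not in self.blocks:
            return None                       # miss: caller must read the HFile block from disk
        self.blocks.move_to_end(block_key)    # hit: mark as most recently used
        return self.blocks[block_key]

    def put(self, block_key, block):
        self.blocks[block_key] = block
        self.blocks.move_to_end(block_key)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict least recently used
```
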
Deep Dive into HBase Architecture
HFile
The HFile implements the same features as an SSTable, though it may provide slightly more or fewer.
1. File Format
a. Data Block Size: the size of each data block is 64 KB by default and is configurable per HFile.
b. Maximum Key Length: the key of each key/value pair is currently limited to 64 KB; 10-100 bytes is a typical size. Even in the HBase data model, the key (rowkey + column family:qualifier + timestamp) should not be too long.
c. Compression Algorithm: HFile supports the following three algorithms: (1) NONE, (2) GZ, (3) LZO (Lempel-Ziv-Oberhumer).
An HFile is separated into multiple segments; from beginning to end, they are:
- Data Block segment: stores key/value pairs; may be compressed.
- Meta Block segment (optional): stores user-defined large metadata; may be compressed.
- File Info segment: small metadata about the HFile, uncompressed. Users can add small user-defined metadata (name/value) here.
- Data Block Index segment: indexes the data block offsets in the HFile. The key of each index entry is the key of the first key/value pair in the block.
- Meta Block Index segment (optional): indexes the meta block offsets in the HFile. The key of each index entry is the user-defined unique name of the meta block.
- Trailer: fixed-size metadata holding the offset of each segment. To read an HFile, always read the Trailer first.
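The "read the Trailer first" idea can be shown with a toy file format. This is a drastic simplification with invented field layouts (the real HFile has more segments, compression, and binary index structures); it keeps only the essence: blocks at the front, an index after them, and a fixed-size trailer at the end recording where the index starts.

```python
import struct

# Toy trailer-indexed file: [length-prefixed data blocks][index][8-byte trailer].
def write_toy_hfile(records):
    blocks, index = b"", []
    for key, value in records:
        index.append((key, len(blocks)))              # offset of this block
        entry = f"{key}={value}".encode()
        blocks += struct.pack(">I", len(entry)) + entry
    index_bytes = ";".join(f"{k}:{off}" for k, off in index).encode()
    trailer = struct.pack(">Q", len(blocks))          # fixed size: where the index starts
    return blocks + index_bytes + trailer

def read_toy_hfile(data, key):
    (index_offset,) = struct.unpack(">Q", data[-8:])  # read the trailer first
    index = dict(item.split(":") for item in data[index_offset:-8].decode().split(";"))
    off = int(index[key])                             # seek straight to the block
    (length,) = struct.unpack(">I", data[off:off + 4])
    k, v = data[off + 4:off + 4 + length].decode().split("=")
    return v
```

The reader never scans the data blocks: trailer, then index, then one targeted read, which is exactly why the trailer is read first in a real HFile.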
HFile Compaction
Minor Compaction
Minor compaction operates on multiple HFiles in one HStore. It picks up a couple of adjacent small HFiles and rewrites them into one larger file. The process keeps deleted or expired cells. The HFile selection criteria are configurable, and since minor compaction affects HBase performance, there is an upper limit on the number of HFiles involved (10 by default).
Major Compaction
Major compaction compacts all HFiles in an HStore (Column Family) into one HFile. It is the only opportunity to delete records permanently. On large clusters, major compaction usually has to be triggered manually. Major compaction is not a region merge: it happens within an HStore and does not merge regions.
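The tombstone-handling difference between the two compaction kinds can be sketched over toy store files. Assumed simplifications: each "file" is a dict of rowkey to value, newer files appear later in the list, and a special marker value stands in for the delete tombstone.

```python
# Toy compactions: minor merges a few files and KEEPS tombstones;
# major merges everything and is the only place tombstones are dropped.
TOMBSTONE = "__TOMBSTONE__"

def merge(files):
    merged = {}
    for f in files:            # later (newer) files win on key conflicts
        merged.update(f)
    return merged

def minor_compaction(files, max_files=10):
    # rewrite up to max_files adjacent files into one larger file
    batch, rest = files[:max_files], files[max_files:]
    return [merge(batch)] + rest

def major_compaction(files):
    # rewrite ALL files into one; tombstoned records are deleted for good
    merged = merge(files)
    return [{k: v for k, v in merged.items() if v != TOMBSTONE}]
```
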
HBase Delete
When an HBase client sends a delete request, the record is marked with a "tombstone". This is a "predicate deletion", which is supported by LSM-trees. Since HFiles are immutable, records cannot be removed from an HFile on HDFS; HBase therefore relies on major compaction to clean up deleted or expired records.
Starting the HBase daemons and shell
• Execute the command: start-hbase.sh
This command starts the HBase daemons.
• Execute the command: hbase shell
This starts the command-line interface of HBase.
Creating tables in HBase
To create a table in HBase, do the following:
• Specify the table name and column families.
Note: HBase has a dynamic schema, so while creating a table we mention just the table name and the column families. At least one column family must be mentioned during table creation.
• Execute the command: create 'table_name','column_family1',...,'column_familyN'
Inserting rows
To insert rows in HBase, do the following:
• Specify the table name, row key, and column, together with the value to be inserted.
Note: HBase stores data as keys and values.
• Execute the command: put 'table_name','row_key','columnFamily:column','value'
Scanning tables
To perform a full scan on HBase, do the following:
• Specify scan 'table_name' in the HBase prompt. HBase displays the row key, timestamp and the corresponding values.
• Execute the command: scan 'table_name'
Fetching a single row
To fetch a single row in HBase, do the following:
• Specify get 'table_name','row_key' in the HBase prompt. HBase displays the row key, timestamp and the corresponding values.
• Execute the command: get 'table_name','row_key'
Listing all tables
To list all the tables in HBase, do the following:
• All the tables present in HBase are listed by the command list.
• Execute the command: list
Describe
To see the metadata associated with a table in HBase, do the following:
• The complete metadata of a table can be seen by specifying the table name.
• Execute the command: describe 'table_name'
HBase with Hive
1. Create an HBase table:
create 'hivehbase', 'ratings'
put 'hivehbase', 'row1', 'ratings:userid', 'user1'
put 'hivehbase', 'row1', 'ratings:bookid', 'book1'
put 'hivehbase', 'row1', 'ratings:rating', '1'
put 'hivehbase', 'row2', 'ratings:userid', 'user2'
put 'hivehbase', 'row2', 'ratings:bookid', 'book1'
put 'hivehbase', 'row2', 'ratings:rating', '3'
put 'hivehbase', 'row3', 'ratings:userid', 'user2'
put 'hivehbase', 'row3', 'ratings:bookid', 'book2'
put 'hivehbase', 'row3', 'ratings:rating', '3'
put 'hivehbase', 'row4', 'ratings:userid', 'user2'
put 'hivehbase', 'row4', 'ratings:bookid', 'book4'
put 'hivehbase', 'row4', 'ratings:rating', '1'
2. Create a Hive external table:
CREATE EXTERNAL TABLE hbasehive_table (key string, userid string, bookid string, rating int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,ratings:userid,ratings:bookid,ratings:rating")
TBLPROPERTIES ("hbase.table.name" = "hivehbase");
3. Querying HBase via Hive:
select * from hbasehive_table;
OK
row1 user1 book1 1
row2 user2 book1 3
row3 user2 book2 3
row4 user2 book4 1
HBase Bulk Load Using Pig
DATASET
Custno, firstname, lastname, age, profession
4000001,Kristina,Chung,55,Pilot
4000002,Paige,Chen,74,Teacher
4000003,Sherri,Melton,34,Firefighter
4000004,Gretchen,Hill,66,Computer hardware engineer
4000005,Karen,Puckett,74,Lawyer
4000006,Patrick,Song,42,Veterinarian
4000007,Elsie,Hamilton,43,Pilot
4000008,Hazel,Bender,63,Carpenter
4000009,Malcolm,Wagner,39,Artist
# Create a table 'customers' with column family 'customers_data'
hbase(main):001:0> create 'customers', 'customers_data'
Write the following Pig script to load data into the 'customers' table in HBase:
raw_data = LOAD '/customers' USING PigStorage(',') AS ( custno:chararray, firstname:chararray, lastname:chararray, age:int, profession:chararray );
STORE raw_data INTO 'hbase://customers' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage( 'customers_data:firstname customers_data:lastname customers_data:age customers_data:profession' );
HBase Bulk Load Using ImportTSV
In HBase-speak, bulk loading is the process of preparing and loading HFiles directly into the RegionServers, thus bypassing the write path. It includes 3 steps:
1. Extract the data from a source, typically text files or another database.
2. Transform the data into HFiles.
3. Load the files into HBase by telling the RegionServers where to find them.
STEP 1: First load the data into HDFS.
hadoop fs -mkdir /user/training/data_set
hadoop fs -put data_set /user/training/data
STEP 2: Create the HBase table.
create 'FlappyTwit', {NAME => 'f'}, {SPLITS => ['g', 'm', 'r', 'w']}
STEP 3: Convert the plain files to HFiles.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.bulk.output=/user/training/output -Dimporttsv.columns=HBASE_ROW_KEY,f:username,f:followers,f:count,f:tweet1,f:tweet2,f:tweet3,f:tweet4,f:tweet5 FlappyTwit /user/training/FlappyTwit/FlappyTwit-Small.txt
STEP 4: Load the HFiles into HBase.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /user/training/output FlappyTwit
HBase with Java
DATASET
1,India,Haryana,Chandigarh,2009,April,P1,1,5
2,India,Haryana,Ambala,2009,May,P1,2,10
3,India,Haryana,Panipat,2010,June,P2,3,15
4,United States,California,Fresno,2009,April,P2,2,5
5,United States,California,Long Beach,2010,July,P2,4,10
6,United States,California,San Francisco,2011,August,P1,6,20
USECASE
The following column families have to be created for a table 'sample': region, time, product, sale, profit.
Column family region has three column qualifiers: country, state, city.
Column family time has two column qualifiers: year, month.
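The mapping from one dataset record to HBase cells can be sketched in Python. Hedged assumptions: the table name 'sample' and the qualifier names for the single-column families (product, sale, profit) are not spelled out on the slide, so the names used here are invented for illustration; a real Java client would issue one Put with these family:qualifier cells.

```python
# Which qualifiers belong to which family, per the use case above.
# NOTE: the qualifiers for product/sale/profit are assumed names.
FAMILIES = {
    "region":  ["country", "state", "city"],
    "time":    ["year", "month"],
    "product": ["name"],     # assumed qualifier
    "sale":    ["qty"],      # assumed qualifier
    "profit":  ["amount"],   # assumed qualifier
}

def record_to_puts(line):
    # turn one CSV record into (rowkey, "family:qualifier", value) triples
    fields = line.split(",")
    rowkey, values = fields[0], fields[1:]
    puts, i = [], 0
    for family, qualifiers in FAMILIES.items():
        for q in qualifiers:
            puts.append((rowkey, f"{family}:{q}", values[i]))
            i += 1
    return puts
```
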
HBase with MapReduce
USECASE
HBase has records of web access logs; we record each web page access by a user. To keep things simple, we log only the user_id and the page they visit.
The schema looks like this: userID_timestamp => {details => {page:}}
To make the row key unique, we append a timestamp at the end, making a composite key.
SAMPLE DATA
ROW PAGES
USER1_T1 a.html
USER2_T2 b.html
USER3_T3 c.html
OUTPUT: we want to count how many times we have seen each user.
USER COUNT
USER1 3
USER2 2
USER3 1
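The job's core logic can be sketched without Hadoop at all. This toy version mirrors the mapper/reducer described below: the mapper strips the timestamp from the composite row key and emits (userID, 1); the reducer sums the ones per user.

```python
from collections import defaultdict

def mapper(rowkey):
    # composite key is userID_timestamp; drop the timestamp suffix
    user_id = rowkey.rsplit("_", 1)[0]
    return (user_id, 1)          # emit (userID, ONE)

def reducer(pairs):
    # sum all the ONEs per userID
    counts = defaultdict(int)
    for user_id, one in pairs:
        counts[user_id] += one
    return dict(counts)
```
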
create 'access_logs', 'details'
create 'summary_user', {NAME=>'details', VERSIONS=>1}
MAPPER
INPUT: ImmutableBytesWritable (row key = userID + timestamp), Result (row result)
OUTPUT: ImmutableBytesWritable (userID), IntWritable (always ONE)
REDUCER
INPUT: ImmutableBytesWritable (userID), Iterable<IntWritable> (all ONEs combined for this key)
OUTPUT: ImmutableBytesWritable (userID, same as input), IntWritable (total of all ONEs for this key)
Conclusion
• Provides near-real-time access to data stored in HDFS
• Provides a transaction-like data store/database on top of HDFS
• Provides a highly scalable database
Thank You