Meet HBase-1.0 And the New Client API Enis Soztutar Solomon Duskis

Upload: enissoz

Post on 07-Aug-2015


Meet HBase-1.0 and the New Client API

Enis Soztutar, Solomon Duskis

About Us

Enis Söztutar, Hortonworks

Release Manager for 1.0, @enissoz

Solomon Duskis, Google / Bigtable, @sduskis

Outline

● Why now?
● Major features
● Versioning / Compatibility
● Upgrade
● HBase-1.0 Interfaces
● Examples


Why 1.0 now?

Ran out of version numbers, which was the plan when switching to the 0.9x versions

Community agreement that HBase has already reached the maturity level

Start semantic versioning and compatibility guarantees

Apache HBase v1.0 marks a major milestone in the project's development. It is a monumental moment that the army of contributors who have made this possible should all be proud of. The result is a thing of collaborative beauty that also happens to power key, large-scale Internet platforms.

Michael Stack

The HBase 1.0 release appropriately acknowledges a maturity already achieved by the Apache HBase community and software both, and is a great occasion to learn more about HBase, how it can help you solve your scale data challenges, and the growing ecosystem of Open Source and commercial software that chooses HBase as foundation.

Andrew Purtell

https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72


Release goals

The 1.0.0 release has three goals:
1. Lay a stable foundation for future 1.x releases
2. Stabilize running HBase cluster and its clients; and
3. Make versioning and compatibility dimensions explicit


Overview

Over 1500 jiras resolved on top of 0.98.0!

See release announcement for a comprehensive summary

API overhaul
● Introduced new base interfaces
● Client API is explicitly marked
● Javadoc for client side is separated
● Client API will have source compatibility in 1.x

Read availability with region replicas
● Phase 1 of the “region replicas” feature (Phase 2 in 1.1)
● Each region can have “replicas” hosted in other region servers (RSs)
● Only the primary accepts writes
● Reads can be performed with STRONG or TIMELINE consistency

Online config change

Configuration can be updated while the region server is running

hbase> update_all_config

hbase> update_master_config

hbase> update_config '<serverName>'

Only some configs can be updated online: some compaction / load balancer configs for now

Other forward ports from 0.89-fb branch

New and noteworthy
● Extensive documentation/website improvements
● Automatic tuning of global memstore and block cache sizes
● Bucket cache easier to configure
● Compressed blocks in the block cache
● Pluggable replication endpoint
● Basic client backpressure mechanism

New and noteworthy cont.
● Docker file
● Per-cell TTL
● CopyTable with --bulkload
● Truncate table command
● Atomic Table.checkAndMutate()
● Namespace permissions

Under the covers
● Cell based read/write path
● Ring buffer based WAL improvements
● Multi WAL files in HRegionServer
● ZK-less assignment (disabled by default)
● Client Preemptive Fast Fail
● Combining mvcc and seqIds
● Various security, tags and visibility labels improvements
● Various fixes to REST server
● Numerous improvements in other areas and bug fixes too long to list here

Changes in behavior: JDK

JDK Version   HBase-1.1   HBase-1.0   HBase-0.98
JDK 6         ✗           ✗           ✓
JDK 7         ✓           ✓           ✓
JDK 8         ✓*          ✓*          ✓*

✓*: should work, but not well tested

https://hbase.apache.org/book.html#basic.prerequisites

Changes in behavior: Hadoop

Hadoop Version   HBase-1.1   HBase-1.0   HBase-0.98
Hadoop-1.x       ✗           ✗           ✓*
Hadoop-2.2       ✗           ✓*          ✓*
Hadoop-2.3       ✓*          ✓*          ✓
Hadoop-2.4       ✓           ✓           ✓
Hadoop-2.5       ✓           ✓           ✓
Hadoop-2.6       ✓           ✓           ✓*

✓*: should work, but not well tested

https://hbase.apache.org/book.html#basic.prerequisites

Changes in behavior
● ZooKeeper-3.4.x is required
● Default ports changed to 160XX (out of ephemeral range)
● HFile v3 is default
● Slab cache removed
● Default heap is ¼ of physical memory (instead of 1GB)


Semantic Versioning

Starting with the 1.0.0 release, HBase works toward Semantic Versioning

MAJOR.MINOR.PATCH[-identifiers]
● PATCH: only backward-compatible (BC) bug fixes
● MINOR: backward-compatible new features
● MAJOR: incompatible changes
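As a rough illustration of the MAJOR.MINOR.PATCH scheme, the sketch below classifies the step between two version strings by the most significant component that differs. The class VersionStep and its methods are hypothetical helpers written for this example; they are not part of HBase, and the pattern only handles the simple form shown on the slide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of HBase: classifies the step between two versions. */
final class VersionStep {
    enum Kind { MAJOR, MINOR, PATCH, NONE }

    // MAJOR.MINOR.PATCH with an optional "-identifiers" suffix, as on the slide.
    private static final Pattern VERSION =
        Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)(?:-[0-9A-Za-z.-]+)?");

    private VersionStep() {}

    /** Parses the three numeric components; the identifiers suffix is ignored. */
    static int[] parse(String version) {
        Matcher m = VERSION.matcher(version);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not MAJOR.MINOR.PATCH: " + version);
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** Returns the most significant component that differs between the two versions. */
    static Kind between(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return Kind.MAJOR;
        if (a[1] != b[1]) return Kind.MINOR;
        if (a[2] != b[2]) return Kind.PATCH;
        return Kind.NONE;
    }
}
```

For example, VersionStep.between("1.0.0", "1.0.1") yields PATCH and VersionStep.between("1.0.1", "1.1.0") yields MINOR. Pre-1.0 versions such as 0.98.0 parse too, but the guarantees above only start with 1.0.0.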

Post 1.0 versions

New versioning already in action
● 1.0.0
● 1.0.1 (patch release)
● 1.1.0 (minor release)

1.0.x and 1.1.x are expected to have ~monthly releases

1.2.0 and 2.0.0 in the works

HBase API surface

Client API
● Explicitly marked with InterfaceAudience.Public
● Get/Put/Table/Connection, etc.

LimitedPrivate API
● Explicitly marked with InterfaceAudience.LimitedPrivate
● Coprocessors, replication APIs

Private API
● Explicitly marked with InterfaceAudience.Private
● All other classes not marked

Also InterfaceStability.{Stable,Evolving,Unstable}
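To make the marking scheme concrete, here is a minimal sketch of how audience annotations can be attached to types and read back by reflection. The annotations Public, LimitedPrivate and Private below are hypothetical stand-ins declared locally so the example compiles without HBase on the classpath; the example classes are invented, and treating unmarked classes as Private follows the slide above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class AudienceDemo {
    // Hypothetical stand-ins for the audience markers; NOT the real HBase annotations.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Public {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface LimitedPrivate { String[] value(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Private {}

    // Invented example types, one per audience.
    @Public static class ClientFacingType {}
    @LimitedPrivate("Coprocessor") static class ExtensionPoint {}
    static class InternalHelper {}  // unmarked, so treated as Private

    private AudienceDemo() {}

    /** Reports the audience of a type; anything not marked counts as Private. */
    static String audienceOf(Class<?> type) {
        if (type.isAnnotationPresent(Public.class)) return "Public";
        if (type.isAnnotationPresent(LimitedPrivate.class)) return "LimitedPrivate";
        return "Private";
    }
}
```

A check like audienceOf could back a build-time rule that application code only touches Public types, which is the part of the surface the compatibility guarantees cover.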

Compatibility dimension                  Major   Minor    Patch
Client-Server Wire Compatibility         ✗       ✓        ✓
Server-Server Compatibility              ✗       ✓        ✓
File Format Compatibility                ✗*      ✓        ✓
Client API Compatibility                 ✗       ✓        ✓
Client Binary Compatibility              ✗       ✗        ✓
Server-Side Limited API Compatibility    ✗       ✗*/✓*    ✓
Dependency Compatibility                 ✗       ✓        ✓
Operation Compatibility                  ✗       ✗        ✓

1.0.x Compatibility with earlier: Source

1.0.x is (mostly) source compatible with earlier versions
● Filter / Coprocessor users will see some changes
● We strongly advise ALL users to switch to the new API
● Deprecated APIs will be removed (in 2.0)

1.0.x Compatibility with earlier: Binary

1.0 is NOT binary compatible with earlier versions
● Clients/coprocessors have to be recompiled to link against 1.0 jars
● Cannot drop in / replace jars for an application compiled against 0.98

1.0.x Compatibility with earlier: Wire

1.0.x is wire compatible with 0.98.x releases
● A 0.98.x client can be used to access a 1.0.x cluster (allows rolling upgrades)
● NOT binary compatible with earlier (0.96, 0.94)
● HFile v3 is default. Once upgraded, cannot “go back”


Upgrade to 1.0.x

From 0.98.x
● Regular or rolling upgrade is supported.

From 0.96.x
● Supported with a shutdown and restart of the cluster. No rolling upgrades. No need to run extra steps/scripts.

From 0.94.x
● Supported similarly to the upgrade from 0.94 -> 0.96. The upgrade script should be run to rewrite cluster level metadata.

From earlier versions (0.92, 0.90, etc.)
● Upgrade is not supported
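The upgrade paths above amount to a small decision table keyed on the source release line. The sketch below encodes it; UpgradePath and its enum are hypothetical names made up for this example, and release lines the slide does not mention (for instance 0.95 or 0.97) are rejected rather than guessed at.

```java
/** Hypothetical helper, not part of HBase: encodes the upgrade paths listed above. */
final class UpgradePath {
    enum Path { REGULAR_OR_ROLLING, SHUTDOWN_AND_RESTART, UPGRADE_SCRIPT, NOT_SUPPORTED }

    private UpgradePath() {}

    /** Maps a 0.x source version such as "0.98.6" to the path for upgrading to 1.0.x. */
    static Path to10xFrom(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2 || !parts[0].equals("0")) {
            throw new IllegalArgumentException("Expected a 0.x version: " + version);
        }
        int minor = Integer.parseInt(parts[1]);
        if (minor == 98) return Path.REGULAR_OR_ROLLING;   // regular or rolling upgrade
        if (minor == 96) return Path.SHUTDOWN_AND_RESTART; // no rolling upgrade, no extra steps
        if (minor == 94) return Path.UPGRADE_SCRIPT;       // like 0.94 -> 0.96, run the upgrade script
        if (minor < 94) return Path.NOT_SUPPORTED;         // 0.92, 0.90, etc.
        throw new IllegalArgumentException("Release line not covered here: " + version);
    }
}
```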

HBase 1.0 Interfaces

Better encapsulation

Why the new interfaces?

HBase 1.0 had a goal to create new client interfaces

● Explicit contracts - clear definition of the surface
● Defining a standard API in the code
● Clearer focus of responsibility - each piece doing one thing well

Naming Overview

HBase 0.98 name(s)                       HBase 1.0 name(s)
HConnectionManager, ConnectionManager    ConnectionFactory
HConnection, ClusterConnection           Connection
HBaseAdmin                               Admin
HTable                                   Table, RegionLocator, BufferedMutator

ConnectionFactory

● Creates new Connections
● Use this instead of new HTable(), new HBaseAdmin()
● User must manage Connections
● FYI: Connection type can be overridden in the Configuration

Managed Connections Going Away

HBase Client used to have implicit connection management.

Managed connections tried to do lifecycle management without understanding the application, sometimes with unpredictable results.

HBase 1.0 introduces explicit Connection management.

Connection

● Simple replacement for HConnection
● Focal point to get a Table, RegionLocator, Admin, or BufferedMutator
● Use TableName instead of String/byte[]
● User managed - must call connection.close()

Connections have a cache of region metadata and a shared threadpool; close() releases shared resources.

Admin

Replaces HBaseAdmin for administration functionality

● create/delete/list tables and snapshots, split tables, add/remove table columns, etc.
● Retrieved via connection.getAdmin()
● Use TableName object instead of String/byte[]
● Remember to .close()

RegionLocator

● Region metadata related functionality: get start/end keys, get all regions, get the region for a row
● No manipulation of regions. That’s in Admin.
● Lightweight - uses cached region information from connection
● Remember to .close()

Table (part I)

Most of HTable’s methods - CRUD
● put, delete, get - both single and list
● increment, append
● scan
● batch
● checkAnd*
● coprocessor service

Table (Part II)

Removed autoflush
● The autoflush functionality was complex and used for batch writes. BufferedMutator was introduced for that purpose.

One Table per thread

Remember to close()
● Releases the threadpool

BufferedMutator (part I)

Autoflush and BufferedMutator are used when “writes are small and many; it especially makes sense when there is no natural flush point.” --stack on HBASE-12728

Supports all batched Mutations
● Puts were supported before
● Adds batched Deletes, Appends, Increments, RowMutations

BufferedMutator (part II)

● Used in Map/Reduce jobs
● Can be used in high performance servlets, if you can tolerate some data loss
● Use ExceptionListener
● CLOSE! close() does a flush() (otherwise you might lose data still in the buffer) and also closes threadpools
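The buffering behavior described above can be sketched without HBase. BufferedSink below is a hypothetical stand-in, not the real BufferedMutator: it collects small writes, hands them to a backend in batches once a threshold is reached, reports backend failures to an error listener, and flushes what is left on close(). The threshold, the String mutations and the Consumer-based backend are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a buffered writer; NOT the HBase BufferedMutator. */
final class BufferedSink implements AutoCloseable {
    private final int flushThreshold;
    private final Consumer<List<String>> backend;      // receives each flushed batch
    private final Consumer<RuntimeException> onError;  // plays the role of an exception listener
    private final List<String> buffer = new ArrayList<>();

    BufferedSink(int flushThreshold,
                 Consumer<List<String>> backend,
                 Consumer<RuntimeException> onError) {
        this.flushThreshold = flushThreshold;
        this.backend = backend;
        this.onError = onError;
    }

    /** Buffers one write and flushes automatically once the threshold is reached. */
    void mutate(String mutation) {
        buffer.add(mutation);
        if (buffer.size() >= flushThreshold) {
            flush();
        }
    }

    /** Sends everything buffered so far as one batch; failures go to the listener. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(buffer);
        buffer.clear();
        try {
            backend.accept(batch);
        } catch (RuntimeException e) {
            onError.accept(e);
        }
    }

    /** Number of writes still waiting in the buffer. */
    int pending() {
        return buffer.size();
    }

    /** Flushes the remainder; skipping close() would leave those writes unsent. */
    @Override
    public void close() {
        flush();
    }
}
```

Used in a try-with-resources block, the sink sends its last partial batch when the block exits, which is why the slide insists on closing.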


Old way

    TableName tableName = TableName.valueOf(tableNameString);
    HBaseAdmin admin = new HBaseAdmin(config);
    HTableDescriptor descriptor = …;
    admin.createTable(descriptor);
    admin.close();

    HTable table = new HTable(config, tableName);
    … // do something interesting
    table.close();

Table example

    TableName tableName = TableName.valueOf(tableNameString);
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      HTableDescriptor descriptor = …;
      admin.createTable(descriptor);
      try (Table table = conn.getTable(tableName)) {
        table.put(...);
        ...
      }
    }

Other examples

    TableName tableName = TableName.valueOf(tableNameString);
    try (Connection conn = ConnectionFactory.createConnection()) {
      try (BufferedMutator mutator = conn.getBufferedMutator(tableName)) {
        mutator.mutate(...);
      }
      try (RegionLocator locator = conn.getRegionLocator(tableName)) {
        List<HRegionLocation> locations = locator.getAllRegionLocations();
        ...
      }
    }