
A Distributed Storage System for Structured Data

Bigtable

Presenters: Yunming Zhang, Conglong Li

Saturday, September 21, 2013


References

SOCC 2010 Keynote Slides, Jeff Dean, Google

Introduction to Distributed Computing, Winter 2008, University of Washington


Motivation

Lots of (semi-)structured data at Google
  URLs: contents, crawl metadata, links
  Per-user data: user preference settings, search results
Scale is large
  Billions of URLs, hundreds of millions of users
Existing commercial databases don't meet the requirements


Goals

Store and manage all the state reliably and efficiently
Allow asynchronous processes to update different pieces of data continuously
Very high read/write rates
Efficient scans over all or interesting subsets of data
Often want to examine data changes over time


Bigtable vs. GFS

GFS provides raw data storage
We need:
  More sophisticated storage
  Key-value mapping
  Flexible enough to be useful
  Store semi-structured data
  Reliable, scalable, etc.


Bigtable

Bigtable is a distributed storage system for managing large-scale structured data

Wide applicability
Scalability
High performance
High availability


Overview

Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions


Data Model

Sparse
Sorted
Multidimensional


Cell

Contains multiple versions of the data

A cell is located by row key, column key, and timestamp

Treats data as an uninterpreted array of bytes, which lets clients serialize various forms of structured and semi-structured data

Supports automatic garbage collection per column family for management of versioned data
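
The cell properties above can be sketched as a small in-memory map. This is a toy illustration, not Bigtable's implementation: the names `Cell`, `VersionedTable`, and `max_versions_by_family` are hypothetical, and "keep the newest N versions" is just one possible per-family garbage-collection policy.

```python
class Cell:
    """Timestamped versions of one value, newest first."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        self.versions = []  # list of (timestamp, bytes)

    def put(self, timestamp, value):
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        # Garbage collection: keep only the newest max_versions entries.
        del self.versions[self.max_versions:]

    def get(self, at=None):
        # Return the newest version, or the newest one at or before `at`.
        for timestamp, value in self.versions:
            if at is None or timestamp <= at:
                return value
        return None


class VersionedTable:
    """Sparse map: (row key, column key) -> Cell. Absent cells take no space."""

    def __init__(self, max_versions_by_family):
        self.max_versions_by_family = max_versions_by_family
        self.cells = {}

    def put(self, row, column, timestamp, value):
        family = column.split(":", 1)[0]
        key = (row, column)
        if key not in self.cells:
            self.cells[key] = Cell(self.max_versions_by_family[family])
        self.cells[key].put(timestamp, value)

    def get(self, row, column, at=None):
        cell = self.cells.get((row, column))
        return cell.get(at) if cell else None

    def scan(self):
        # Sorted: iterate in (row key, column key) order.
        for key in sorted(self.cells):
            yield key, self.cells[key].versions


table = VersionedTable({"contents": 2, "anchor": 1})
for ts, body in [(1, b"<html>v1"), (2, b"<html>v2"), (3, b"<html>v3")]:
    table.put("com.cnn.www", "contents:", ts, body)
table.put("com.cnn.www", "anchor:cnnsi.com", 5, b"CNN")
```

With two versions retained for the `contents` family, the value written at timestamp 1 is discarded once the third write arrives.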


Goals

Store and manage all the state reliably and efficiently
Allow asynchronous processes to update different pieces of data continuously
Very high read/write rates
Efficient scans over all or interesting subsets of data
Often want to examine data changes over time


Row

Row key is an arbitrary string
Access to column data in a row is atomic
Row creation is implicit upon storing data
Rows are ordered lexicographically

Rows close together lexicographically usually reside on one or a small number of machines
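
Because rows are stored in lexicographic order, the choice of row key decides which rows end up near each other. A small sketch, assuming the reversed-hostname keys that the Bigtable paper describes for its Webtable example; the helper name `row_key_for` and the host list are made up for illustration.

```python
def row_key_for(hostname, path=""):
    """Reverse the hostname components so pages of one domain sort together."""
    reversed_host = ".".join(reversed(hostname.split(".")))
    return reversed_host + path


hosts = [
    "maps.google.com",
    "www.cnn.com",
    "mail.google.com",
    "money.cnn.com",
    "www.google.com",
]

# Plain hostnames interleave the two domains when sorted.
plain_order = sorted(hosts)

# Reversed hostnames keep each domain in one contiguous key range.
reversed_order = sorted(row_key_for(h) for h in hosts)
```

In `reversed_order` all `com.cnn.*` keys come before all `com.google.*` keys, so a scan over one domain touches one contiguous range of rows.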


Columns

Columns are grouped into column families
  Column key is named family:optional_qualifier
Column family
  Has associated type information
  Data in a family is usually of the same type


Overview

Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions


API

Metadata operations
  Create/delete tables and column families, change metadata, modify access control lists
Writes (atomic)
  Set(), DeleteCells(), DeleteRow()
Reads
  Scanner: read arbitrary cells in a Bigtable
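
The write and read calls listed above can be mimicked with a toy in-memory table. This is a sketch only: the real client API is not Python, and the names `RowMutation`, `ApiTable`, `apply`, and `scan` here are illustrative stand-ins. A single lock stands in for the per-row atomicity guarantee.

```python
import threading


class RowMutation:
    """Collects changes to one row so they are applied in a single atomic step."""

    def __init__(self, row_key):
        self.row_key = row_key
        self.ops = []

    def set(self, column, value):
        self.ops.append(("set", column, value))

    def delete_cells(self, column):
        self.ops.append(("delete", column, None))


class ApiTable:
    def __init__(self):
        self.rows = {}  # row key -> {column key: value}
        self._lock = threading.Lock()

    def apply(self, mutation):
        with self._lock:  # all ops of one mutation become visible together
            row = self.rows.setdefault(mutation.row_key, {})
            for op, column, value in mutation.ops:
                if op == "set":
                    row[column] = value
                else:
                    row.pop(column, None)
            if not row:
                del self.rows[mutation.row_key]

    def delete_row(self, row_key):
        with self._lock:
            self.rows.pop(row_key, None)

    def scan(self, start_row="", end_row=None, family=None):
        """Yield (row, column, value) in sorted order, optionally for one family."""
        for row_key in sorted(self.rows):
            if row_key < start_row:
                continue
            if end_row is not None and row_key >= end_row:
                break
            for column in sorted(self.rows[row_key]):
                if family is None or column.split(":", 1)[0] == family:
                    yield row_key, column, self.rows[row_key][column]


api_table = ApiTable()
mutation = RowMutation("com.cnn.www")
mutation.set("anchor:www.c-span.org", b"CNN")
mutation.delete_cells("anchor:www.abc.com")
api_table.apply(mutation)
anchors = list(api_table.scan(family="anchor"))
```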


Overview

Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions


Tablets

Large tables are broken into tablets at row boundaries
  A tablet holds a contiguous range of rows
  Clients can often choose row keys for locality
  Aim for ~100 MB to 200 MB of data per tablet
Each serving machine is responsible for ~100 tablets
  Fast recovery: 100 machines each pick up 1 tablet from the failed machine
  Fine-grained load balancing: migrate tablets away from an overloaded machine
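
A rough sketch of the two ideas on this slide: cutting a sorted table into tablets at row boundaries, and spreading a failed server's tablets over the remaining servers. The function names and the static, one-pass partitioning are assumptions made for illustration; a real system splits tablets dynamically as they grow.

```python
MB = 1024 * 1024
TARGET_TABLET_BYTES = 200 * MB  # upper end of the ~100-200 MB target


def split_into_tablets(sorted_rows, max_bytes=TARGET_TABLET_BYTES):
    """sorted_rows: (row_key, size_in_bytes) pairs in row-key order.

    Returns tablets as dicts with start_row, end_row (inclusive), and bytes.
    A row is never divided between two tablets.
    """
    tablets = []
    current = None
    for row_key, size in sorted_rows:
        if current is not None and current["bytes"] + size > max_bytes:
            tablets.append(current)
            current = None
        if current is None:
            current = {"start_row": row_key, "end_row": row_key, "bytes": 0}
        current["end_row"] = row_key
        current["bytes"] += size
    if current is not None:
        tablets.append(current)
    return tablets


def reassign(failed_tablets, live_servers):
    """Spread a failed server's tablets round-robin over the live servers."""
    assignment = {}
    for i, tablet in enumerate(failed_tablets):
        server = live_servers[i % len(live_servers)]
        assignment.setdefault(server, []).append(tablet)
    return assignment


rows = [(f"row{i:04d}", 60 * MB) for i in range(10)]
tablets = split_into_tablets(rows)
recovery = reassign(list(range(100)), [f"server-{n}" for n in range(100)])
```

With 100 tablets and 100 live servers, each server picks up exactly one tablet, which is what makes recovery fast.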


Tablets and Splitting


System Structure

Master
  Metadata operations
  Load balancing
  Keeps track of live tablet servers
  Master failure
Tablet server
  Accepts reads and writes to data


System Structure


System Structure

read/write


System Structure

Metadata operations


Locating Tablets

3-level hierarchical lookup scheme for tablets
Location is the IP address and port of the serving tablet server, stored in META tables
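
The three-level lookup can be sketched as follows, assuming the hierarchy described in the Bigtable paper: a Chubby file points to the root tablet, the root tablet indexes the other metadata tablets, and those index the user tablets by end row. The file path, tablet names, addresses, and single-table layout below are hypothetical.

```python
import bisect

LAST_ROW = "\uffff"  # sentinel end key for the last tablet of a table

# Level 1: a Chubby file names the root tablet (path is hypothetical).
chubby = {"/bigtable/root-location": "root"}

# Level 2: the root tablet indexes metadata tablets by (table, end_row).
# Level 3: each metadata tablet indexes user tablets the same way,
# storing the ip:port of the tablet server that serves each range.
tablets = {
    "root": [(("users", "m"), "meta-A"), (("users", LAST_ROW), "meta-B")],
    "meta-A": [(("users", "f"), "10.0.0.1:6000"), (("users", "m"), "10.0.0.2:6000")],
    "meta-B": [(("users", "s"), "10.0.0.3:6000"), (("users", LAST_ROW), "10.0.0.4:6000")],
}


def lookup(index, key):
    """Return the value of the first entry whose end key is >= key."""
    end_keys = [end for end, _ in index]
    i = bisect.bisect_left(end_keys, key)
    if i == len(index):
        raise KeyError(key)
    return index[i][1]


def locate(table, row_key):
    """Follow Chubby file -> root tablet -> metadata tablet -> server address."""
    root = tablets[chubby["/bigtable/root-location"]]
    meta = tablets[lookup(root, (table, row_key))]
    return lookup(meta, (table, row_key))
```

In practice clients cache tablet locations, so most requests skip the lookup entirely; the sketch omits caching.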


Tablet Representation and Serving

Append-only tablet log
SSTable on GFS
  A sorted map from string to string
  All the data for a row is contiguous, so a row lookup reads one contiguous range
Memtable write buffer
  When a read comes in, SSTable data must be merged with recent writes still held in the memtable
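
A minimal sketch of the write and read paths just described, with in-memory lists standing in for the log and the SSTables on GFS. The class name `ToyTablet` and its layout are assumptions for illustration.

```python
import bisect


class ToyTablet:
    def __init__(self):
        self.log = []        # append-only commit log of (key, value) records
        self.memtable = {}   # recent writes, held in memory
        self.sstables = []   # immutable sorted lists of (key, value), oldest first

    def write(self, key, value):
        self.log.append((key, value))  # log first so the write survives a crash
        self.memtable[key] = value

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            i = bisect.bisect_left(sstable, (key,))  # binary search on sorted keys
            if i < len(sstable) and sstable[i][0] == key:
                return sstable[i][1]
        return None

    def scan(self):
        # Merged view of all SSTables plus the memtable, in key order.
        merged = {}
        for sstable in self.sstables:
            merged.update(sstable)
        merged.update(self.memtable)
        return sorted(merged.items())


tablet = ToyTablet()
tablet.sstables.append([("a", "old"), ("b", "1")])
tablet.write("a", "new")
```

Here `tablet.read("a")` returns the memtable value rather than the older SSTable entry, and `tablet.read("b")` falls through to the SSTable.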


Tablet Representation and Serving


Tablet Representation and Serving


Compaction

Tablet state represented as a set of immutable compacted SSTable files, plus tail of log

Minor compaction:
  When the in-memory buffer fills up, it is frozen and written out as a new SSTable
Major compaction:
  Periodically compact all SSTables for a tablet into a new base SSTable on GFS
  Storage from deletions is reclaimed at this point
  Produces new tables
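
The two compaction kinds can be sketched with plain Python containers. This is an illustration under simplifying assumptions: SSTables are tuples of (key, value) pairs, and deletions are recorded with a hypothetical `TOMBSTONE` marker that only a major compaction removes.

```python
TOMBSTONE = object()  # marks a deleted key until a major compaction drops it


def minor_compaction(memtable):
    """Freeze the memtable into a new immutable SSTable and start a fresh buffer."""
    sstable = tuple(sorted(memtable.items(), key=lambda kv: kv[0]))
    memtable.clear()
    return sstable


def major_compaction(sstables):
    """Merge all SSTables (oldest first) into one base SSTable.

    Later SSTables override earlier ones, and deleted entries are dropped,
    which is where their storage is reclaimed.
    """
    merged = {}
    for sstable in sstables:
        merged.update(sstable)
    live = [(k, v) for k, v in merged.items() if v is not TOMBSTONE]
    return tuple(sorted(live, key=lambda kv: kv[0]))


old_sstable = (("a", "1"), ("c", "3"))
memtable = {"b": "2", "a": TOMBSTONE}  # "a" was deleted after old_sstable was written
new_sstable = minor_compaction(memtable)
base_sstable = major_compaction([old_sstable, new_sstable])
```

After the major compaction, key `a` is gone entirely: both its old value and the deletion marker have been discarded.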


Overview

Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions


Goals

Reliable system for storing and managing all the state
Allow asynchronous processes to update different pieces of data continuously
Very high read/write rates
Efficient scans over all or interesting subsets of data
Often want to examine data changes over time


Locality Groups

Clients can group multiple column families together into a locality group

A separate SSTable is generated for each locality group

Enables more efficient reads
Can be declared to be in-memory
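
A small sketch of why separate SSTables per locality group make reads cheaper. The group assignment below (page metadata apart from page contents) follows the kind of split the Bigtable paper suggests; the group names and helper functions are hypothetical.

```python
# Hypothetical assignment of column families to locality groups.
LOCALITY_GROUPS = {"contents": "bulk", "language": "meta", "checksum": "meta"}


def build_sstables(cells, groups):
    """cells: {(row, column): value}. Returns {group: sorted [((row, column), value)]}."""
    by_group = {}
    for (row, column), value in cells.items():
        family = column.split(":", 1)[0]
        by_group.setdefault(groups[family], []).append(((row, column), value))
    return {g: sorted(entries, key=lambda e: e[0]) for g, entries in by_group.items()}


def bytes_scanned(sstable):
    """Bytes of values a full scan of this SSTable would read."""
    return sum(len(value) for _, value in sstable)


cells = {
    ("com.cnn.www", "contents:"): b"x" * 10_000,
    ("com.cnn.www", "language:"): b"EN",
    ("com.cnn.www", "checksum:"): b"abc123",
}
sstables = build_sstables(cells, LOCALITY_GROUPS)
```

A scan that only needs language and checksum reads the small `meta` SSTable and never touches the page contents.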


Compression

Many opportunities for compression
  Similar values in columns and cells
Within each SSTable for a locality group, encode compressed blocks
  Keep blocks small for random access
  Exploit the fact that many values are very similar
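
Block-wise compression can be sketched as below. Two assumptions to note: `zlib` is only a stand-in for whatever compression scheme the system actually uses, and the block size of four values is chosen to keep the example tiny. Values must not contain a newline, since it is used as the separator.

```python
import zlib

BLOCK_SIZE = 4  # values per block; hypothetical and deliberately small


def write_blocks(values, block_size=BLOCK_SIZE):
    """Compress each group of block_size values independently."""
    blocks = []
    for i in range(0, len(values), block_size):
        raw = b"\n".join(values[i:i + block_size])
        blocks.append(zlib.compress(raw))
    return blocks


def read_value(blocks, index, block_size=BLOCK_SIZE):
    """Random access: decompress only the one block that holds the value."""
    block = zlib.decompress(blocks[index // block_size])
    return block.split(b"\n")[index % block_size]


values = [b"<html><body>page %d</body></html>" % i for i in range(16)]
blocks = write_blocks(values)
tenth = read_value(blocks, 10)
```

Smaller blocks mean less data to decompress per random read, at the cost of a lower compression ratio, since each block can only exploit similarity among its own values.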


Goals

Reliable system for storing and managing all the state
Allow asynchronous processes to update different pieces of data continuously
Very high read/write rates
Efficient scans over all or interesting subsets of data
Often want to examine data changes over time


Commit log and recovery

Single commit log file per tablet server
  Reduces the number of concurrent file writes to GFS
Tablet recovery
  Redo points in the log mark where replay starts: perform the same set of operations made since the last persistent state
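
Recovery from a shared commit log can be sketched as a filter-and-replay. The record layout `(sequence, tablet_id, key, value)` and the function name are assumptions for illustration; the redo point is taken to be the last sequence number already captured in the tablet's persistent state.

```python
def recover_tablet(commit_log, tablet_id, redo_point):
    """Rebuild a tablet's memtable from a log shared by many tablets.

    Only records for this tablet with a sequence number after the redo point
    are replayed, in sequence order.
    """
    entries = [e for e in commit_log if e[1] == tablet_id and e[0] > redo_point]
    memtable = {}
    for _, _, key, value in sorted(entries, key=lambda e: e[0]):
        memtable[key] = value
    return memtable


# One server-wide log interleaves records for tablets t1 and t2.
commit_log = [
    (1, "t1", "a", "1"),
    (2, "t2", "x", "9"),
    (3, "t1", "a", "2"),
    (4, "t1", "b", "3"),
]
recovered_t1 = recover_tablet(commit_log, "t1", redo_point=1)
```

The cost of sharing one log is visible here: recovering `t1` has to skip over records that belong to `t2`.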


Overview

Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions


Performance evaluation

Test Environment
  Based on a GFS cell of 1786 machines
  Two 400 GB IDE hard drives in each machine
  Two-level tree-shaped switched network
Performance Tests
  Random Read/Write
  Sequential Read/Write


Single tablet-server performance

Random reads are the slowest
  Each read transfers a 64 KB SSTable block from GFS to return 1000 bytes
Random and sequential writes perform better
  Each tablet server appends all writes to a single commit log
  Group commit
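
The cost of random reads follows from simple arithmetic on the figures above. The 8 KB comparison is an added assumption, based on the Bigtable paper's advice to use a smaller block size for this access pattern.

```python
def read_amplification(block_bytes, value_bytes):
    """How many bytes are transferred per byte actually requested."""
    return block_bytes / value_bytes


# 64 KB block fetched to return a 1000-byte value.
amp_64k = read_amplification(64 * 1024, 1000)

# Same read with a smaller 8 KB block size.
amp_8k = read_amplification(8 * 1024, 1000)
```

With 64 KB blocks, each read moves roughly 65 times more data than the client asked for; with 8 KB blocks the factor drops to about 8.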


Performance Scaling

Performance didn’t scale linearly
  Load imbalance in multiple-server configurations
  Larger data transfer overhead


Overview

Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions


Google Analytics

A service that analyzes traffic patterns at web sites
Raw Click Table
  One row for each end-user session
  Row key is (website name, time)
Summary Table
  Built by MapReduce jobs that extract recent session data from the raw click table


Google Earth

Use one table for preprocessing and one for serving
  Different latency requirements (disk vs. memory)
Each row in the imagery table represents a single geographic segment
Column family to store data source
  One column for each raw image
  Very sparse


Personalized Search

Row key is a unique userid
A column family for each type of user action
Replicated across Bigtable clusters to increase availability and reduce latency


Conclusions

Bigtable provides highly scalable, high-performance, highly available, and flexible storage for structured data.

It provides a low-level read/write interface that other frameworks can build on.

It has enabled Google to deal with large-scale data efficiently.
