ScaleDB - Percona · 2017-05-12

ScaleDB MySQL Scalability with Real Time Streaming

What Is ScaleDB?

ScaleDB is a database company

Today: Streaming Data & Real Time Analytics

ScaleDB is a Database Engine; we call it UDE - Universal Data Engine
ScaleDB is also a Storage Engine - this is what you see with MariaDB

The Architecture: Suitable for OLTP & OLAP, distributed shared-data system

The ScaleDB Database Engine Architecture

• Distributed, transactional, shared-data architecture
• Always on, highly available cluster with redundant components
• Scale-up and scale-out capabilities

What Is ScaleDB Today?

A fast time series engine
• Distributed
• Highly MPP
• Allows multiple-row INSERTs, LOAD DATA INFILE, mysqlimport etc.
• FIFO DELETE
• Table size (number of rows) does not matter for performance and scalability
• Scale-out
• Indexed
• Engine push-down conditions and analytics
• Relational

What Is NOT ScaleDB Today?

• No ALTER TABLE
• No Geo Replication
• Not all queries can be pushed down
• No re-balancing
• Not all the SQL join features
• No cost-based optimizer at storage level
• No compression
• No LOBs, JSON or unstructured data

What is the ScaleDB Licensing?

• Currently, it is a free-to-use, closed source license
• Customers pay for technical support, NRE and other services

We are working hard to open the code, under GPLv2

ScaleDB under the hood

• A database engine with its own metadata, written in C and C++
• An intelligent storage engine that accepts pushdown conditions and executes analytic functions
• A distributed Lock Manager with an arbitrator that handles conflicting requests
• A client API that connects to the storage and lock manager nodes in the cluster
• A map/reduce-like algorithm to distribute queries and retrieve results from the storage nodes
• A MySQL Storage Engine API wrapper that interacts with the client API

Versioning and Releases

• Aggressive 6-month release cycle
• Monthly patches

• 15.10 - Ambitious Ararat
  • System Range Keys (Database Timestamp)
  • It works!
  • ...and it’s fast!
• 16.04 - Bountiful Baintha
  • User Range Keys (Application Timestamp)
  • Admin scripts
• 16.10 - Chilly Chimborazo
  • Fixed strange INSERT/SELECT behaviour
  • Admin and Docker deployment
  • HA with No Single Point of Failure
  • Replace Storage node
  • Add Volume
• 17.04 - Dangerous Damavand
  • UTF-8 support
  • FLOAT, DECIMAL, Fractional Seconds
  • Re-engineered pushdown filters
  • Many bugs fixed

Docker Deployment

# Select the Docker environment
scaledb-admin set environment docker
# Declare the nodes and the servers that host them
scaledb-admin add node db0 server 192.168.0.156
scaledb-admin add node db1 server 192.168.0.157
scaledb-admin add node slm0 server 192.168.0.156
scaledb-admin add node slm1 server 192.168.0.157
scaledb-admin add node casv0s0 server 192.168.0.156
scaledb-admin add node casv0s1 server 192.168.0.157
scaledb-admin add node casv1s0 server 192.168.0.156
scaledb-admin add node casv1s1 server 192.168.0.157
# Give each node a type: database (db), lock manager (slm) or storage (cas)
scaledb-admin set node db0 type db
scaledb-admin set node db1 type db
scaledb-admin set node slm0 type slm active
scaledb-admin set node slm1 type slm standby
scaledb-admin set node casv0s0 type cas volume 0 storage 0
scaledb-admin set node casv0s1 type cas volume 0 storage 1
scaledb-admin set node casv1s0 type cas volume 1 storage 0
scaledb-admin set node casv1s1 type cas volume 1 storage 1
# Boot and initialise the cluster
scaledb-admin boot
scaledb-admin init force

scaledb/node image
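The command list above follows a regular pattern, so it can be generated rather than typed. The sketch below is only an illustration: `deployment_commands` is a hypothetical helper, not part of ScaleDB, and it emits nothing beyond the command forms shown on the slide. It assumes the same layout as the slide: one db node and one slm node per server (first slm active, the others standby), and one storage node per server in each volume.

```python
def deployment_commands(servers, volumes=2):
    """Build the scaledb-admin command list for the layout shown on the slide.

    Hypothetical helper: it only reproduces the command forms from the deck.
    """
    # Each entry is (node name, server address, type clause).
    nodes = []
    for i, server in enumerate(servers):
        nodes.append((f"db{i}", server, "db"))
    for i, server in enumerate(servers):
        role = "active" if i == 0 else "standby"  # assumption: first SLM is the active one
        nodes.append((f"slm{i}", server, f"slm {role}"))
    for v in range(volumes):
        for s, server in enumerate(servers):
            nodes.append((f"casv{v}s{s}", server, f"cas volume {v} storage {s}"))

    cmds = ["scaledb-admin set environment docker"]
    cmds += [f"scaledb-admin add node {name} server {server}" for name, server, _ in nodes]
    cmds += [f"scaledb-admin set node {name} type {clause}" for name, _, clause in nodes]
    cmds += ["scaledb-admin boot", "scaledb-admin init force"]
    return cmds


if __name__ == "__main__":
    for line in deployment_commands(["192.168.0.156", "192.168.0.157"]):
        print(line)
```

Called with the two addresses from the slide, the helper yields the same 19 commands in the same order.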

ScaleDB Cluster

• Database Nodes
• Storage Nodes (Storage Volumes)
• Lock Manager (Cluster Manager)

ScaleDB ONE (One Node Edition)

[Diagram: one MariaDB Server (Database Node) with the ScaleDB Storage Engine and the ScaleDB Native API, one SLM (ScaleDB Lock Manager) and one Storage Node.]

ScaleDB Cluster - Scalability

[Diagram: two MariaDB Servers, each with the ScaleDB Storage Engine and the ScaleDB Native API, one SLM (ScaleDB Lock Manager) and two Storage Nodes.]

ScaleDB Cluster - High Availability

[Diagram: two MariaDB Servers, each with the ScaleDB Storage Engine and the ScaleDB Native API, two SLMs (ScaleDB Lock Managers) and two Storage Volumes of two Storage Nodes each.]

ScaleDB Cluster - Elasticity/Designed for the Cloud

[Diagram: three MariaDB Servers, each with the ScaleDB Storage Engine and the ScaleDB Native API, a 3rd party application using the ScaleDB Native API directly, two SLMs (ScaleDB Lock Managers) and four Storage Volumes of two Storage Nodes each.]

ScaleDB Components

[Diagram: mysqld with ha_scaledb.so and libscaledb.so, a 3rd party application with libscaledb.so, one scaledb_slm process (SLM) and eight scaledb_cas processes. A vertical label reads "UDE - Universal Data Engine".]

ScaleDB Metadata and API

Streaming Tables

Streaming Tables in a Nutshell

• Massive Parallel Processing
  • Node distribution
  • Multi-thread scan
• Timestamp-based columns
  • System Time
  • User (Application) Time
• Indexing:
  • Primary Key - Hash, unique
  • [Sequence] Range Key - Hash with sequence/block optimisation
  • Hash Key - Hash, non-unique
• Joins
  • Cross storage engine, using subqueries
• Optimised for high ingestion rates and real time analytics
  • Sequential storage, slow spinning disks are OK
  • Currently optimised for multiple-row INSERTs
• Queries with:
  • Condition pushdown
  • In-storage analytic functions
  • Currently limited in WHERE functionalities
• Map/Reduce approach
  • Queries sent to all storage nodes
  • Results returned and consolidated
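The "Map/Reduce approach" bullets describe a scatter/gather pattern: every storage node gets the same query, and the partial results are consolidated. The Python sketch below illustrates that pattern under loud assumptions: storage nodes are modelled as in-memory lists of `(timestamp, key, amount)` rows, and `scan_node` and `scatter_gather` are hypothetical names. It is not ScaleDB's implementation.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def scan_node(rows, t_start, t_end):
    """'Map' step on one node: apply the pushed-down time filter and pre-aggregate."""
    partial = defaultdict(lambda: [0, 0.0])  # key -> [count, sum]
    for ts, key, amount in rows:
        if t_start <= ts <= t_end:
            partial[key][0] += 1
            partial[key][1] += amount
    return dict(partial)


def scatter_gather(nodes, t_start, t_end):
    """Send the same query to every node, then merge ('reduce') the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(pool.map(lambda rows: scan_node(rows, t_start, t_end), nodes))

    merged = defaultdict(lambda: [0, 0.0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    # Sort by key, as the final ORDER BY of a grouped query would.
    return {key: (count, total) for key, (count, total) in sorted(merged.items())}


if __name__ == "__main__":
    node_a = [(1, "a", 10.0), (2, "b", 5.0)]
    node_b = [(2, "a", 1.0), (9, "a", 100.0)]
    print(scatter_gather([node_a, node_b], t_start=1, t_end=5))
```

Counts and sums merge by simple addition, which is why each node can aggregate locally and return only a small partial result per group instead of raw rows.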

Streaming Tables Use Cases

• Clickstream (Web, Mobile, etc.)
• Recommendation Engines
• Telco / Call Centres
• Gaming
• Financial Data
• Data Logging
• IoT
• Security
...and much more

Creating Streaming Tables

CREATE TABLE TABLE_NAME (
  id_name BIGINT UNSIGNED AUTO_INCREMENT NOT NULL [PRIMARY KEY],
  sys_time_col {TIMESTAMP|INTEGER UNSIGNED} NOT NULL [DEFAULT default_value],
  [usr_time_col {TIMESTAMP|INTEGER UNSIGNED} NOT NULL [DEFAULT default_value],]
  [[idx_col {SMALLINT|MEDIUMINT|INTEGER|BIGINT} [UNSIGNED] NOT NULL [DEFAULT default_value],]...]
  [<other non-indexed columns>,]
  [PRIMARY KEY [pk_name] ( id_name ),]
  KEY [sys_idx_name] ( sys_time_col ) RANGE_KEY = SYSTEM[,]
  [KEY [usr_idx_name] ( usr_time_col ) RANGE_KEY = USER
    [RANGE_INTERVAL = {WEEK|DAY|HOUR|MINUTE|SECOND|CODED}]
    [RANGE_START = datatype_value]]
  [, [KEY [hsh_idx_name] ( idx_col ) HASH_KEY = YES HASH_SIZE = integer_value],...]
) ENGINE = ScaleDB TABLE_TYPE = STREAMING

CREATE TABLE payment_scaledb_streaming (
  sale_id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  create_time TIMESTAMP NOT NULL,
  payment_time TIMESTAMP NOT NULL,
  account CHAR(8) NOT NULL DEFAULT '0',
  store INT(10) UNSIGNED NOT NULL DEFAULT '0',
  coupon CHAR(7) NOT NULL DEFAULT '',
  amount DECIMAL(8,2) NOT NULL,
  KEY account (account) HASH_KEY = YES HASH_SIZE = 1000,
  KEY create_time (create_time) RANGE_KEY = SYSTEM,
  KEY payment_time (payment_time) RANGE_KEY = USER
) ENGINE = ScaleDB TABLE_TYPE = STREAMING DEFAULT CHARSET=latin1;
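The deck notes that streaming tables are currently optimised for multiple-row INSERTs, so a loader would normally batch rows into one statement. The Python sketch below builds such a statement for the example table. `multi_row_insert` is a hypothetical helper. It assumes the `%s` placeholder style used by common Python MySQL drivers, and it assumes that table and column names come from trusted code, because they are interpolated directly.

```python
def multi_row_insert(table, columns, rows):
    """Build one parameterised multi-row INSERT and its flat argument list."""
    if not rows:
        raise ValueError("at least one row is required")
    if any(len(row) != len(columns) for row in rows):
        raise ValueError("every row must have one value per column")

    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([row_placeholder] * len(rows))
    )
    # Flatten the rows in order so the values line up with the placeholders.
    args = [value for row in rows for value in row]
    return sql, args


if __name__ == "__main__":
    sql, args = multi_row_insert(
        "payment_scaledb_streaming",
        ["create_time", "payment_time", "account", "store", "coupon", "amount"],
        [
            ("2016-01-01 00:00:00", "2016-01-01 00:00:00", "A0000001", 1, "", "9.99"),
            ("2016-01-01 00:00:01", "2016-01-01 00:00:01", "A0000002", 2, "", "19.50"),
        ],
    )
    print(sql)
    print(args)
```

The resulting pair can be passed to a driver's `cursor.execute(sql, args)`. The sale_id column is left out because the table declares it AUTO_INCREMENT.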

Writing Streaming Data

[Diagram, built up over several slides: four load processes feed four Database Nodes, which write to eight Storage Nodes (each with a Cache and Storage); a Lock Manager sits alongside. Numbered markers 1 to 4 show the successive steps of a write.]

Reading Streaming Data

SELECT x, f(y), f(z) … FROM … WHERE HashKeyCol = …

[Diagram, built up over several slides: four Database Nodes, a Lock Manager and eight Storage Nodes (each with a Cache and Storage). Labels added step by step: Cache, Processing, Result.]

Reading Streaming Data

SELECT x, f(y), f(z) … FROM … WHERE Time BETWEEN … GROUP BY x ORDER BY x

[Diagram, built up over several slides: four Database Nodes, a Lock Manager and eight Storage Nodes (each with a Cache and Storage). Four "Processing" labels appear first, then one further "Processing" label and the Result.]

Reading Streaming Data - under the hood

SELECT x, f(y), f(z) … FROM … WHERE Time BETWEEN … GROUP BY x ORDER BY x

[Diagram: two Storage Nodes (each with a Cache and Storage). A Streaming Table is drawn as time slices t0, t1, t2, t3 … tn, scanned by Working Threads, each labelled "Processing".]
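The "under the hood" slide shows a streaming table split into time slices t0 … tn that are handed to working threads. The Python sketch below illustrates that idea. The block layout is an assumption: rows inside a block are taken to be in non-decreasing timestamp order, so the first and last rows bound the block and slices wholly outside the WHERE range can be skipped. `scan_block` and `scan_table` are hypothetical names, not ScaleDB functions.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_block(block, t_start, t_end):
    """One working thread: filter a block by time and count rows per key."""
    counts = {}
    for ts, key in block:
        if t_start <= ts <= t_end:
            counts[key] = counts.get(key, 0) + 1
    return counts


def scan_table(blocks, t_start, t_end, threads=4):
    """Scan the time slices in parallel and merge the per-block counts."""
    # Assumed optimisation: keep only blocks whose [first, last] timestamps
    # overlap the requested range.
    candidates = [
        b for b in blocks
        if b and b[0][0] <= t_end and b[-1][0] >= t_start
    ]
    totals = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for counts in pool.map(lambda b: scan_block(b, t_start, t_end), candidates):
            for key, n in counts.items():
                totals[key] = totals.get(key, 0) + n
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    blocks = [
        [(1, "a"), (2, "b")],  # t0
        [(3, "a"), (4, "a")],  # t1
        [(5, "b"), (6, "b")],  # t2
    ]
    print(scan_table(blocks, t_start=3, t_end=5))
```

Because each thread returns counts for its own block, the merge step needs no locking.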

Adding a Volume

[Diagram: four load processes, four Database Nodes, a Lock Manager and eight Storage Nodes (each with a Cache and Storage), plus two additional Storage Nodes.]

Shrinking the Cluster

[Diagram: the same cluster with two Database Nodes instead of four; the Lock Manager and ten Storage Nodes (each with a Cache and Storage) are still shown.]

Replacing a Node

[Diagram: two Database Nodes, a Lock Manager and ten Storage Nodes (each with a Cache and Storage), plus one further Storage Node.]

Deleting Data in Streaming Tables

SELECT count(*)
FROM payment_scaledb_streaming;
+----------+
| count(*) |
+----------+
|    49152 |
+----------+
1 row in set (0.00 sec)

SELECT create_time, count(*)
FROM payment_scaledb_streaming
GROUP BY create_time;
+---------------------+----------+
| create_time         | count(*) |
+---------------------+----------+
| 2016-01-01 00:00:00 |        1 |
| 2016-01-01 00:00:01 |        1 |
| 2016-01-01 00:00:02 |        1 |
| ...                 |      ... |
| 2016-01-01 00:11:47 |    24576 |
+---------------------+----------+
16 rows in set (0.00 sec)

DELETE FROM payment_scaledb_streaming
WHERE create_time < '2016-01-01 00:00:02';
Query OK, 2 rows affected (0.01 sec)

SELECT count(*)
FROM payment_scaledb_streaming;
+----------+
| count(*) |
+----------+
|    49150 |
+----------+
1 row in set (0.00 sec)

DELETE FROM payment_scaledb_streaming
WHERE create_time < '2016-01-01 00:02:00';
Query OK, 10012 rows affected (0.05 sec)

SELECT count(*)
FROM payment_scaledb_streaming;
+----------+
| count(*) |
+----------+
|    39138 |
+----------+
1 row in set (0.00 sec)

DELETE FROM payment_scaledb_streaming;
Query OK, 39138 rows affected (0.05 sec)
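The deletes above all remove the oldest rows first, which matches the "FIFO DELETE" feature listed earlier. As an illustration of why that pattern can be cheap, the Python sketch below keeps rows in arrival order and trims them from the head. It is a toy model under assumptions, not ScaleDB's storage layout: `StreamingRows` is a hypothetical class, and timestamps are plain `YYYY-MM-DD HH:MM:SS` strings, which compare correctly as text.

```python
from collections import deque


class StreamingRows:
    """Rows kept in arrival order, so the oldest rows always sit at the head."""

    def __init__(self):
        self._rows = deque()

    def insert(self, create_time, payload):
        # Assumption: the system timestamp never goes backwards.
        if self._rows and create_time < self._rows[-1][0]:
            raise ValueError("rows must arrive in non-decreasing create_time order")
        self._rows.append((create_time, payload))

    def delete_before(self, cutoff):
        """FIFO delete: drop rows from the head while create_time < cutoff."""
        deleted = 0
        while self._rows and self._rows[0][0] < cutoff:
            self._rows.popleft()
            deleted += 1
        return deleted

    def __len__(self):
        return len(self._rows)


if __name__ == "__main__":
    table = StreamingRows()
    for second in range(3):
        table.insert(f"2016-01-01 00:00:0{second}", {"amount": "1.00"})
    print(table.delete_before("2016-01-01 00:00:02"), "rows affected")
    print(len(table), "rows left")
```

Only the deleted rows are touched; nothing after the cutoff is scanned or moved.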

Streaming Tables - Improvements

• Row Size (>8K Bytes)
• Data Types (JSON)
• Variable Size and LOBs
• ALTER TABLE
• UPDATEs & DELETEs
• Primary / Alternate Key
• Data Types in Range Keys
• Compression
• Re-balancing

How Do Streaming Tables Perform?

• c3.8xlarge instances on a 10G network
• Up to:
  • 1 Manager, 1 Reader node
  • 5 Write Workers on 5 DB nodes
  • 6 Storage nodes

Demo!

Thank You!

Pictures courtesy of:
Slide 1 - Jo Simon - https://www.flickr.com/photos/josimon/2229642788/in/photostream
Slide 12 - Unknown - http://www.peopleofar.com/wp-content/uploads/mount-ararat3.jpg
Slide 13 - Pierre Frey - http://www.summitpost.org/baintha-brakk-7285m-from-braldu-glacier-1-may-2009/585262
Slide 14 - Eduardo Navas - https://www.flickr.com/photos/n3gro87/15825870143/in/photostream/
Slide 16 - Hansueli Krapf - https://commons.wikimedia.org/wiki/File:Aerial_View_of_Damavand_26.11.2008_04-25-38.JPG