Huawei Kunpeng Computing Database Solution


Post on 17-Oct-2021


Contents


1. Database Overview

2. Trend Insight and Kunpeng Computing Database


Database Overview

A database is an organized repository of data. It can store anywhere from hundreds to millions of records according to defined rules. If data were stored without any organization, queries would be inefficient.

A database management system (DBMS), the core component of a database system, is used to operate and manage the database: creating database objects; querying, adding, modifying, and deleting stored data; and managing database users and permissions.


Relational Database VS Non-relational Database

Databases can be classified into two types:

Relational database: uses a relational model to organize data. A relational model is a two-dimensional table model, so a relational database consists of two-dimensional tables and the relationships between them. You can use standard SQL to run complex queries across these tables.

The mainstream relational databases are Oracle, Microsoft SQL Server, DB2, MySQL, and PostgreSQL.

Non-relational database: typically stores data as key-value pairs whose structure is not fixed. Key-value pairs can be added flexibly as required, without defining relationships between tables. Complex queries, such as joins across tables, are generally limited or not supported.

The mainstream non-relational databases are Redis, HBase, and MongoDB.
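The contrast between the two models can be sketched in a few lines of Python. This is only an illustration: the built-in sqlite3 module stands in for a relational database, a plain dictionary stands in for a key-value store, and the table names, keys, and sample values are hypothetical.

```python
import sqlite3

# Relational side: two tables linked by a key, queried with a SQL join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
conn.close()

# Key-value side: each value is looked up by its key, and entries
# do not have to share the same structure.
kv_store = {
    "customer:1": {"name": "Alice", "orders": [25.0, 40.0]},
    "customer:2": {"name": "Bob", "orders": [15.0], "vip": True},  # extra field
}
alice_total = sum(kv_store["customer:1"]["orders"])
```

The relational query combines two tables in one statement, while the key-value lookup only works because the application already knows which key to ask for.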


Mainstream Databases


Relational Database - ACID

In a relational database, information is stored in two-dimensional tables. A relational database contains multiple two-dimensional tables that are associated with each other. A relational database must comply with the ACID properties.

Atomicity

A transaction is the smallest unit of work in a relational database. Either all operations in a transaction take effect or none of them do.

Consistency

A transaction moves the database from one valid state to another. Database integrity constraints hold both before the transaction starts and after it ends.

Isolation

When multiple transactions run concurrently, they are isolated from each other. A transaction should not affect the execution of other transactions.

Durability

After a transaction is committed, its changes are permanently stored in the database and are not rolled back, even if the system fails afterward.
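Atomicity can be illustrated with a small transfer between two accounts. The sketch below uses Python's built-in sqlite3 module as a stand-in for any relational database; the accounts table, the CHECK constraint, and the transfer helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "a", "b", 30)       # both updates are committed
failed = transfer(conn, "a", "b", 500)  # debit violates CHECK; credit is undone too
final = balances(conn)
```

In the second call the credit to account b has already been applied when the debit fails, yet the rollback removes it, so the balances stay as they were after the first transfer.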


Relational Database - Transaction Isolation Levels

In highly concurrent enterprise workloads, the most complex transaction problems are caused by transaction isolation. When multiple transactions are processed in parallel, relational databases usually use locks to ensure transaction isolation at different levels.

Read phenomena

Dirty reads

– A dirty read occurs when a transaction is allowed to read data that another transaction has modified but not yet committed. That data may later be rolled back, which violates consistency.

Non-repeatable reads

– A non-repeatable read occurs when two identical queries within the same transaction return different data because another transaction committed modifications in between.

Phantom reads

– A phantom read occurs when a transaction repeats a query and sees rows that were newly inserted and committed by another transaction in the meantime.

Lost updates

– A lost update occurs when a change committed by one transaction is overwritten by another transaction, for example when a canceled transaction restores an older value.


Transaction isolation levels are used to prevent the preceding issues. ANSI/ISO SQL defines the standard isolation levels as follows:

Read uncommitted

It is the lowest isolation level. Dirty reads are allowed. As a result, one transaction may see not-yet-committed changes made by other transactions.

Read committed

Only committed data is read, which avoids dirty reads. Non-repeatable reads may still occur.

It is the default isolation level of the Oracle database.

Repeatable read

Dirty reads and non-repeatable reads are avoided. However, phantom reads may occur.

It is the default isolation level of the MySQL database.

Serializable

It is the highest isolation level. Dirty reads, non-repeatable reads, and phantom reads do not occur.
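The relationship between the four levels and the three read phenomena can be written down as a small lookup table. The sketch below encodes only what the ANSI/ISO definitions permit; actual database engines may be stricter than the standard requires, and the helper name is hypothetical.

```python
# Read phenomena that each ANSI/ISO SQL isolation level still permits,
# listed from the weakest level to the strongest.
PERMITTED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}
LEVELS = list(PERMITTED)  # dictionaries preserve insertion order

def weakest_level_preventing(phenomena):
    """Return the weakest level at which none of the given phenomena can occur."""
    wanted = set(phenomena)
    unknown = wanted - PERMITTED["READ UNCOMMITTED"]
    if unknown:
        raise ValueError(f"unknown phenomena: {sorted(unknown)}")
    for level in LEVELS:
        if not PERMITTED[level] & wanted:
            return level

level_for_dirty = weakest_level_preventing(["dirty read"])
level_for_phantom = weakest_level_preventing(["dirty read", "phantom read"])
```

Choosing the weakest level that rules out the phenomena an application cannot tolerate is a common way to balance correctness against concurrency.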


Relational Database - Redo and Undo

The database uses logs to ensure the atomicity, consistency, and durability of transactions. Database logs are classified into redo logs and undo logs.

Redo logs record database changes. A relational database uses write-ahead transaction logs to ensure durability: modifications made by transactions are written to the transaction log before they are written to the database files. After a crash, the database checks the redo logs first and reapplies the logged changes that had not yet been persisted to the data files.

Undo logs store the values that data had before it was modified. Whenever data is modified, undo information is generated to support consistent reads and rollback.

Space in both undo logs and redo logs can be reused.
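The interplay of redo and undo logs can be sketched with a toy in-memory store. This is a deliberately simplified model, not how any particular database implements logging: the TinyStore class and its methods are hypothetical, a value of None stands for "key did not exist", and recovery only replays committed transactions.

```python
class TinyStore:
    """Toy key-value store with a redo log (new values) and an undo log (old values)."""

    def __init__(self):
        self.data_file = {}  # stands in for the on-disk data files
        self.redo_log = []   # (txn_id, key, new_value) and (txn_id, "COMMIT") records
        self.undo_log = {}   # txn_id -> list of (key, old_value)
        self.cache = {}      # in-memory copy of modified data, lost on a crash

    def write(self, txn_id, key, value):
        # Write-ahead rule: record the change in the logs before applying it.
        old = self.cache.get(key, self.data_file.get(key))
        self.undo_log.setdefault(txn_id, []).append((key, old))
        self.redo_log.append((txn_id, key, value))
        self.cache[key] = value

    def commit(self, txn_id):
        self.redo_log.append((txn_id, "COMMIT"))
        self.undo_log.pop(txn_id, None)

    def rollback(self, txn_id):
        # Restore the saved old values in reverse order.
        for key, old in reversed(self.undo_log.pop(txn_id, [])):
            if old is None:
                self.cache.pop(key, None)
            else:
                self.cache[key] = old

    def crash_and_recover(self):
        # The cache is lost; replay redo records of committed transactions only.
        self.cache = {}
        committed = {rec[0] for rec in self.redo_log if rec[1:] == ("COMMIT",)}
        for rec in self.redo_log:
            if len(rec) == 3 and rec[0] in committed:
                self.data_file[rec[1]] = rec[2]

store = TinyStore()
store.write(1, "x", 10)
store.commit(1)
store.write(2, "y", 20)  # transaction 2 never commits
store.crash_and_recover()
recovered = dict(store.data_file)
```

After recovery only the change made by the committed transaction is present in the data file, which is the durability guarantee the redo log provides.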


Relational Database - Lock Mechanism

The lock mechanism is a key feature that distinguishes a database system from a file system. It is used to manage concurrent access to shared resources.

There are two types of locks in a database: locks and latches.

Locks are used to lock objects such as tables, pages, and rows in a database. The database management system uses the lock mechanism to isolate transactions. When multiple transactions update the same data at the same time, only the transaction that holds the lock can update the data. Other transactions must wait until that transaction releases the lock before they can update the data.

A latch is a lightweight lock that is held only for a short time. In the InnoDB storage engine, latches are classified into mutexes and rwlocks.

Lock implementations vary from one database to another.
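The waiting behavior described above can be illustrated with threads that update the same value. Note the assumption: Python's threading.Lock is an in-memory mutex, which is closer to what this slide calls a latch than to a transactional row lock. It is used here only to show mutual exclusion, and the row dictionary and add_stock function are hypothetical.

```python
import threading

row = {"stock": 0}
row_lock = threading.Lock()  # plays the role of the lock on one row

def add_stock(times):
    for _ in range(times):
        # Only the thread holding the lock may update; the others wait here.
        with row_lock:
            current = row["stock"]
            row["stock"] = current + 1

threads = [threading.Thread(target=add_stock, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_stock = row["stock"]
```

Because every read-modify-write happens while the lock is held, no increment is lost and the final value equals the total number of updates.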


Database Service Scenarios - OLTP and OLAP

Data processing is classified into:

Online transaction processing (OLTP): a transaction-oriented processing system. It processes small transactions and queries and responds quickly to user operations.

It processes small data volumes and small transactions in real time.

It places high demands on the database memory hit ratio, concurrent operations, and disk I/O latency.

Online analytical processing (OLAP): also called a decision support system (DSS). It analyzes current and historical user data, queries data, and generates reports to support management and decision-making.

It processes large amounts of data and complex queries, and is not time sensitive.

It emphasizes SQL execution duration and disk I/O bandwidth.


OLTP VS OLAP

Commercial databases

OLTP: Outside China: Oracle, DB2, and SQL Server. China: openGauss, OceanBase, and GBase 8t.

OLAP: Outside China: Oracle (Exadata), Teradata, Greenplum, and SAP HANA. China: GBase 8a, Dameng, and Gauss.

Open-source databases

OLTP: MySQL, MariaDB, and PostgreSQL

OLAP: Greenplum (open-source edition)

Test criteria

OLTP: TPC-C

OLAP: TPC-H

Optimal storage modes

OLTP: Row store

OLAP: Column store

Tuning methods

OLTP:

• Improve memory hit ratio.

• Tune indexes.

• Accelerate disk access speed.

• Improve concurrency control.

OLAP:

• Tune partitioned tables.

• Increase concurrency.

• Increase disk I/O bandwidth.


Row Store VS Column Store

Logical storage unit

Row store: A row of data is the basic logical storage unit.

Column store: A column of data is the basic logical storage unit.

Write performance

Row store: A row of data is written in a single operation.

Column store: A row of data is split into its individual columns for storage, so it is written in multiple operations.

Read performance

Row store: A row of data is read in full. If only a few columns are required, the redundant columns are read as well.

Column store: Only part or all of the required columns are read, so no redundant data is read.

Scenario

Row store:

• Applicable to random data insertion, deletion, modification, and query operations.

• Frequent insertions or updates are involved.

Column store: Applicable to queries or aggregation over a large amount of data.
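The read and write trade-off in the comparison above can be sketched with two in-memory layouts of the same records. The field names and the counters that model "how much data is touched" are assumptions for illustration; real storage engines work on pages and blocks rather than Python objects.

```python
# The same three records in a row-oriented and a column-oriented layout.
row_store = [
    {"id": 1, "region": "north", "amount": 120},
    {"id": 2, "region": "south", "amount": 80},
    {"id": 3, "region": "north", "amount": 50},
]
column_store = {
    "id": [1, 2, 3],
    "region": ["north", "south", "north"],
    "amount": [120, 80, 50],
}

def total_amount_rows(rows):
    """Aggregate over rows: models reading every full record, unused fields included."""
    fields_read = sum(len(r) for r in rows)
    return sum(r["amount"] for r in rows), fields_read

def total_amount_columns(columns):
    """Aggregate over columns: only the 'amount' column is read."""
    values = columns["amount"]
    return sum(values), len(values)

def insert_row(rows, record):
    rows.append(record)  # one write for the whole row
    return 1

def insert_columns(columns, record):
    for name, value in record.items():
        columns[name].append(value)  # one write per column
    return len(record)

row_total, row_fields = total_amount_rows(row_store)
col_total, col_fields = total_amount_columns(column_store)
new_record = {"id": 4, "region": "south", "amount": 30}
row_writes = insert_row(row_store, dict(new_record))
col_writes = insert_columns(column_store, new_record)
```

Both layouts return the same total, but the column layout reads a third of the values for the aggregation and needs three writes instead of one for the insert.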


Database Development Trends

[Figure: five database development trends, with "Databases" and "Cloud data center" as diagram labels]

• AI autonomy: AI-based acceleration of database indexing, query, O&M, and fault prediction

• Multiple DCs: Multi-active DCs, backup and DR (hybrid cloud), and unified management and scheduling are used to meet distributed and HA requirements of services in different regions.

• Distributed computing: Exponential growth of data volume, vertical splitting of database services, read/write separation, database/table sharding, and distributed databases

• Cloud: Elastic services, resource sharing, cloud-based databases, storage-compute decoupling, vertical collaboration and tuning, multi-tenant & QoS, and high security and reliability

• Multi-mode engine: Emergence of new services (such as IoT) and new scenarios (such as risk control), and collaboration of multiple database types (such as schema-less, NewSQL, HTAP, graph, and TS)


Industry Challenges and Database Technology Trends

Traditional databases have evolved from standalone databases to primary/standby databases and then to real application clusters (RACs). However, the performance scalability of the centralized RAC architecture is limited. Distributed databases have become the mainstream way to cope with a large number of concurrent requests.

(Note: In the database scenario, each thread processes 10 concurrent requests at the same time. A single RAC node can process a maximum of 1,000 concurrent requests. With more than three nodes, the RAC architecture no longer scales linearly.)

Two Internet companies, Alibaba (OceanBase) and Tencent (TDSQL), have developed thriving distributed databases to support their own services. PingCAP's distributed database TiDB is further exploring the enterprise market.

Cloud-based databases are deployed in multiple modes, such as multiple instances on physical machines, Docker containers, and VMs. However, this demands shorter I/O latency for both network and storage.

[Figure: database architecture evolution in three stages. Traditional databases: a primary database and a standby database, and RAC1/RAC2 nodes on shared storage. Database/table partitioning: primary databases 1 to N, each with standby databases, on local or shared disks. Distributed databases: proxy route, Global Transaction Management (GTM), SQL control nodes, global clock, database nodes, and cluster management, with data partitions such as A(P1) and A(P2) spread across primary and standby databases.]
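Database/table sharding and the proxy route in a distributed database both depend on a rule that maps each row to one database node. A minimal sketch of hash-based routing is shown below. It is only one possible scheme (range-based and directory-based routing are also common), and the ShardedTable class, the shard count, and the user_id key are hypothetical.

```python
import hashlib

def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used to keep the mapping repeatable.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

class ShardedTable:
    """Toy table whose rows are spread over several in-memory shards."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key, row):
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)

table = ShardedTable(shard_count=4)
for user_id in range(1000):
    table.put(user_id, {"user_id": user_id})
sizes = [len(shard) for shard in table.shards]
```

Each request touches exactly one shard, which is what lets a distributed database add nodes to handle more concurrent requests. The simple modulo rule does mean that changing the shard count remaps most keys, which is one reason production systems often use more elaborate schemes.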


TaiShan Database Solution Ecosystem Planning

[Figure: TaiShan database ecosystem in three layers, with the status labels "Adapted" and "Future plans"]

Industry applications

• Internet: e-commerce and public cloud platforms

• Government: e-government and smart city

• Finance: data mining and risk control

• Carrier: intelligent O&M and intelligent operations

• Large enterprise: intelligent manufacturing and more

Database platforms

• Huawei-developed: GaussDB OLAP, GaussDB OLTP, and openGauss

• Open-source databases

• Database partners

Infrastructure

• Kunpeng processors, TaiShan servers, SSDs, Atlas AI accelerator cards, and iNICs


Kunpeng Database Solution Advantages

OLAP scenario

The multi-core processors and multi-channel memory suit data analysis with high I/O throughput and large data volumes. The performance can be improved by up to 20%.

OLTP scenario

In the OLTP multi-instance deployment scenario, the performance is improved by 10% (the processor needs to be specified for each instance).

Open ecosystem

Supports mainstream open-source databases and mainstream Chinese domestic databases.

* The preceding data is based on a comparison between a TaiShan 200 server (2 x 920 5250) and an x86 dual-socket server (2 x 6148) in Huawei labs.
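The note that the processor needs to be specified for each instance amounts to giving every database instance its own set of cores. A minimal sketch of such a plan is shown below. The core count of 96 and the instance count of 4 are assumptions for illustration, not figures from this document, and the helper names are hypothetical. On Linux, the resulting lists could be applied with a tool such as `taskset -c` or with Python's `os.sched_setaffinity`.

```python
def assign_cores(total_cores, instance_count):
    """Split core IDs 0..total_cores-1 into contiguous, non-overlapping ranges."""
    if instance_count <= 0 or instance_count > total_cores:
        raise ValueError("need between 1 and total_cores instances")
    base, extra = divmod(total_cores, instance_count)
    plan, start = [], 0
    for i in range(instance_count):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first instances
        plan.append(range(start, start + size))
        start += size
    return plan

def cpu_list(cores):
    """Format a core range as a CPU list string such as '0-23'."""
    if len(cores) == 1:
        return str(cores.start)
    return f"{cores.start}-{cores.stop - 1}"

# Hypothetical server with 96 cores running 4 database instances.
plan = assign_cores(total_cores=96, instance_count=4)
cpu_lists = [cpu_list(cores) for cores in plan]
```

Keeping each range contiguous tends to keep an instance on cores that share the same memory controller, although the actual core-to-socket numbering depends on the platform and should be checked before binding.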


Summary: Why Kunpeng Database?

• High performance:

Multi-instance and distributed deployment for OLTP applications; higher performance than mainstream x86 configurations for OLAP applications

• Thriving ecosystem:

Supports mainstream open-source software and Chinese domestic commercial software.


Copyright©2021 Huawei Technologies Co., Ltd. All Rights Reserved.

The information in this document may contain predictive statements including, without

limitation, statements regarding the future financial and operating results, future product

portfolio, new technology, etc. There are a number of factors that could cause actual

results and developments to differ materially from those expressed or implied in the

predictive statements. Therefore, such information is provided for reference purpose

only and constitutes neither an offer nor an acceptance. Huawei may change the

information at any time without notice.

Thank You.
