Huawei Kunpeng Computing Database Solution
Contents
1. Database Overview
2. Trend Insight and Kunpeng Computing Database
Database Overview
A database is an organized repository of data. It can store anywhere from hundreds to
millions of data records according to defined rules. If data is stored without such rules,
queries become inefficient.
A database management system (DBMS), the core component of a database system, is used to
operate and manage the database: creating database objects; querying, adding, modifying,
and deleting data stored in the database; and managing database users and permissions.
Relational Database VS Non-relational Database
Databases can be classified into:
Relational database: uses a relational model to organize data. A relational model is a
two-dimensional table model, so a relational database is a collection of two-dimensional
tables and the relationships between them. You can use standard SQL to perform complex
queries across these tables.
The mainstream relational databases are Oracle, Microsoft SQL Server, DB2, MySQL, and PostgreSQL.
Non-relational database: stores data in structures that are not fixed, typically key-value
pairs. Key-value pairs can be added flexibly as required without involving inter-table
relationships. Complex queries such as multi-table joins are generally not supported.
The mainstream non-relational databases are Redis, HBase, and MongoDB.
Mainstream Databases
Relational Database - ACID
In a relational database, information is stored in two-dimensional tables that are associated
with each other. A relational database must provide the ACID properties.
Atomicity
A transaction is the smallest unit of work in a relational database. Either all operations in a
transaction take effect or none of them do.
Consistency
A transaction does not break the database integrity constraints: they hold both before the
transaction starts and after it ends.
Isolation
When multiple transactions access the database concurrently, they are isolated from each other. A
transaction should not affect the running of other transactions.
Durability
After a transaction is committed, its changes to the database are stored permanently and are not
rolled back.
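As a minimal sketch of atomicity, the example below uses Python's built-in sqlite3 module. The account table, the names, and the transfer helper are assumptions made for illustration only. The credit is applied first and the debit second, so when the debit violates the CHECK constraint, the already-applied credit has to be rolled back with it.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with a hypothetical account table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates succeed or neither does."""
    try:
        # The connection context manager commits on success and rolls back
        # the whole transaction if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint, so the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


bank = open_bank()
print(transfer(bank, "alice", "bob", 500), balances(bank))
print(transfer(bank, "alice", "bob", 30), balances(bank))
```

The failed transfer leaves both balances untouched, which is also the consistency property: the integrity constraint holds before and after the transaction.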
Relational Database - Transaction Isolation Levels
In concurrent enterprise workloads, the most complex transaction problems are caused by
transaction isolation. When multiple transactions are processed in parallel,
relational databases usually use locks to ensure transaction isolation at different levels.
Read phenomena
Dirty reads
– A dirty read occurs when a transaction is allowed to read data that another transaction has
modified but not yet committed. That data may later be rolled back, which violates consistency.
Non-repeatable reads
– A non-repeatable read occurs when two identical queries within one transaction return
different data because another transaction committed a modification between the two
queries.
Phantom reads
– A phantom read occurs when a transaction repeats a query and the result set contains rows that
were newly inserted and committed by another transaction in the meantime.
Lost updates
– A lost update occurs when a transaction is canceled (rolled back) and, as a result, updated data
committed by another transaction is overwritten.
Relational Database - Transaction Isolation Levels
Transaction isolation levels are used to prevent the preceding issues. ANSI/ISO SQL
defines the standard isolation levels as follows:
Read uncommitted
It is the lowest isolation level. Dirty reads are allowed. As a result, one transaction may see
not-yet-committed changes made by other transactions.
Read committed
Only committed data is read, which avoids dirty reads. Repeatable reads are not guaranteed, so
non-repeatable reads may occur.
It is the default isolation level of the Oracle database.
Repeatable read
Dirty reads and non-repeatable reads are avoided. However, phantom reads may occur.
It is the default isolation level of the MySQL database (InnoDB storage engine).
Serializable
It is the highest isolation level. Dirty reads, non-repeatable reads, and phantom reads do not
occur.
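The four levels and the three read phenomena can be summarized as a small lookup table. The sketch below only restates the standard definitions listed above; the names ALLOWED and weakest_level_preventing are hypothetical, and real database engines may behave more strictly than the standard requires.

```python
# Read phenomena that each ANSI/ISO isolation level still permits,
# ordered from the lowest level to the highest.
ALLOWED = {
    "read uncommitted": {"dirty read", "non-repeatable read", "phantom read"},
    "read committed": {"non-repeatable read", "phantom read"},
    "repeatable read": {"phantom read"},
    "serializable": set(),
}


def weakest_level_preventing(*phenomena):
    """Return the lowest isolation level that rules out all given phenomena."""
    unwanted = set(phenomena)
    for level, allowed in ALLOWED.items():  # dicts keep insertion order
        if not (allowed & unwanted):
            return level
    raise ValueError("unknown phenomenon in %r" % (phenomena,))


for phenomenon in ("dirty read", "non-repeatable read", "phantom read"):
    print(phenomenon, "->", weakest_level_preventing(phenomenon))
```

Choosing the weakest level that prevents the phenomena an application cares about is a common trade-off, because higher levels usually cost more in locking and concurrency.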
Relational Database - Redo and Undo
The database uses logs to ensure the atomicity, consistency, and durability of
transactions. Database logs are classified into redo logs and undo logs.
Redo logs record database changes. A relational database uses write-ahead logging to
ensure durability: modifications made by transactions are written to the transaction log
before being written to the database files. When the database restarts after a crash, the redo
logs are checked first, and committed changes that had not yet been persisted to the database
files are replayed.
Undo logs store the values that data had before it was modified. When data is modified, undo
information is generated for consistent reads and rollback.
Both undo log and redo log space can be reused once it is no longer needed.
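The interplay of the two logs can be sketched with a toy in-memory store. Everything here (the TinyStore class, its method names, and the record layout) is a hypothetical simplification: real engines log at the page or record level, checkpoint, and handle concurrency, none of which is modeled.

```python
_MISSING = object()  # marks "key did not exist" in an undo record


class TinyStore:
    """Toy key-value store with a redo log and an undo log."""

    def __init__(self):
        self.data_file = {}  # stands in for the durable database files
        self.redo_log = []   # stands in for the durable write-ahead log
        self.buffer = {}     # volatile in-memory state, lost on a crash
        self.undo_log = {}   # transaction id -> list of (key, old value)

    def read(self, key):
        return self.buffer.get(key)

    def write(self, tx, key, value):
        old = self.buffer.get(key, _MISSING)
        self.undo_log.setdefault(tx, []).append((key, old))  # value before change
        self.redo_log.append(("write", tx, key, value))      # log first ...
        self.buffer[key] = value                             # ... then modify

    def commit(self, tx):
        self.redo_log.append(("commit", tx))
        self.undo_log.pop(tx, None)

    def rollback(self, tx):
        # Restore the old values in reverse order of modification.
        for key, old in reversed(self.undo_log.pop(tx, [])):
            if old is _MISSING:
                self.buffer.pop(key, None)
            else:
                self.buffer[key] = old

    def crash_and_recover(self):
        """Drop volatile state, then replay redo records of committed work."""
        committed = {rec[1] for rec in self.redo_log if rec[0] == "commit"}
        for rec in self.redo_log:
            if rec[0] == "write" and rec[1] in committed:
                self.data_file[rec[2]] = rec[3]
        self.buffer = dict(self.data_file)
        self.undo_log = {}


store = TinyStore()
store.write(1, "x", 10)
store.commit(1)          # committed, but never flushed to data_file
store.write(2, "x", 99)
store.rollback(2)        # undo log restores x = 10
store.write(3, "y", 7)   # still uncommitted at crash time
store.crash_and_recover()
print(store.data_file)
```

After recovery only the committed change survives: the rolled-back and uncommitted writes have redo records but no commit marker, so they are skipped during replay.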
Relational Database - Lock Mechanism
The lock mechanism is a key feature that distinguishes a database system from a file
system. It is used to manage concurrent access to shared resources.
There are two types of locks in a database: locks and latches.
Locks are applied to database objects such as tables, pages, and rows. The database
management system uses locks to isolate transactions. When multiple
transactions update the same data at the same time, only the transaction that
holds the lock can update the data. Other transactions must wait until that transaction
releases the lock before they can update the data.
A latch is a lightweight lock that is held only for a short time. In the InnoDB storage engine,
latches are classified into mutexes and rwlocks (read-write locks).
The lock implementation varies from database to database.
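The difference between a lock on a data object and a short-lived latch can be illustrated with Python threads. This is only an analogy under stated assumptions: RowLockTable and run_workers are hypothetical names, threading.Lock stands in for both kinds of lock, and no real database implements its lock manager this way.

```python
import threading
from collections import defaultdict


class RowLockTable:
    """Hands out one lock per row id."""

    def __init__(self):
        # Held only while the lock table itself is touched, similar in
        # spirit to a latch protecting an internal data structure.
        self._guard = threading.Lock()
        self._row_locks = defaultdict(threading.Lock)

    def lock_for(self, row_id):
        with self._guard:
            return self._row_locks[row_id]


def run_workers(thread_count, increments_each):
    """Let several threads update the same row under its row lock."""
    table = RowLockTable()
    rows = {"row-1": 0}

    def worker():
        for _ in range(increments_each):
            # Only the holder of the row lock may update the row;
            # the other threads wait until the lock is released.
            with table.lock_for("row-1"):
                rows["row-1"] += 1

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rows["row-1"]


print(run_workers(4, 1000))
```

Because every read-modify-write of the row happens while the row lock is held, no increment is lost regardless of how the threads are scheduled.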
Database Service Scenarios - OLTP and OLAP
Data processing is classified into:
Online transaction processing (OLTP): a transaction-oriented processing system. It processes
small transactions and queries and responds quickly to user operations.
It processes small data volumes and small transactions in real time.
It has high requirements on the database memory hit ratio, concurrent operations, and disk
I/O latency.
Online analytical processing (OLAP): also called a decision support system (DSS). It
analyzes current and historical user data, runs queries, and generates reports to support
management and decision-making.
It processes large amounts of data and complex queries, and is not time sensitive.
It emphasizes SQL execution duration and disk I/O bandwidth.
OLTP VS OLAP
Commercial databases
OLTP: Oracle, DB2, and SQL Server (outside China); openGauss, OceanBase, and GBase 8t (China)
OLAP: Oracle (Exadata), Teradata, Greenplum, and SAP HANA (outside China); GBase 8a, Dameng, and Gauss (China)
Open-source databases
OLTP: MySQL, MariaDB, and PostgreSQL
OLAP: Greenplum (open-source edition)
Test criteria
OLTP: TPC-C
OLAP: TPC-H
Optimal storage modes
OLTP: Row store
OLAP: Column store
Tuning methods
OLTP:
• Improve memory hit ratio.
• Tune indexes.
• Accelerate disk access speed.
• Improve concurrency control.
OLAP:
• Tune partitioned tables.
• Increase concurrency.
• Increase disk I/O bandwidth.
Row Store VS Column Store
Logical storage unit
Row store: Row data is the basic logical storage unit.
Column store: Column data is the basic logical storage unit.
Write performance
Row store: A row of data is written in a single operation.
Column store: A row of data is split into individual columns for storage and is written in multiple operations.
Read performance
Row store: A row of data is read in full. If only a few columns are required, the redundant columns are read as well.
Column store: Each read fetches part or all of a column, so no redundant data is read.
Scenario
Row store:
• Applicable to random data adding, deletion, modification, and query operations.
• Frequent insertions or updates are involved.
Column store: Applicable to query or aggregation of a large amount of data.
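The read-side difference can be made concrete by counting how many values each layout touches to aggregate one column. The sample data, the column names, and the helper functions below are assumptions for illustration; real engines read pages and compressed column segments rather than Python lists.

```python
COLUMNS = ("id", "region", "amount")

# Row store: each record is kept together.
ROW_STORE = [(1, "east", 120), (2, "west", 80), (3, "east", 200)]


def to_column_store(rows, columns):
    """Column store: each column is kept together."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def sum_from_rows(rows, columns, wanted):
    """Aggregate one column; every row is fetched in full."""
    index = columns.index(wanted)
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)  # redundant columns are read as well
        total += row[index]
    return total, values_read


def sum_from_columns(store, wanted):
    """Aggregate one column; only that column is fetched."""
    column = store[wanted]
    return sum(column), len(column)


COLUMN_STORE = to_column_store(ROW_STORE, COLUMNS)
print(sum_from_rows(ROW_STORE, COLUMNS, "amount"))
print(sum_from_columns(COLUMN_STORE, "amount"))
```

Both layouts return the same total, but the row layout reads every value of every row while the column layout reads only the aggregated column. The write side is the mirror image: inserting one record is a single append to ROW_STORE but one append per column in COLUMN_STORE.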
Database Development Trends
• AI autonomy: AI-based acceleration of database indexing, query, O&M, and fault
prediction
• Multiple DCs (cloud data center): Multi-active DCs, backup and DR (hybrid cloud), and
unified management and scheduling are used to meet the distributed and HA requirements of
services in different regions.
• Distributed computing: Exponential growth of data volume, vertical splitting of database
services, read/write separation, database/table sharding, and distributed databases
• Cloud: Elastic services, resource sharing, cloud-based databases, storage-compute
decoupling, vertical collaboration and tuning, multi-tenant & QoS, and high security and
reliability
• Multi-mode engine: Emergence of new services (such as IoT) and new scenarios (such as risk
control), and collaboration of multiple database types (such as schema-less, NewSQL, HTAP,
graph, and TS)
Industry Challenges and Database Technology Trends
Traditional databases have evolved from standalone databases to primary/standby databases and
then to real application clusters (RACs). However, the performance scalability of the centralized
RAC architecture is limited. Distributed databases have become the mainstream way to cope with
a large number of concurrent requests.
(Note: In the database scenario, each thread processes 10 concurrent requests at the same time. A
single RAC node can process a maximum of 1,000 concurrent requests. The RAC architecture
does not scale linearly beyond three nodes.)
Two Internet companies, Alibaba and Tencent, have developed the distributed databases OceanBase
and TDSQL based on their own service needs. PingCAP's distributed database TiDB is further
exploring the enterprise market.
Cloud-based databases are deployed in multiple modes, such as multiple instances on physical
machines, Docker containers, and VMs. However, these modes demand shorter I/O latency for both
network and storage.
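One building block of database/table sharding is the routing step that decides which database node holds a given key. The sketch below shows hash-based routing, which is only one possible strategy and is not claimed to be how OceanBase, TDSQL, or TiDB route requests. ShardRouter, shard_for, and the in-memory dictionaries standing in for database nodes are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index with a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardRouter:
    """Toy proxy that routes each key to one of several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate database node.
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


router = ShardRouter(4)
for n in range(100):
    router.put("user-%d" % n, n)
print([len(shard) for shard in router.shards])
```

Plain modulo hashing keeps the router simple, but changing the shard count remaps most keys. That is one reason distributed databases add components such as cluster management and global transaction management on top of routing.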
[Figure: Database architecture evolution. Traditional databases: a standalone primary database,
primary/standby databases, and RAC nodes (RAC1, RAC2) on shared storage. Database/table
partitioning: primary and standby databases holding partitions A(P1) and A(P2). Distributed
databases: proxy route, Global Transaction Management (GTM), SQL control nodes, global clock,
cluster management, and database nodes made up of primary databases 1 to N and their standby
databases on local or shared disks, serving the user databases.]
TaiShan Database Solution Ecosystem Planning
Industry applications
• Internet: e-commerce and public cloud platforms
• Government: e-government and smart city
• Finance: data mining and risk control
• Carrier: intelligent O&M and intelligent operations
• Large enterprise: intelligent manufacturing and more
Database platforms (TaiShan database ecosystem)
• Huawei-developed: GaussDB OLAP, GaussDB OLTP, and openGauss
• Open-source databases
• Database partners
Infrastructure
• Kunpeng processors, TaiShan servers, SSDs, Atlas AI accelerator cards, and iNICs
Status labels in the figure: Adapted (three entries) and Future plans (one entry)
Kunpeng Database Solution Advantages
OLAP scenario
The multi-core processors and multi-channel memory suit data analysis with high I/O
throughput and large data volumes. Performance can be improved by up to 20%.
OLTP scenario
In the OLTP multi-instance deployment scenario, performance is improved by 10% (the
processor needs to be specified for each instance).
Open ecosystem
Supports mainstream open-source databases.
Supports mainstream China home-made databases.
* The preceding data is based on a comparison between a TaiShan 200 server (2 x 920 5250) and an
x86 dual-socket server (2 x 6148) in Huawei labs.
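The OLTP note above says the processor needs to be specified for each instance. One way to read this is that each database instance is bound to its own set of CPU cores. The sketch below only computes such a plan; assign_cores and pin_process are hypothetical helpers, the core counts are illustrative rather than taken from the test setup, and NUMA topology is not modeled. On Linux, Python exposes os.sched_setaffinity for applying a plan to a running process.

```python
import os


def assign_cores(total_cores, instances):
    """Split core ids 0..total_cores-1 into contiguous groups, one per instance."""
    if instances <= 0 or total_cores < instances:
        raise ValueError("need at least one core per instance")
    base, extra = divmod(total_cores, instances)
    plan = []
    start = 0
    for i in range(instances):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        plan.append(list(range(start, start + size)))
        start += size
    return plan


def pin_process(pid, cores):
    """Bind a process to the given cores where the platform supports it."""
    if hasattr(os, "sched_setaffinity"):  # available on Linux only
        os.sched_setaffinity(pid, cores)
        return True
    return False


# Illustrative numbers: 8 cores shared by 2 database instances.
for instance, cores in enumerate(assign_cores(8, 2)):
    print("instance", instance, "-> cores", cores)
```

Keeping each instance on a fixed, contiguous group of cores avoids instances competing for the same cores, which is the intent of specifying a processor per instance.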
Summary: Why Kunpeng Database?
• High performance:
Multi-instance and distributed deployment for OLTP applications; higher performance
than mainstream x86 configurations for OLAP applications
• Prosperous ecosystem:
Supports mainstream open-source software and China home-made commercial
software.
Copyright©2021 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without
limitation, statements regarding the future financial and operating results, future product
portfolio, new technology, etc. There are a number of factors that could cause actual
results and developments to differ materially from those expressed or implied in the
predictive statements. Therefore, such information is provided for reference purpose
only and constitutes neither an offer nor an acceptance. Huawei may change the
information at any time without notice.
Thank You.