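Breaking the Limits of Relational Databases

Database technology is evolving and transforming at an accelerating pace.
NewSQL has emerged to combine a range of technologies, and the core capabilities achieved by combining them have driven the development of cloud-native databases.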

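This article takes an in-depth look at cloud-native database technology across the three types of NewSQL.
The new-architecture and Database-as-a-Service types involve many low-level database implementation details, so they are not covered here.
This article focuses on the core functions and implementation principles of transparent sharding middleware.
The core functions of the other two NewSQL types are similar to those of sharding middleware, but their implementation principles differ.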

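In terms of performance and availability, the traditional approach of storing all data centrally on a single data node can no longer cope with the massive data volumes generated by the Internet.
Most relational database products use B+ tree indexes.
When the data volume exceeds a threshold, the index grows deeper and each lookup needs more disk I/O operations, which significantly degrades query performance.
In addition, highly concurrent access requests turn a centralized database into the biggest bottleneck of the system.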

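Because traditional relational databases could not meet the requirements of the Internet, more and more attempts have been made to store data in NoSQL databases, which natively support data distribution.
However, NoSQL is not compatible with SQL, and its ecosystem is still maturing.
As a result, NoSQL cannot replace relational databases, and the position of relational databases remains secure.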

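Sharding means distributing the data stored in a single database across multiple databases or tables along some dimension in order to improve overall performance and availability.
Effective sharding measures for relational databases include database sharding and table sharding.
Both methods effectively prevent the query bottlenecks that arise when the data volume exceeds the threshold.

In addition, database sharding effectively spreads out the access requests that would otherwise hit a single database, while table sharding makes it possible to turn distributed transactions into local transactions wherever possible.
A sharding setup with multiple primary and secondary databases effectively prevents single points of failure and improves the availability of the data architecture.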

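Vertical sharding is also known as vertical partitioning.
Its key idea is to use different databases for different purposes.
Before sharding, a database may consist of multiple tables that belong to different business domains.
After sharding, the tables are grouped by business domain and distributed to different databases, which balances the workload across those databases, as shown below:

Vertical sharding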

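Horizontal sharding is also known as horizontal partitioning.
In contrast to vertical sharding, horizontal sharding does not organize data by business logic.
Instead, it distributes data to multiple databases or tables according to a rule applied to a specific field, and each shard contains only part of the data.

For example, if the last digit of an ID is 0 (that is, ID mod 10 equals 0), the record is stored in database (or table) 0; if the last digit is 1, the record is stored in database (or table) 1, as shown below: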

Horizontal sharding
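The ID mod 10 rule above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular middleware; the names SHARD_COUNT, shard_for, and actual_table, as well as the "t_order" table name, are assumptions made for the example.

```python
SHARD_COUNT = 10  # assumption: ten target databases (or tables), numbered 0 to 9


def shard_for(record_id: int) -> int:
    """Return the shard index for a non-negative integer ID (ID mod 10)."""
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    return record_id % SHARD_COUNT


def actual_table(logical_table: str, record_id: int) -> str:
    """Build a hypothetical physical table name such as 't_order_3'."""
    return f"{logical_table}_{shard_for(record_id)}"


if __name__ == "__main__":
    for record_id in (120, 31, 123):
        print(record_id, "->", actual_table("t_order", record_id))
```

Because the shard is derived only from the ID, every reader and writer that applies the same rule finds the same row in the same place without any lookup table.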

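Sharding is an effective solution to the performance problems that massive data volumes cause in relational databases.

In this solution, the data on a single node is split and stored in multiple databases or tables; that is, the data is sharded.
Database sharding effectively spreads the database load caused by highly concurrent access.
Although table sharding does not reduce the load on the database, it still lets you use the database's native ACID transactions to update data across table shards.
Once an update spans databases, the problem of distributed transactions becomes extremely complex.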

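Database sharding and table sharding keep the data volume of each table below the threshold.
Vertical sharding usually requires changes to the architecture and design, so it cannot keep up with the rapidly changing business requirements of the Internet.
Moreover, it does not effectively eliminate the single-point bottleneck.
In theory, horizontal sharding removes the data processing bottleneck of a single host and supports flexible scaling, which makes it the standard sharding solution.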

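Database sharding and read/write splitting are two common measures for handling heavy access traffic.
Although table sharding resolves the performance problems caused by massive data volumes, it does not resolve the slow responses caused by too many requests hitting the same database. For this reason, horizontal sharding is usually implemented as database sharding, so that it can handle both huge data volumes and heavy access traffic.
Read/write splitting is another way to distribute traffic.
However, the replication delay between a write and the point at which it becomes readable must be taken into account when designing the architecture.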

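Although database sharding resolves these problems, a distributed architecture introduces new ones.
Because data is widely scattered after database sharding or table sharding, application developers and operations staff face an extremely heavy workload when they operate on the database.
For example, they need to know which table shard and which database each piece of data lives in.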

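NewSQL with a new architecture and sharding middleware solve this problem in different ways:

In NewSQL with a new architecture, the database storage engine is redesigned so that data from the same table is stored in a distributed file system.
In sharding middleware, the effects of sharding are made transparent to users, so that they can use a horizontally sharded database as if it were an ordinary database.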

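Cross-database transactions pose a major challenge for distributed databases.
With appropriate table sharding, you can reduce the amount of data stored in each table and use local transactions as much as possible.
Making good use of different tables in the same database effectively avoids the problems caused by distributed transactions.
However, where cross-database transactions are unavoidable, some businesses still require the transactions to remain consistent.
On the other hand, Internet companies tend to reject XA-based distributed transactions because of their poor performance.
Instead, most of these companies use soft transactions to ensure eventual consistency.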

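As system access traffic grows, database throughput becomes a major bottleneck.
For applications with a large number of concurrent reads and only a few writes, a single database can be split into a primary database and a secondary database.
The primary database handles transactional inserts, deletes, and updates, while the secondary database handles queries.
This effectively avoids the row locking caused by data updates and significantly improves the query performance of the whole system.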

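If one primary database and multiple secondary databases are configured, query requests can be distributed evenly across multiple data replicas, which further increases the processing capacity of the system.

If multiple primary databases and multiple secondary databases are configured, both the throughput and the availability of the system improve.
In this configuration, the system can keep running normally when one of the databases goes down or its disk is physically damaged.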

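Read/write splitting is essentially a kind of sharding.
In horizontal sharding, data is distributed to different data nodes.
In read/write splitting, by contrast, read and write requests are routed to the secondary and primary databases respectively, based on the result of SQL syntax analysis.
Note that the data on different data nodes is identical in read/write splitting but different in horizontal sharding.
Combining horizontal sharding with read/write splitting improves system performance further, but it also makes system maintenance more complex.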
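As a rough illustration of routing by statement type, the sketch below sends SELECT statements to the secondary databases in round-robin order and everything else to the primary. The class name ReadWriteRouter and the data source names are hypothetical, and inspecting only the first keyword is a deliberate simplification: a real middleware works from the parsed SQL and also has to consider cases such as SELECT ... FOR UPDATE and reads inside a transaction.

```python
import itertools
from typing import Sequence


class ReadWriteRouter:
    """Route reads to secondaries (round-robin) and writes to the primary."""

    def __init__(self, primary: str, secondaries: Sequence[str]):
        self.primary = primary
        # cycle() yields the secondaries in order, forever
        self._secondaries = itertools.cycle(secondaries) if secondaries else None

    def route(self, sql: str) -> str:
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._secondaries is not None:
            return next(self._secondaries)
        # INSERT, UPDATE, DELETE, DDL, and anything unknown go to the primary
        return self.primary


if __name__ == "__main__":
    router = ReadWriteRouter("primary_ds", ["secondary_ds_0", "secondary_ds_1"])
    for statement in ("SELECT * FROM t_order", "UPDATE t_order SET status = 1",
                      "select 1"):
        print(statement, "->", router.route(statement))
```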

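Although read/write splitting improves the throughput and availability of the system, it also causes data inconsistency between multiple primary databases and between primary and secondary databases.
In addition, like sharding, read/write splitting makes database operations and maintenance more complex for application developers and operations staff.

The main goal of read/write splitting middleware is to make the effects of read/write splitting transparent to users, so that they can use the primary and secondary databases as if they were a single ordinary database.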

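Sharding involves the following steps: statement parsing, statement routing, statement rewriting, statement execution, and result merging.
Database protocol adaptation is essential for letting existing applications connect at low cost.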

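In addition to SQL, NewSQL is compatible with the protocols of traditional relational databases, which lowers the cost of adoption for users.
By implementing the protocols of open-source relational databases, NewSQL products present themselves as native relational databases.

Because MySQL and PostgreSQL are so popular, many NewSQL databases implement the MySQL and PostgreSQL wire protocols, so that MySQL and PostgreSQL users can access NewSQL products without modifying their business code.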

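MySQL Protocol

MySQL is currently the most popular open-source database product.
To understand its protocol, you can start with MySQL's basic data types, packet structure, connection phase, and command phase.

Basic Data Types

A MySQL packet is composed of the following basic data types defined by MySQL: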

Basic MySQL data types

When binary data needs to be converted to data that MySQL can understand, the MySQL packet is read according to the number of bytes predefined for each data type and converted to the corresponding number or string. Conversely, MySQL writes each field to the packet according to the length specified in the protocol.
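To make the byte-level reading and writing concrete, here is a small sketch for two of the protocol's basic types: the fixed-length integer int<n>, which the MySQL protocol stores in little-endian order, and the NUL-terminated string. The helper names are hypothetical, and the UTF-8 decoding is an assumption (the real character set is negotiated per connection).

```python
from typing import Tuple


def read_fixed_int(buf: bytes, offset: int, size: int) -> Tuple[int, int]:
    """Read an int<size> (little-endian); return (value, next offset)."""
    chunk = buf[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("buffer too short")
    return int.from_bytes(chunk, "little"), offset + size


def write_fixed_int(value: int, size: int) -> bytes:
    """Write value as an int<size> (little-endian)."""
    return value.to_bytes(size, "little")


def read_null_terminated(buf: bytes, offset: int) -> Tuple[str, int]:
    """Read a string terminated by 0x00; return (text, next offset)."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode("utf-8"), end + 1


if __name__ == "__main__":
    data = write_fixed_int(258, 2) + b"root\x00"
    number, pos = read_fixed_int(data, 0, 2)
    name, pos = read_null_terminated(data, pos)
    print(number, name, pos)
```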

Structure of a MySQL Packet

The MySQL protocol consists of one or more MySQL packets. Regardless of the type, a MySQL packet consists of the payload length, sequence ID, and payload.

The payload length is of the int<3> type. It indicates the total number of bytes occupied by the subsequent payload. Note that the payload length does not include the length of the sequence ID.
The sequence ID is of the int<1> type. It indicates the serial number of each MySQL packet returned for a request. The maximum sequence ID that fits in one byte is 0xff, that is, 255 in decimal notation. However, this does not mean that a request is limited to that many MySQL packets. For example, hundreds of thousands of records may be returned for a single request. In this case, the MySQL packets only need to keep their sequence IDs continuous: once the sequence ID exceeds 255, it is reset and restarts from zero.
The payload occupies the number of bytes declared by the payload length. In a MySQL packet, the payload is the actual business data, and its content varies with the packet type.
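The three parts of a packet can be sketched as follows: a 3-byte little-endian payload length, a 1-byte sequence ID that wraps around after 255, and the payload itself. The function names are hypothetical, and splitting very large payloads across several packets is not handled here.

```python
from typing import Tuple

MAX_PAYLOAD = 0xFFFFFF  # largest value a 3-byte length field can hold


def encode_packet(sequence_id: int, payload: bytes) -> bytes:
    """Prefix the payload with its int<3> length and an int<1> sequence ID."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one packet in this sketch")
    header = len(payload).to_bytes(3, "little") + bytes([sequence_id % 256])
    return header + payload


def decode_packet(data: bytes) -> Tuple[int, int, bytes]:
    """Return (payload length, sequence ID, payload) from raw packet bytes."""
    if len(data) < 4:
        raise ValueError("a packet header is 4 bytes long")
    length = int.from_bytes(data[0:3], "little")
    sequence_id = data[3]
    payload = data[4:4 + length]
    if len(payload) != length:
        raise ValueError("payload is shorter than the declared length")
    return length, sequence_id, payload


if __name__ == "__main__":
    # Sequence number 256 wraps around to 0 in the one-byte field.
    packet = encode_packet(256, b"abc")
    print(packet, decode_packet(packet))
```

Note that the declared length covers only the payload, so the full packet is always four bytes longer than the value in the length field.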

Connection Phase

In the connection phase, a communication channel is established between the MySQL client and server. Then, three tasks are completed in this phase: exchanging the capabilities of the MySQL client and server (Capability Negotiation), setting up an SSL communication channel, and authenticating the client against the server. The following figure shows the connection setup flow from the MySQL server perspective:

Flowchart of the MySQL connection phase

The figure does not show the interaction between the MySQL server and client. In practice, a MySQL connection is initiated by the client. When the MySQL server receives a connection request from the client, it exchanges capabilities with the client, generates the initial handshake packet in the format determined by the negotiation result, and writes the packet to the client. The packet contains the connection ID, the server's capabilities, and the ciphertext generated for authentication.

After receiving the handshake packet from the server, the MySQL client sends a handshake packet response. This packet contains the user name and encrypted password for accessing the database.

After receiving the handshake response, the MySQL server verifies the authentication information and returns the verification result to the client.

Command Phase

The command phase comes after the successful connection phase. In this phase, commands are executed. MySQL has a total of 32 command packets, whose specific types are listed below:

MySQL command packets

MySQL command packets are classified into four types: text protocol, binary protocol, stored procedure, and replication protocol.

The first byte of the payload identifies the command type. The function of each packet is indicated by its name. Some important MySQL command packets are described below:

COM_QUERY

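COM_QUERY is an important MySQL command for running queries in plain-text format.
It corresponds to java.sql.Statement in JDBC.
COM_QUERY itself is simple and consists of a command ID and an SQL statement: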

1              [03] COM_QUERY
string[EOF]    the query the server will execute
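Putting the layout above together with the packet header, a complete COM_QUERY packet could be assembled as in this sketch. The function name build_com_query is hypothetical, UTF-8 encoding of the SQL text is an assumption, and the sequence ID is 0 because a new command starts a new sequence.

```python
COM_QUERY = 0x03  # command ID shown in the layout above


def build_com_query(sql: str) -> bytes:
    """Return the raw bytes of a COM_QUERY packet for the given SQL text."""
    # Payload: one command byte followed by the SQL text up to the end.
    payload = bytes([COM_QUERY]) + sql.encode("utf-8")
    # Header: int<3> payload length (little-endian) and int<1> sequence ID 0.
    header = len(payload).to_bytes(3, "little") + b"\x00"
    return header + payload


if __name__ == "__main__":
    print(build_com_query("SELECT 1").hex(" "))
```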

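The COM_QUERY response is more complex, as shown below:

Flowchart of MySQL COM_QUERY

Depending on the scenario, one of four types of COM_QUERY response may be returned:
a query result, an update result, a file execution result, or an error.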

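If an error occurs during execution, such as a network disconnection or incorrect SQL syntax, the MySQL protocol sets the first byte of the packet to 0xff, encapsulates the error message in an ErrPacket, and returns it.

Because files are rarely used with COM_QUERY, this case is not discussed here.

For an update request, the MySQL protocol sets the first byte of the packet to 0x00 and returns an OkPacket.
The OkPacket must contain the number of rows affected by the update and the last inserted ID.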

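Query requests are the most complex.
For such a request, a separate FIELD_COUNT packet must be created first; it carries the number of result set fields, which the client obtains by reading an integer (an int<lenenc> in the MySQL protocol).
Then a separate COLUMN_DEFINITION packet is generated in turn for each column of the returned fields.
The field metadata ends with an EofPacket.
After that, Text Protocol Resultset Row packets are generated row by row, with every value converted to string format regardless of its data type.
Finally, the result set also ends with an EofPacket.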
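A client can tell the four response types apart by looking at the first byte of the first response payload. The sketch below is a simplified illustration with a hypothetical function name. The 0xff and 0x00 markers come from the text above; the 0xfb marker for the file (LOCAL INFILE) case is taken from the MySQL protocol documentation rather than from this article.

```python
def classify_com_query_response(payload: bytes) -> str:
    """Classify the first payload returned for a COM_QUERY command."""
    if not payload:
        raise ValueError("empty payload")
    first_byte = payload[0]
    if first_byte == 0xFF:
        return "error"          # ErrPacket
    if first_byte == 0x00:
        return "ok"             # OkPacket, returned for updates
    if first_byte == 0xFB:
        return "local_infile"   # file request (assumption, see lead-in)
    # Otherwise the payload is the field count that opens a result set.
    return "result_set"


if __name__ == "__main__":
    for sample in (b"\xff\x28\x04", b"\x00\x01\x00", b"\x02"):
        print(sample, "->", classify_com_query_response(sample))
```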

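A java.sql.PreparedStatement operation in JDBC consists of the following five MySQL binary protocol packets: COM_STMT_PREPARE, COM_STMT_EXECUTE, COM_STMT_CLOSE, COM_STMT_RESET, and COM_STMT_SEND_LONG_DATA.
Of these, COM_STMT_PREPARE and COM_STMT_EXECUTE are the most important.
They correspond, respectively, to connection.prepareStatement() and to the execute(), executeQuery(), and executeUpdate() calls on the resulting PreparedStatement in JDBC.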

COM_STMT_PREPARE

COM_STMT_PREPARE is similar to COM_QUERY: both consist of a command ID and an SQL statement:

1              [16] COM_STMT_PREPARE
string[EOF]    the query to prepare

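The response to COM_STMT_PREPARE is not a query result but a response packet consisting of the statement_id, the number of columns, and the number of parameters.
The statement_id is the unique identifier that MySQL assigns to an SQL statement once it has been precompiled.
Based on the statement_id, the corresponding SQL statement can be looked up in MySQL.

For an SQL statement registered with COM_STMT_PREPARE, only the statement_id (rather than the SQL statement itself) needs to be sent with the COM_STMT_EXECUTE command, which avoids unnecessary network bandwidth consumption.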

Moreover, MySQL can precompile the SQL statements passed in through COM_STMT_PREPARE into an abstract syntax tree for reuse, which improves SQL execution efficiency. If COM_QUERY is used instead, each statement has to be compiled again every time it is executed. For this reason, PreparedStatement is more efficient than Statement.

COM_STMT_EXECUTE

COM_STMT_EXECUTE consists of the statement-id and the parameters for the SQL. It uses a data structure named NULL-bitmap to identify the null values of these parameters.
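The NULL-bitmap can be sketched as follows, assuming the layout described in the MySQL protocol documentation for COM_STMT_EXECUTE: one bit per parameter, (parameter count + 7) / 8 bytes in total, with parameter i stored in bit i % 8 of byte i / 8. The function name is hypothetical, and Python's None stands in for SQL NULL.

```python
from typing import Optional, Sequence


def build_null_bitmap(params: Sequence[Optional[object]]) -> bytes:
    """Return the NULL-bitmap for the given parameter values."""
    bitmap = bytearray((len(params) + 7) // 8)
    for index, value in enumerate(params):
        if value is None:
            # Set the bit that belongs to this parameter position.
            bitmap[index // 8] |= 1 << (index % 8)
    return bytes(bitmap)


if __name__ == "__main__":
    # Parameters 0 and 2 are NULL, so bits 0 and 2 are set: 0b00000101.
    print(build_null_bitmap([None, 42, None]).hex())
```

Because NULL parameters are flagged in the bitmap, no value bytes need to be sent for them in the parameter list that follows.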

The response packet of the COM_STMT_EXECUTE command is similar to that of the COM_QUERY command. For both response packets, the field metadata and query result set are returned and separated by the EofPacket.

The difference is that the Text Protocol Resultset Row is replaced with the Binary Protocol Resultset Row in the COM_STMT_EXECUTE response. Each returned value is encoded as the corresponding MySQL basic data type according to its type, which further reduces the network bandwidth required.

Other Protocols

In addition to MySQL, the protocols of PostgreSQL and SQL Server are openly documented and can be implemented in the same way. In contrast, the protocol of another frequently used database, Oracle, is not public and cannot be implemented in the same way.

Although SQL is relatively simple compared to other programming languages, it is still a complete programming language. Parsing SQL grammar therefore works in essentially the same way as parsing other languages (such as Java, C, and Go).

The parsing process is divided into lexical parsing and syntactic parsing. First, the lexical parser splits the SQL statement into words that cannot be further divided. Then, the syntactic parser converts the SQL statement to an abstract syntax tree. Finally, the abstract syntax tree is accessed to extract the parsing context.
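The lexical step can be illustrated with a very small tokenizer. This is only a sketch under simplifying assumptions: it recognizes integers, identifiers, a handful of keywords, and a few operators, which is far from a full SQL lexer. The names TOKEN_PATTERN, KEYWORDS, and tokenize are hypothetical.

```python
import re
from typing import List, Tuple

# One token per match: an integer, a word, or an operator/punctuation symbol.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(\d+)|([A-Za-z_][A-Za-z_0-9]*)|(>=|<=|<>|!=|[=<>,*()]))"
)
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}  # deliberately incomplete


def tokenize(sql: str) -> List[Tuple[str, str]]:
    """Split an SQL string into (kind, text) pairs that cannot be divided further."""
    sql = sql.rstrip()
    tokens = []
    position = 0
    while position < len(sql):
        match = TOKEN_PATTERN.match(sql, position)
        if match is None:
            raise ValueError(f"unexpected character at position {position}")
        number, word, symbol = match.groups()
        if number is not None:
            tokens.append(("NUMBER", number))
        elif word is not None:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            tokens.append(("SYMBOL", symbol))
        position = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("select username from userinfo where age > 20"):
        print(token)
```

A syntactic parser would then consume this token list and build the tree, for example grouping the tokens after the where keyword into condition nodes.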

The parsing context includes tables, Select items, Order By items, Group By items, aggregate functions, pagination information, and query conditions. For NewSQL of the sharding middleware type, it also includes the placeholders that may need to be modified.

Taking the SQL statement select username, ismale from userinfo where age > 20 and level > 5 and 1=1 as an example, the abstract syntax tree after parsing is as follows:

Abstract syntax tree

Many third-party tools can be used to generate abstract syntax trees, among which ANTLR is a good choice. ANTLR generates Java code for the abstract syntax tree based on the rules defined by developers and provides a visitor interface. Compared with generated code, a hand-written parser executes more efficiently but takes considerably more work to build. In scenarios with demanding performance requirements, you can consider building a custom parser.

The sharding strategy matches databases and tables according to the parsing context and generates the routing path. SQL routing with sharding keys can be divided into single-shard routing (where the sharding operator is the equals sign), multi-shard routing (where the sharding operator is IN), and range routing (where the sharding operator is BETWEEN). SQL statements without sharding keys use broadcast routing.
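The four routing cases can be sketched for a simple modulo policy on an integer sharding key. The function name route, the SHARD_COUNT of 4, and the way the condition is passed in (an operator plus a list of values, or None when the statement has no sharding key) are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence

SHARD_COUNT = 4  # assumption: four shards, numbered 0 to 3


def route(operator: Optional[str], values: Sequence[int]) -> List[int]:
    """Return the shard indexes that a statement must be sent to."""
    all_shards = list(range(SHARD_COUNT))
    if operator is None:
        return all_shards                      # broadcast routing
    if operator == "=":
        return [values[0] % SHARD_COUNT]       # single-shard routing
    if operator == "IN":
        return sorted({v % SHARD_COUNT for v in values})  # multi-shard routing
    if operator == "BETWEEN":                  # range routing
        low, high = values
        if high < low:
            return []
        if high - low + 1 >= SHARD_COUNT:
            return all_shards                  # the range covers every shard
        return sorted({v % SHARD_COUNT for v in range(low, high + 1)})
    raise ValueError(f"unsupported sharding operator: {operator}")


if __name__ == "__main__":
    print(route("=", [10]))
    print(route("IN", [1, 5, 6]))
    print(route("BETWEEN", [6, 7]))
    print(route(None, []))
```

With a modulo policy, a wide BETWEEN range quickly touches every shard, which is one reason range-based policies are often preferred for keys that are mostly queried by range.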

Normally, sharding policies can be built into the database or configured by users. Built-in policies are relatively simple and generally include modulo on the trailing digits, hash, range, tag, time, and so on. User-configured policies are more flexible and can be customized as needed.

NewSQL with a new architecture does not require SQL statement rewriting; it is required only for NewSQL of the sharding middleware type. SQL statement rewriting turns a logical SQL statement into statements that can be executed correctly on the actual databases. This includes replacing the logical table name with the actual table name, rewriting the start and end values of the pagination information, adding the columns that are needed for sorting, grouping, and auto-increment keys, and rewriting AVG as SUM and COUNT.
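Three of these rewrites can be sketched briefly. All function names are hypothetical, and the table name replacement uses a plain regular expression, whereas a real middleware rewrites tokens located by the parser. The pagination rule reflects the fact that any shard might hold the rows that end up on the requested page, so each shard must return its first offset + row_count rows. The AVG rule works because an average of per-shard averages is generally wrong, while SUM and COUNT can be added up across shards.

```python
import re
from typing import Iterable, Tuple


def rewrite_table(sql: str, logical: str, actual: str) -> str:
    """Replace a logical table name with an actual one (whole words only)."""
    return re.sub(rf"\b{re.escape(logical)}\b", lambda _m: actual, sql)


def rewrite_limit(offset: int, row_count: int) -> Tuple[int, int]:
    """Rewrite LIMIT offset, row_count into the per-shard LIMIT values."""
    return 0, offset + row_count


def merge_avg(partials: Iterable[Tuple[float, int]]) -> float:
    """Combine per-shard (SUM, COUNT) pairs into the overall average."""
    total, count = 0.0, 0
    for shard_sum, shard_count in partials:
        total += shard_sum
        count += shard_count
    if count == 0:
        raise ValueError("no rows to average")
    return total / count


if __name__ == "__main__":
    print(rewrite_table("SELECT * FROM t_order WHERE id = 1", "t_order", "t_order_1"))
    print(rewrite_limit(20, 10))
    print(merge_avg([(10.0, 2), (50.0, 3)]))
```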

Results merging refers to merging multiple execution result sets into one result set and returning it to the application. Results merging is divided into stream merging and memory merging.

Stream merging is used for simple queries, Order By queries, Group By queries, and Order By and Group By scenarios where the Order By and Group By items are completely consistent. The “next” method is called each time to traverse the stream merging result set without consuming additional memory resources.
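Memory merging requires all the data in the result sets to be loaded into memory for processing.
If the result sets contain a large amount of data, a correspondingly large amount of memory is consumed.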
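The contrast between the two approaches can be sketched for an Order By query, assuming each shard already returns its rows sorted by the same key. The function names are hypothetical. The stream version relies on heapq.merge from the Python standard library, which pulls one row at a time from each input; the memory version loads everything before sorting.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple


def stream_merge(shard_results: Iterable[Iterable[Row]],
                 key: Callable[[Row], object]) -> Iterator[Row]:
    """Lazily merge already-sorted shard results; holds one row per shard."""
    return heapq.merge(*shard_results, key=key)


def memory_merge(shard_results: Iterable[Iterable[Row]],
                 key: Callable[[Row], object]) -> List[Row]:
    """Load every row into memory, then sort; works for unsorted inputs too."""
    rows = [row for result in shard_results for row in result]
    rows.sort(key=key)
    return rows


if __name__ == "__main__":
    shard_a = iter([(1, "a"), (4, "d")])
    shard_b = iter([(2, "b"), (3, "c")])
    merged = stream_merge([shard_a, shard_b], key=lambda row: row[0])
    print(next(merged))   # each call to next() advances the merged cursor
    print(list(merged))
```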

In Part 2 of this article, we will discuss distributed transactions and database governance in more detail.