Disclosure of Invention
The invention aims to provide a method for controlling the consistency of non-relational data after random-sequence reading and writing. The control method protects the consistency of a non-relational database by means of the decentralization, tamper resistance and strong consistency of a block chain and, exploiting the storage characteristics of non-relational data, establishes a mapping only between keys and data blocks. This saves the computing power and time the system spends on consistency verification and improves the operating efficiency of the distributed system in which the control method is implemented.
The invention adopts the following technical scheme:
a method for controlling consistency of non-relational data after random sequence reading and writing comprises setting a routing site; the routing site is served by at least one node on a distributed system; the routing site is used for responding to read/write requests of clients to the database and for collecting statistics on the load capacity and transaction processing progress of each node on the current distributed system; the control method further comprises setting at least n master sites, the master sites are used for responding to write operations, and a monitoring module monitors, while data is being written, whether another write request for the data block currently being written exists on the distributed system; the control method further comprises setting at least m slave sites; the slave sites have read-only permission, and a slave site is allowed to read a data block only after the master site has responded to the read request;
the distributed system internally maintains an index main chain for storing the key index directory of the non-relational database; the index main chain is established as a consortium chain, and any node outside the distributed system is refused participation in any operation of the index main chain; each block of the index main chain comprises the address information of the address blocks in which the key-value pairs < key-value > written into the non-relational database are located; when the number of write operation records of the non-relational database reaches a specified threshold, the index main chain requires all nodes on the chain to complete the write-and-package operation of the previous block, and a new block is created within a specified time threshold; the routing site, the master sites and the slave sites verify against the index record of the last block of the index main chain; the index record comprises the address of the data block in which a key is stored in the non-relational database, and the record information of the index main chain serves as the final basis for determining the legal data block;
the routing site is determined by an election, held by the full chain of the index main chain, among the candidate nodes on the chain; the election of the routing site is carried out periodically; before the routing site is elected, each candidate node completes at least one response test with multiple concurrent tasks, the candidate nodes are ranked by concurrency performance, and at least one candidate node is selected as the routing site;
the routing site responds, through a program interface, to external read/write requests for the non-relational database and distinguishes read requests from write requests; read requests are dispatched to the slave sites for processing; write requests are mandatorily recorded in the log, and each write request is given a request timestamp;
the master sites are determined by verification, performed by the full chain of the index main chain, of the candidate nodes on the chain; the election of the master sites is carried out periodically; before the master sites are elected, each candidate node completes at least one write test; the write test examines at least the sequential write performance, random write performance and write latency of the candidate node; the index main chain compares the write performance of the candidate nodes and, according to the write requirements of the non-relational database, selects at least n candidate nodes as master sites;
the master site writes the key-value pair < key-value > of the non-relational data by a multi-level cache writing method and establishes the data block address corresponding to the written position; the master site marks the write state of the data block address, the write state being at least one of a waiting-for-write state, a writing state and a write-refused state;
the slave site responds to the read actions dispatched by the routing site; before reading a data block address, the slave site reads the write state of that address, and reads only key-value pairs < key-value > in data block addresses whose write state is the waiting-for-write state or the write-refused state;
the plurality of master sites monitor the number of completed write operations; when the number of write operations exceeds a threshold, the master sites traverse the log records and perform a duplicate check on the keys of the key-value pairs < key-value > recorded in all data block addresses that have undergone write operations; for different address blocks holding completely identical key-value pairs < key-value >, the master sites perform consensus verification to confirm one address block as the legal address block and establish the association between the key-value pair and the address block; for two or more key-value pairs < key-value > having the same key but different values, the master sites verify, through the consensus mechanism, the legality of the last record of that key in the log records, select the last legal write operation, take the address block of that last write operation as the unique legal address block, and establish the association between the key-value pair and the address block;
the key index directory is generated after all the master sites confirm that each key-value pair < key-value > corresponds uniquely to an address block; the key index directory is broadcast to the index main chain and verified by all nodes on the chain; after verification is finished, the routing site packages the key index directory and writes it into the last block of the index main chain; the routing site also performs a hash operation on the last block of the index main chain to obtain a fixed-length hash value; the index main chain generates a new block, and the routing site writes the hash value of the previous block into the block header of the new block.
The beneficial effects obtained by the invention are as follows:
1. the control method makes combined use of the respective characteristics of the block chain and of non-relational data; unlike prior databases, which impose extremely high operating performance requirements on the system running the database in order to protect strong consistency of the data, it achieves a balance between cost and performance;
2. the control method uses the performance characteristics of the individual distributed nodes to allocate read/write tasks appropriately, maximizing utilization of the total distributed computing power and ensuring the response speed of the non-relational database under high concurrency;
3. the consortium chain established by the control method isolates redundant nodes in the distributed system, avoids the possibility of illegal operations by unrelated nodes, reduces the changes required to the original distributed system, and accommodates a smooth transition of the deployment form of the distributed system;
4. the control method is applicable to programming systems, languages or algorithms based on various non-relational databases, and therefore has good universality.
Detailed Description
In order to make the technical solution and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the embodiments thereof; it should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. Other systems, methods, and/or features of the present embodiments will become apparent to those skilled in the art upon review of the following detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims. Additional features of the disclosed embodiments are described in, and will be apparent from, the detailed description that follows.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it is to be understood that if there is an orientation or positional relationship indicated by the terms "upper", "lower", "left", "right", etc. based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not intended to indicate or imply that the device or assembly referred to must have a specific orientation.
Example one:
referring to fig. 1, a method for controlling consistency of non-relational data after random sequence reading and writing; the control method comprises setting a routing site; the routing site is served by at least one node on a distributed system; the routing site is used for responding to read/write requests of clients to the database and for collecting statistics on the load capacity and transaction processing progress of each node on the current distributed system; the control method further comprises setting at least n master sites, the master sites are used for responding to write operations, and a monitoring module monitors, while data is being written, whether another write request for the data block currently being written exists on the distributed system; the control method further comprises setting at least m slave sites; the slave sites have read-only permission, and a slave site is allowed to read a data block only after the master site has responded to the read request;
the distributed system internally maintains an index main chain for storing the key index directory of the non-relational database; the index main chain is established as a consortium chain, and any node outside the distributed system is refused participation in any operation of the index main chain; each block of the index main chain comprises the address information of the address blocks in which all key-value pairs < key-value > are written in the non-relational database; when the number of write operation records of the non-relational database reaches a specified threshold, the index main chain requires all nodes on the chain to complete the write-and-package operation of the previous block, and a new block is created within a specified time threshold; the routing site, the master sites and the slave sites verify the position of the address block storing a key in the non-relational database against the index record of the last block of the index main chain, and take the record information of the index main chain as the final basis for determining the legal data block;
the routing site is determined by an election, held by the full chain of the index main chain, among the candidate nodes on the chain; the election of the routing site is carried out periodically at fixed intervals; before the routing site is elected, each candidate node completes at least one response test with multiple concurrent tasks, the candidate nodes are ranked by concurrency performance, and at least one candidate node is selected as the routing site;
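The ranking step of this election can be illustrated with a minimal Python sketch. It assumes each candidate's concurrency test has already produced a single score (here a hypothetical requests-per-second figure); the function name, the score metric and the node names are illustrative assumptions, and the on-chain voting itself is not modeled.

```python
def elect_routing_sites(test_results, count=1):
    """Rank candidates by concurrency score (higher is better) and pick the top `count`.

    `test_results` maps a candidate node name to its measured score.
    Both the mapping and the score metric are assumptions for illustration.
    """
    ranked = sorted(test_results.items(), key=lambda item: item[1], reverse=True)
    return [node for node, _score in ranked[:count]]


if __name__ == "__main__":
    results = {"node-a": 1200, "node-b": 3400, "node-c": 900}
    print(elect_routing_sites(results, count=2))
```

The same pattern would apply to the master site election if the score were replaced by a combined write-test result.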
the routing site responds, through a program interface, to external read/write requests for the non-relational database and distinguishes read requests from write requests; read requests are dispatched to the slave sites for processing; for write requests, write records are generated and recorded in an operation log, and each write request is given a request timestamp;
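The dispatch behavior described here can be sketched as follows. This is only an illustration under stated assumptions: the class name `RoutingSite`, the request shape and the in-memory list used as the operation log are hypothetical, and a nanosecond wall-clock reading plus a sequence counter stands in for the request timestamp.

```python
import itertools
import time
from dataclasses import dataclass, field


@dataclass
class RoutingSite:
    """Hypothetical routing site: classifies requests and logs every write."""
    operation_log: list = field(default_factory=list)
    _seq: itertools.count = field(default_factory=itertools.count)

    def dispatch(self, op, key, value=None):
        if op == "read":
            # Read requests are handed to a slave site.
            return {"target": "slave", "key": key}
        if op == "write":
            # Every write request gets a timestamp and a log record
            # before it is handed to a master site.
            record = {
                "seq": next(self._seq),
                "timestamp": time.time_ns(),
                "key": key,
                "value": value,
            }
            self.operation_log.append(record)
            return {"target": "master", **record}
        raise ValueError(f"unknown operation: {op!r}")
```

The sequence counter is an added assumption that gives a total order even when two requests receive the same clock reading.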
the master sites are determined by verification, performed by the full chain of the index main chain, of the candidate nodes on the chain; the election of the master sites is carried out periodically; before the master sites are elected, each candidate node completes a write test; the write test examines at least the sequential write performance, random write performance and write latency of the candidate node; the index main chain compares the write performance of the candidate nodes and, according to the write requirements of the non-relational database, selects at least n candidate nodes as master sites;
the master site writes the key-value pair < key-value > of the non-relational data by a multi-level cache writing method and establishes the data block address corresponding to the written position; the master site marks the write state of the data block address, the write state being at least one of a waiting-for-write state, a writing state and a write-refused state;
the slave site responds to the read actions dispatched by the routing site; before reading a data block address, the slave site reads the write state of that address, and reads only key-value pairs < key-value > in data block addresses whose write state is the waiting-for-write state or the write-refused state;
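A minimal sketch of the three write states and the read gate at the slave site is given below. The set of readable states follows the wording of the paragraph above as literally as possible; the enum, the block dictionary layout and the function name are assumptions made for illustration.

```python
from enum import Enum


class WriteState(Enum):
    WAITING = "waiting-for-write"
    WRITING = "writing"
    REFUSED = "write-refused"


# States in which a slave site may read, taken literally from the text above.
READABLE_STATES = {WriteState.WAITING, WriteState.REFUSED}


def slave_read(block, key):
    """Return the value for `key`, or None if the block may not be read.

    `block` is a hypothetical dict: {"state": WriteState, "data": {key: value}}.
    """
    if block["state"] not in READABLE_STATES:
        return None  # a write is in progress, so the read is not served
    return block["data"].get(key)
```

A deployment that also blocks reads on abnormal blocks would only need to change the `READABLE_STATES` set.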
the plurality of master sites monitor the number of completed write operations; when the number of write operations exceeds a threshold, the master sites traverse the log records and perform a duplicate check on the keys of the key-value pairs < key-value > recorded in all data block addresses that have undergone write operations; for different address blocks holding completely identical key-value pairs < key-value >, the master sites perform consensus verification to confirm one address block as the legal address block and establish the association between the key-value pair and the address block; for two or more key-value pairs < key-value > having the same key but different values, the master sites verify, through the consensus mechanism, the legality of the last record of that key in the log records, select the last legal write operation, take the address block of that last write operation as the unique legal address block, and establish the association between the key-value pair and the address block;
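The duplicate check and conflict resolution can be sketched as a last-write-wins pass over the log. The sketch below makes several assumptions that the text does not fix: each log record is a dict with `timestamp`, `key`, `value` and `address` fields, the "last legal write" is simply the record with the greatest timestamp, and consensus is modeled as every master site arriving at the same mapping independently.

```python
def resolve_key_addresses(log):
    """Map each key to the address block of its last write in `log`."""
    latest = {}
    for record in log:
        current = latest.get(record["key"])
        if current is None or record["timestamp"] > current["timestamp"]:
            latest[record["key"]] = record
    return {key: rec["address"] for key, rec in latest.items()}


def consensus(master_logs):
    """Return the agreed key-to-address mapping, or None if the masters disagree.

    Unanimous agreement is an assumed stand-in for the consensus mechanism.
    """
    results = [resolve_key_addresses(log) for log in master_logs]
    if results and all(result == results[0] for result in results[1:]):
        return results[0]
    return None
```

With this rule, identical key-value pairs stored in two blocks and conflicting values for one key are handled the same way: exactly one address survives per key.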
the key index directory is generated after all the master sites confirm that each key-value pair < key-value > corresponds uniquely to an address block; the key index directory is broadcast to the index main chain and verified by all nodes on the chain; after verification is finished, the routing site packages the key index directory and writes it into the last block of the index main chain; the routing site also performs a hash operation on the last block of the index main chain to obtain a fixed-length hash value; the index main chain generates a new block, and the routing site writes the hash value of the previous block into the block header of the new block;
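The packaging and hash-linking step can be illustrated with a short sketch. SHA-256 over a canonical JSON serialization is an assumed choice, since the text only requires a fixed-length hash; the block layout and function names are likewise hypothetical, and full-chain verification of the directory is taken as already done.

```python
import hashlib
import json


def block_hash(block):
    """Fixed-length hash of a block (SHA-256 over sorted-key JSON, an assumed scheme)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def seal_and_extend(chain, key_index_directory):
    """Write the directory into the last block, then append a new block
    whose header carries the hash of the block just sealed."""
    last = chain[-1]
    last["key_index"] = dict(key_index_directory)
    new_block = {"header": {"prev_hash": block_hash(last)}, "key_index": {}}
    chain.append(new_block)
    return new_block


def verify_chain(chain):
    """Check that every block header matches the hash of its predecessor."""
    return all(
        chain[i]["header"]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

Because only keys and block addresses are stored, each block stays small, and any later change to a sealed directory makes `verify_chain` fail.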
in contrast to relational databases with strict table structures, a non-relational database stores data as a large number of key-value pairs < key-value >; the key-value pairs impose no strict requirement on storage location and are not coupled to one another, and when non-relational data is indexed, the value corresponding to a key is determined directly by looking up the key before further operations are performed; these characteristics are advantageous for massive concurrent data requests; a relational database must repeatedly resolve row and column positions when looking up data, a step that is omitted in a non-relational database, which improves the operating speed of the database;
furthermore, regarding read/write operations, the read/write speed of a mechanical hard disk is limited by its physical mechanical structure and manufacturing process and is difficult to improve substantially, so it becomes a serious bottleneck for database reads; with the rapid development of semiconductor storage, solid state disks with high read/write speed and low read/write latency are gradually replacing mechanical hard disks as an important component of database storage; however, because mechanical hard disks retain data when powered off and have a low cost per unit of storage, they still hold an important position in databases and are at present still heavily relied upon as the underlying storage;
further, in a distributed system the distributed nodes run on different computer systems; some nodes operate a large number of mechanical hard disks and can serve as the main storage servers in the operation and storage of the distributed system, for deploying the database; some nodes are equipped with only a few mechanical hard disks, or none at all, but have solid state disks of a certain capacity, which have outstanding read capability and strong data caching capability; some nodes are equipped with large-capacity random access memory (RAM) and multi-core central processing units (CPUs), which is a great advantage for massive concurrent random read/write operations;
in this embodiment, the overall architecture of a non-relational database under a distributed system is established in the form shown in fig. 2; in the service application layer, external application programs act as clients, interacting with users and obtaining their requirements; these applications include browser-based Web applications, desktop applications, mobile device applications and the like; the applications generate a large number of data read/write requirements and must issue data requests to the back-end database; these read/write requirements are funneled through the application program interface to the agent layer of the distributed system;
in the agent layer of the distributed system, the routing site must process a large number of concurrent read/write responses; the number of such read/write requests reaches thousands per second for a basic application layer, and even thousands per second in large-scale applications; the routing site is therefore preferably configured with a multi-core processor and high-speed random access memory (RAM); further, in a distributed system each node may run multiple services simultaneously, so its load may fluctuate over a period of time; the control method therefore periodically tests the nodes serving as routing sites, or periodically monitors and tests their load, and selects at least one node suitable to serve as the routing site;
the routing site needs to implement at least the following functions: 1. responding to requests issued by the service application layer; 2. classifying each request, for example as involving read/write or as read-only; 3. monitoring the current load of each participating node in the distributed system and balancing the workload across the master sites and slave sites, so that the distributed system absorbs as much of the application layer's service demand as possible;
further, a node server configured with large-capacity random access memory (RAM) and having a disk array, or even a solid state disk array, as a second-level or higher cache can be expected to have good write capability and is preferably chosen to deploy a multi-level cache write mechanism, for example one organized as an Nginx local cache, a distributed cache and a Tomcat heap cache; a write request for a key-value pair < key-value > is first cached in a storage element with high-speed write capability, and the address of the current address block is determined; when the subsequent write queue shrinks, the data is written to the storage server;
the structure of a data block is shown in fig. 3; data is stored in a mechanical hard disk, a solid state disk or random access memory (RAM), whose storage space is divided into a plurality of data blocks; each data block serves as a carrier unit for recording data and has a unique address, which is recorded in the block header of the data block; the remainder of the data block is used for storing data and comprises a zero area in which no data has been written and an area in which data has been written; each data block may have a capacity of 4 KB to 32 KB or more and may internally store more than one key-value pair < key-value >;
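The data block layout described above can be modeled with a small sketch. It is illustrative only: the class and field names are assumptions, sizes are approximated as key bytes plus value bytes with no header overhead, and the unwritten "zero area" is represented simply as remaining free capacity.

```python
from dataclasses import dataclass, field


@dataclass
class DataBlock:
    """Hypothetical data block: a unique address (header) plus a bounded data area."""
    address: str
    capacity: int = 4 * 1024  # bytes; the text allows 4 KB to 32 KB or more
    pairs: dict = field(default_factory=dict)

    @staticmethod
    def _size(key, value):
        return len(key.encode("utf-8")) + len(value)

    def used(self):
        """Bytes occupied by written key-value pairs."""
        return sum(self._size(k, v) for k, v in self.pairs.items())

    def free(self):
        """Bytes left in the unwritten (zero) area."""
        return self.capacity - self.used()

    def put(self, key, value: bytes):
        """Store a pair if it fits; overwriting a key reuses its old space."""
        old = self._size(key, self.pairs[key]) if key in self.pairs else 0
        if self.used() - old + self._size(key, value) > self.capacity:
            return False
        self.pairs[key] = value
        return True
```

A single block can therefore hold several key-value pairs, which is why the index only needs to map a key to a block address.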
further, the storage server deploys the database mainly on mechanical hard disks; it focuses on mass storage and serves as the storage node at which the non-relational database is converted into a relational database, so that the database has a long-term storage or cold backup function; the storage server synchronizes with the master sites periodically and asynchronously, thereby achieving eventual consistency of the data in the database;
the above cache design for written data depends mainly on the server cost each node is willing to bear, and therefore varies considerably from node to node; preferably, the routing site effectively monitors and surveys the load of each node of the distributed system;
for concurrent data writes in random sequence, although the routing site gives each write request event a timestamp, occasionally, because concurrent requests are too close in time or too numerous, more than one master site creates cache space at more than one cache location when the same key is written, and different values of that key are written; thus, for the same key, two corresponding values may be stored in two data blocks, and record indexes for two key-value pairs < key-value > are generated;
further, by verifying the timestamps according to the verification rule set in the distributed system, the address of the data block in which the key-value pair < key-value > is finally stored is determined; the storage address of the data block referred to here may lie in the memory, cache or hard disk of a node in the distributed system; by locating the data blocks of a target key, the data block holding the unique legal key-value pair < key-value > of the target key is confirmed, and the key-value pairs < key-value > of the target key in the remaining data blocks are cleared, thereby fixing the final legal data block of the key-value pair < key-value > and ensuring its eventual consistency;
furthermore, operating through the block chain ensures that the final verification result is verified by all qualified on-chain nodes in the distributed system; the tamper resistance and strong consistency of the block chain make it suitable for using the verified data as a trust endorsement;
further, in this embodiment only the key of each key-value pair < key-value > and the address of its address block are bound and packaged; the storage space required to record these two items is much smaller than that required to store the complete key-value pair < key-value >, which reduces the system resources consumed when packaging block information and greatly improves the efficiency of full-chain verification;
further, in most cases read requests make up the majority of the read/write operations on the database; the routing site preferably adjusts the proportion of nodes selected as master sites according to the actual demand on the non-relational database, so as to balance the load of read and write operations;
furthermore, the master site marks each data block with one of the three write states; when the key-value pair in a data block is suspected to be abnormal or is being cleared, write or read operations on that data block are refused, which contains the abnormal condition to a certain degree; the read/write function of the abnormal data block is released again after consistency verification has been completed again in the distributed system, ensuring that the blocking mechanism of the non-relational database remains effective for the next random-sequence read/write operations.
Example two:
this embodiment should be understood to include at least all of the features of any of the foregoing embodiments and further modifications thereon;
because a non-relational database does not organize data strictly according to the relational model, a certain class of key-value pairs < key-value > may be read/written by a large number of requests; such pairs are technically called high-heat key-value pairs < key-value >; the control method can therefore pay greater attention to high-heat key-value pairs < key-value >; key-value pairs < key-value > that are read/written at low frequency can be kept consistent through low-heat processing, for which the control method likewise adjusts its level of attention;
the distributed system keeps statistical records of all keys on which read/write operations are executed within a fixed statistical period, for example one day or 12 hours, and obtains a ranking of the keys by execution count for that period; the key-value pairs < key-value > whose keys rank in the top 20% by execution count are classed as high-heat key-value pairs < key-value >; preferably, the storage location of a high-heat key is kept relatively fixed in one or a few data blocks, and the data blocks storing high-heat key-value pairs < key-value > are periodically established in a cache, including high-speed random access memory (RAM) or a high-speed disk array, so that read/write operations on high-heat key-value pairs < key-value > can be carried out in the cache;
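The heat ranking can be sketched in a few lines, assuming the 20% figure refers to the top-ranked fifth of distinct keys in the statistical period. The access log as a flat list of keys, the rounding-up rule and the function name are assumptions for illustration.

```python
import math
from collections import Counter


def hot_keys(access_log, fraction=0.2):
    """Return the keys in the top `fraction` of distinct keys by access count.

    `access_log` is a hypothetical list with one entry per executed read/write.
    The number of hot keys is rounded up, which is an assumed convention.
    """
    counts = Counter(access_log)
    n_hot = math.ceil(len(counts) * fraction)
    return [key for key, _count in counts.most_common(n_hot)]


if __name__ == "__main__":
    log = ["a"] * 10 + ["b"] * 3 + ["c", "d", "e"]
    print(hot_keys(log))
```

Running the same function on the next period's log and comparing the results would identify keys whose high-heat attribute has lapsed.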
further, in addition to giving a timestamp to each write record of a high-heat key-value pair < key-value >, a version number is established for the pair at each update, the timestamps corresponding one-to-one with the version numbers; when the consistency of a high-heat key-value pair is examined, the final legal value can be determined from the last write record of the pair, and, in addition, the pair can be traced back through its versions using the relation between timestamps and version numbers: the value as of a version number that has passed consistency verification is taken as the starting point, the legal write operations on the pair are reapplied from that point, and consistency verification is performed again, so that all write operations on the pair since the last consistency verification are reconstructed to the greatest extent possible;
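Version backtracking can be illustrated with the sketch below. It assumes version numbers are consecutive integers assigned in write order, and it leaves the legality check as a caller-supplied predicate because the text does not define it; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedKey:
    """Hypothetical per-key history of (version, timestamp, value) entries."""
    history: list = field(default_factory=list)
    verified_version: int = 0  # last version that passed consistency verification

    def write(self, timestamp, value):
        version = len(self.history) + 1  # one version number per timestamped write
        self.history.append((version, timestamp, value))
        return version

    def mark_verified(self, version):
        self.verified_version = version

    def rebuild(self, is_legal):
        """Roll back to the last verified version, then reapply later writes
        that `is_legal` accepts. Returns the resulting current value."""
        base = [h for h in self.history if h[0] <= self.verified_version]
        replay = [h for h in self.history
                  if h[0] > self.verified_version and is_legal(h)]
        self.history = base
        for _version, timestamp, value in replay:
            self.write(timestamp, value)
        return self.history[-1][2] if self.history else None
```

Writes rejected during the rebuild are dropped, so the surviving history is again a gap-free sequence of versions starting from the verified one.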
further, the version number information is also written into the block of the index main chain, together with the key index directory, after full-chain verification;
further, after one or more monitoring periods, a key-value pair < key-value > that no longer appears in the high-heat ranking is classed as an expired high-heat key-value pair < key-value >; its high-heat attribute is cleared and it reverts to an ordinary key-value pair < key-value >.
Example three:
this embodiment should be understood to include at least all of the features of any of the embodiments described above and further refinements thereto:
after a high-heat key-value pair < key-value > has been placed in a cache location with higher read/write performance, its read/write speed is adequate for tens of thousands of read/write operations per day; however, if the node storing the high-heat key-value pair < key-value >, or a storage component in that node such as memory or a hard disk, suffers an abnormal fault or goes down, or the database crashes, data congestion or a data disaster can result within a short time; this embodiment therefore optimizes the storage of high-heat key-value pairs < key-value > in the cache;
for a data block storing high-heat key-value pairs < key-value >, the master site arranges a cache of the next-lower read/write performance level to perform mirrored writing; the data block that receives the mirrored writes is the mirror data block; the mirror data block must be stored on a physical node different from that of the original data block, its validity is confirmed by the distributed system, and the mapping between the mirror data block and the original data block is recorded by the routing site;
according to actual performance, the mirror data block is preferably written synchronously with the original data block under strong consistency, that is, at every moment the mirror data block and the original data block store identical key-value pairs < key-value >; alternatively, the mirror data block and the original data block are written asynchronously, and after a certain interval, for example 10 or 20 minutes, all data of the original data block is mirrored into the mirror data block;
further, when the original data block becomes abnormal and responds abnormally to read/write operations, the master site sets the write state of the original data block address to the write-refused state;
further, the master site queries the routing site as to whether the original data block has a mirror data block; if a mirror data block exists, the master site feeds this back to the routing site, all read/write operations addressed to the original data block are redirected to the mirror data block, and the key-value pairs < key-value > in the mirror data block undergo next-level backup processing, for example being transferred into a relational database for backup that survives power-off;
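The redirection to the mirror data block can be sketched as a lookup against the mapping held by the routing site. The state strings, the two dictionaries and the choice to raise an error when no mirror exists are assumptions; the text does not say what happens in that case.

```python
def resolve_block(address, states, mirror_map):
    """Return the block address that should actually serve a read/write.

    `states` maps block address -> write state string (hypothetical layout);
    `mirror_map` maps original block address -> mirror block address,
    as recorded by the routing site.
    """
    if states.get(address) != "write-refused":
        return address  # the original block is healthy, so use it directly
    mirror = mirror_map.get(address)
    if mirror is None:
        # Behavior without a mirror is not specified in the text; raising is an assumption.
        raise LookupError(f"block {address!r} is refused and has no mirror")
    return mirror
```

Because the mirror is required to live on a different physical node, this single lookup is enough to route around a failed node for the affected block.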
by this technical scheme, the consistency of random-sequence read/write operations on the non-relational database is verified periodically, a large number of read/write nodes participate in the verification, and they endorse the verified consistency.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Although the invention has been described above with reference to various embodiments, it should be understood that many changes and modifications may be made without departing from the scope of the invention. That is, the methods, systems, and devices discussed above are examples. Various configurations may omit, substitute, or add various procedures or components as appropriate. For example, in alternative configurations, the methods may be performed in an order different than that described, and/or various components may be added, omitted, and/or combined. Moreover, features described with respect to certain configurations may be combined in various other configurations, as different aspects and elements of the configurations may be combined in a similar manner. Further, elements therein may be updated as technology evolves, i.e., many elements are examples and do not limit the scope of the disclosure or claims.
Specific details are given in the description to provide a thorough understanding of the exemplary configurations including implementations. However, configurations may be practiced without these specific details, for example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the configurations. This description provides example configurations only, and does not limit the scope, applicability, or configuration of the claims. Rather, the foregoing description of the configurations will provide those skilled in the art with an enabling description for implementing the described techniques. Various changes may be made in the function and arrangement of elements without departing from the spirit or scope of the disclosure.
In conclusion, it is intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that these examples are illustrative only and are not intended to limit the scope of the invention. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.