CN115168366B - Data processing method, data processing device, electronic equipment and storage medium - Google Patents
- Publication number
- CN115168366B (application CN202211086860.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- storage device
- target
- metadata table
- storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The disclosure relates to a data processing method, a data processing device, an electronic device and a storage medium in the technical field of data processing, and at least solves the problem of low synchronous storage efficiency in the related art. The method comprises the following steps: acquiring change data of at least one first metadata table in adjacent periods to generate at least one second metadata table; extracting data corresponding to at least one preset data access requirement from the at least one second metadata table, and storing the extracted data in at least one third metadata table; and storing the data in the at least one third metadata table to a plurality of first storage devices. Because the change data are distributed across the plurality of first storage devices, the throughput of storing the change data can be scaled horizontally, which increases the speed and efficiency of synchronous storage, reduces its time consumption, and ultimately improves the real-time performance and efficiency of processing such as accessing the target data.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
Generally, during data processing, scene data stored in a metadata table, such as a partitioned Hive table, is synchronized into a data storage device so that detail information of the scene data can be queried through the data storage device with low (millisecond-level) latency.
In the related art, scene data stored in Hive tables are combined by table joins to generate a wide table, and the scene data in the wide table is then synchronously stored in a data storage device.
Because the throughput of the data storage device is limited, the amount of data that can be stored per unit time is limited. Synchronizing the scene data to the data storage device is therefore inefficient and may take tens of minutes or even several hours, which results in low real-time performance and low efficiency for processing such as accessing the scene data.
Disclosure of Invention
The present disclosure provides a data processing method, an apparatus, an electronic device, and a storage medium, so as to at least solve the problems of low data synchronization storage efficiency and long time consumption in the related art. The technical scheme of the disclosure is as follows:
According to a first aspect of the embodiments of the present disclosure, there is provided a data processing method, including: acquiring change data of at least one first metadata table in adjacent periods, and generating at least one second metadata table, wherein the at least one first metadata table is in one-to-one correspondence with the at least one second metadata table; extracting data corresponding to at least one preset data access requirement from the at least one second metadata table, and storing the extracted data into at least one third metadata table, wherein the correspondence between the number of the at least one second metadata table and the number of the at least one third metadata table is any one of the following: a one-to-many relationship, a many-to-one relationship, or a many-to-many relationship, and the at least one third metadata table corresponds one-to-one to the at least one preset data access requirement; and storing the data in the at least one third metadata table to a plurality of first storage devices, wherein each first storage device is used for storing the data in the at least one third metadata table.
Optionally, the data processing method further includes: after the data in the at least one third metadata table is stored in the plurality of first storage devices, a target first storage device is determined in response to a data access instruction sent by the client, wherein the data access instruction is used for requesting to access the target data, and the target first storage device is a first storage device in the plurality of first storage devices, and the target first storage device stores the target data; and acquiring target data from the target first storage equipment, and sending the target data to the client.
Optionally, "extracting data corresponding to at least one preset data access requirement from at least one second metadata table, and storing the extracted data in at least one third metadata table", includes: acquiring each preset data access requirement in at least one preset data access requirement; sequentially extracting column data corresponding to each preset data access requirement from a second metadata table, wherein one column data corresponds to one data identifier; and reconnecting the extracted column data according to the data identification, and sequentially storing the data corresponding to the connection result to each third metadata table in at least one third metadata table.
Optionally, the data processing method further includes: before the data in the at least one third metadata table is stored in the plurality of first storage devices, in a case where a preset data access requirement includes a preset time identifier, adding timestamp information to the data in the third metadata table corresponding to that preset data access requirement. The data access instruction further includes a data update time corresponding to the target data. "Acquiring the target data from the target first storage device and sending the target data to the client" includes: in a case where first abnormal data is stored in the target first storage device, if the data update time is before the timestamp information corresponding to the first abnormal data, acquiring the target data from the target first storage device and sending the target data to the client; and in a case where the first abnormal data is stored in the target first storage device, if the data update time is the same as or after the timestamp information corresponding to the first abnormal data, acquiring abnormal data identification information corresponding to the first abnormal data and sending the abnormal data identification information to the client.
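The timestamp comparison described above can be illustrated with a minimal Python sketch. The names `AbnormalRecord` and `answer_request` are hypothetical and are not part of the disclosure, and the branch taken when no abnormal data is stored is an assumption, since the text only describes the case where first abnormal data is present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AbnormalRecord:
    """Hypothetical record describing first abnormal data on a device."""
    identifier: str   # abnormal data identification information
    timestamp: int    # timestamp information added before storage


def answer_request(stored_data: dict, update_time: int,
                   abnormal: Optional[AbnormalRecord]) -> dict:
    """Return the target data or the abnormal data identification."""
    # Assumption: with no abnormal data on the device, the data is returned.
    if abnormal is None or update_time < abnormal.timestamp:
        # Update time is before the abnormal timestamp: data is usable.
        return {"status": "ok", "data": stored_data}
    # Update time is the same as or after the abnormal timestamp.
    return {"status": "abnormal", "abnormal_id": abnormal.identifier}
```

In this sketch the client always receives an answer: either the requested data or an identifier it can use to report or skip the abnormal data.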
Optionally, the data processing method further includes: backing up and storing the data in the at least one second metadata table into a second storage device; under the condition that the target first storage equipment fails, acquiring target data from the second storage equipment, and sending the target data to the client; and under the condition that the fault of the target first storage equipment is relieved, acquiring target data from the target first storage equipment and sending the target data to the client.
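The failover behaviour described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the `Device` class, its `healthy` flag, and the use of `ConnectionError` to signal a failed device are hypothetical and are not taken from the disclosure.

```python
class Device:
    """Hypothetical in-memory stand-in for a storage device."""

    def __init__(self, data: dict, healthy: bool = True):
        self.data = data
        self.healthy = healthy

    def get(self, key):
        # Assumption: a failed device signals its state by raising.
        if not self.healthy:
            raise ConnectionError("device unavailable")
        return self.data[key]


def read_with_failover(key, target: Device, backup: Device):
    """Read from the target first storage device, else from the backup."""
    try:
        return target.get(key), "target"
    except ConnectionError:
        # Target has failed: serve the request from the second storage device.
        return backup.get(key), "backup"
```

Because the target device is tried first on every call, reads return to it automatically once the fault is cleared.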
Optionally, the data processing method further includes: under the condition that second abnormal data in the first storage device meet preset conditions, searching abnormal recovery data corresponding to the second abnormal data in the second storage device according to the storage time and the current time of the second abnormal data; storing the exception recovery data as data in a third metadata table to a plurality of first storage devices; the preset condition is that the number of data columns corresponding to the second abnormal data is larger than the preset number.
Optionally, the data processing method further includes: in a case where the load state of the target first storage device is greater than a preset threshold, determining a cooperative first storage device among the plurality of first storage devices in response to a data access instruction sent by the client, wherein the cooperative first storage device stores all or part of the target data; and acquiring the target data from at least one dynamic first storage device according to a data inclusion relationship, and sending the target data to the client; wherein the data inclusion relationship is the inclusion relationship between the data stored in the cooperative first storage device and the target data, and the at least one dynamic first storage device comprises: the cooperative first storage device, or the cooperative first storage device and the target first storage device.
Optionally, "acquiring the target data from at least one dynamic first storage device according to a data inclusion relationship" includes: in a case where the cooperative first storage device stores all of the target data, acquiring the target data from the cooperative first storage device; in a case where the cooperative first storage device stores all of the target data, acquiring part of the target data from each of the target first storage device and the cooperative first storage device according to the load state of the target first storage device; and in a case where the cooperative first storage device stores part of the target data, acquiring part of the target data from each of the cooperative first storage device and the target first storage device.
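The selection between the target and the cooperative first storage device can be sketched in Python as below. The function name `plan_reads`, the representation of data as sets of keys, and the numeric load value are assumptions made for illustration. The sketch covers the case where the cooperative device holds all of the target data and the case where it holds only part of it; the load-proportional split between the two devices is omitted.

```python
def plan_reads(target_keys, cooperative_keys, target_load, threshold):
    """Decide which keys to read from which first storage device.

    target_keys: keys of the requested target data.
    cooperative_keys: keys stored on the cooperative first storage device.
    """
    wanted = set(target_keys)
    if target_load <= threshold:
        # Target device is not overloaded: read everything from it.
        return {"target": wanted, "cooperative": set()}
    # Data inclusion relationship: which wanted keys the cooperative device holds.
    covered = wanted & set(cooperative_keys)
    if covered == wanted:
        # Cooperative device stores all of the target data.
        return {"target": set(), "cooperative": wanted}
    # Cooperative device stores part of the target data: split the read.
    return {"target": wanted - covered, "cooperative": covered}
```

The returned plan names the dynamic first storage devices involved: only the cooperative device, or the cooperative device together with the target device.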
According to a second aspect of the embodiments of the present disclosure, there is provided a data processing apparatus including: the device comprises an acquisition unit, a processing unit and a storage unit; the acquisition unit is used for acquiring the change data of at least one first metadata table in adjacent periods and generating at least one second metadata table, wherein the at least one first metadata table and the at least one second metadata table are in one-to-one correspondence; the processing unit is used for extracting data corresponding to at least one preset data access requirement from at least one second metadata table and storing the extracted data into at least one third metadata table, and the corresponding relation between the number of the at least one second metadata table and the number of the at least one third metadata table comprises any one of the following items: one-to-many relationship, many-to-one relationship, and many-to-many relationship, at least one third metadata table corresponding to at least one preset data access requirement one to one; and the storage unit is used for storing the data in the at least one third metadata table to a plurality of first storage devices, and each first storage device is used for storing the data in the at least one third metadata table.
Optionally, the data processing apparatus further includes: a determination unit and a feedback unit; a determining unit, configured to determine, after the data in the at least one third metadata table is stored in the plurality of first storage devices, a target first storage device in response to a data access instruction sent by the client, where the data access instruction is used to request access to target data, and the target first storage device is a first storage device in the plurality of first storage devices, where the target data is stored; and the feedback unit is used for acquiring the target data from the target first storage equipment and sending the target data to the client.
Optionally, the processing unit is specifically configured to: acquiring each preset data access requirement in at least one preset data access requirement; sequentially extracting column data corresponding to each preset data access requirement from a second metadata table, wherein one column data corresponds to one data identifier; and reconnecting the extracted column data according to the data identification, and sequentially storing the data corresponding to the connection result to each third metadata table in at least one third metadata table.
Optionally, the storage unit is further configured to, before the data in the at least one third metadata table is stored in the plurality of first storage devices, add timestamp information to the data in the third metadata table corresponding to a preset data access requirement in a case where that preset data access requirement includes a preset time identifier. The data access instruction further includes a data update time corresponding to the target data. The feedback unit is specifically configured to: in a case where first abnormal data is stored in the target first storage device, if the data update time is before the timestamp information corresponding to the first abnormal data, acquire the target data from the target first storage device and send the target data to the client; and in a case where the first abnormal data is stored in the target first storage device, if the data update time is the same as or after the timestamp information corresponding to the first abnormal data, acquire abnormal data identification information corresponding to the first abnormal data and send the abnormal data identification information to the client.
Optionally, the data processing apparatus further includes: the storage unit is further used for backing up and storing the data in the at least one second metadata table into a second storage device; the feedback unit is also used for acquiring target data from the second storage equipment under the condition that the target first storage equipment fails and sending the target data to the client; and the feedback unit is also used for acquiring the target data from the target first storage equipment and sending the target data to the client under the condition that the fault of the target first storage equipment is relieved.
Optionally, the data processing apparatus further includes: the storage unit is further used for searching the abnormal recovery data corresponding to the second abnormal data in the second storage device according to the storage time of the second abnormal data and the current time under the condition that the second abnormal data in the first storage device meets the preset condition; storing the exception recovery data as data in a third metadata table to a plurality of first storage devices; and the preset condition is that the number of data columns corresponding to the second abnormal data is larger than the preset number.
Optionally, in the data processing apparatus, the determining unit is further configured to determine, in a case where the load state of the target first storage device is greater than a preset threshold, a cooperative first storage device among the plurality of first storage devices in response to a data access instruction sent by the client, wherein the cooperative first storage device stores all or part of the target data; and the feedback unit is further configured to acquire the target data from at least one dynamic first storage device according to a data inclusion relationship and send the target data to the client; wherein the data inclusion relationship is the inclusion relationship between the data stored in the cooperative first storage device and the target data, and the at least one dynamic first storage device comprises: the cooperative first storage device, or the cooperative first storage device and the target first storage device.
Optionally, the feedback unit is specifically configured to: in a case where the cooperative first storage device stores all of the target data, acquire the target data from the cooperative first storage device; in a case where the cooperative first storage device stores all of the target data, acquire part of the target data from each of the target first storage device and the cooperative first storage device according to the load state of the target first storage device; and in a case where the cooperative first storage device stores part of the target data, acquire part of the target data from each of the cooperative first storage device and the target first storage device.
According to a third aspect of an embodiment of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to execute the above instructions to implement the data processing method as provided by the first aspect and any one of its possible design forms.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions of the computer-readable storage medium, when executed by a processor, implement the data processing method as provided in the first aspect and any one of the possible design manners thereof.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the data processing method as provided by the first aspect and any one of its possible designs.
The technical scheme provided by the disclosure brings at least the following beneficial effects: at least one second metadata table is generated by acquiring change data of at least one first metadata table in adjacent periods; data corresponding to at least one preset data access requirement is extracted from the at least one second metadata table and stored into at least one third metadata table; and the data in the at least one third metadata table is stored to a plurality of first storage devices. Because only the change data are stored during synchronous storage, the amount of data synchronized per unit time is reduced. The second metadata table is split and recombined according to the preset data access requirements to generate the third metadata table, so the third metadata table can satisfy the data access requirements of the client. At the same time, the change data are distributed across the plurality of first storage devices, so the throughput of storing the change data scales horizontally with the number of storage devices; this raises the amount of data stored per unit time, improves the speed and efficiency of synchronous storage, and reduces its time consumption. Moreover, all data required by the client can be obtained by accessing only the one first storage device corresponding to the third metadata table, so the client accesses as few first storage devices as possible to access or obtain the target data, which improves the efficiency of accessing or obtaining data.
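The distribution of third metadata tables across a plurality of first storage devices can be illustrated with the following Python sketch. It is not the disclosed implementation: the `InMemoryDevice` class, the round-robin placement of tables onto devices, and the use of a thread pool to write concurrently are all assumptions chosen to show how several devices can absorb writes in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryDevice:
    """Hypothetical stand-in for a first storage device."""

    def __init__(self, name: str):
        self.name = name
        self.tables = {}

    def write(self, table_name: str, rows: list) -> int:
        self.tables[table_name] = list(rows)
        return len(rows)


def store_in_parallel(third_tables: dict, devices: list) -> dict:
    """Write each third metadata table to one device; return the placement."""
    names = sorted(third_tables)
    # Assumption: tables are assigned to devices round-robin by sorted name.
    placement = {n: devices[i % len(devices)] for i, n in enumerate(names)}
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        futures = [pool.submit(placement[n].write, n, third_tables[n])
                   for n in names]
        for future in futures:
            future.result()  # re-raise any write error
    return {n: d.name for n, d in placement.items()}
```

Since each table lands on exactly one device, a later read for one access requirement only needs to contact the device recorded in the returned placement.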
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a block diagram illustrating a data processing system in accordance with an exemplary embodiment;
FIG. 2 is a first schematic flowchart of a data processing method according to an exemplary embodiment;
FIG. 3 is a second schematic flowchart of a data processing method according to an exemplary embodiment;
FIG. 4 is a schematic flowchart of data storage according to an exemplary embodiment;
FIG. 5 is a third schematic flowchart of a data processing method according to an exemplary embodiment;
FIG. 6 is a fourth schematic flowchart of a data processing method according to an exemplary embodiment;
FIG. 7 is a fifth schematic flowchart of a data processing method according to an exemplary embodiment;
FIG. 8 is a sixth schematic flowchart of a data processing method according to an exemplary embodiment;
FIG. 9 is a data flow diagram of data storage according to an exemplary embodiment;
FIG. 10 is a block diagram illustrating a data processing apparatus in accordance with an exemplary embodiment;
FIG. 11 is a schematic structural diagram of an electronic device according to an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in other sequences than those illustrated or described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as detailed in the appended claims.
In addition, in the description of the embodiments of the present disclosure, "/" indicates an OR meaning, for example, A/B may indicate A or B, unless otherwise specified. "and/or" herein is merely an association describing an associated object, and means that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, in the description of the embodiments of the present disclosure, "a plurality" means two or more than two.
The data to be processed (for example, video related data) may include a user identification (UID) of the publisher, a video duration, a video playing amount, a video praise amount, a video forwarding amount, a video type, and the like. The video related data can be counted periodically and stored in one or more Hive tables within each period. The video related data is then reprocessed, that is, analyzed to generate a video recommendation scheme, so that videos recommended to the user according to the scheme are more likely to meet the user's needs. Specifically, in the related art, the video related data stored in the Hive tables are combined by table joins to generate a wide table, that is, the video related data of one period are stored in the wide table; the video related data in the wide table then replace the video related data of the previous period by full overwrite, so that the video related data of the current period are synchronously stored in the data storage device. With this synchronous storage method, because the throughput of a single data storage device is limited, the amount of data that can be stored per unit time is limited, so the synchronous storage process takes a long time, which hinders processing of the scene data.
Therefore, the embodiment of the present disclosure provides a data processing method, which adopts incremental synchronous storage instead of full synchronous storage and sets a plurality of data storage devices to solve the problem that throughput and the amount of data that can be synchronously stored in a unit time are limited, thereby improving the real-time performance and the processing efficiency of processing such as accessing data.
The data processing method provided by the embodiment of the disclosure can be applied to a data processing system. Fig. 1 shows a schematic structural diagram of the data processing system. As shown in fig. 1, the data processing system includes: the system comprises a business application layer, an online service layer, an online storage layer and an original data layer.
The business application layer queries the target data in order to further analyze and process it.
The online service layer implements horizontal scaling of the storage cluster and timely stop-loss through dynamic routing and degradation operations. Dynamic routing refers to the access path of the data storage device that stores the target data, and the degradation operation searches for the cooperative first storage device corresponding to the data storage device that stores the target data. The dynamic routing is configured in the storage layer and isolated from service queries, which can improve the stability of the system.
The online storage layer synchronously stores the original data so that the business application layer can query the target data with millisecond-level latency.
The original data layer provides the original data and is the original source of the data that the business application layer can query.
It should be noted that the data processing system may be applied to an electronic device. The electronic device may be a personal intelligent device such as a mobile phone or a tablet computer, an electronic device such as a notebook computer, a handheld computer, a desktop computer, an ultra-mobile personal computer (UMPC) or a server, or another electronic device that can store and process big data; the device form of the electronic device is not limited herein.
In practical applications, the data processing method provided by the embodiment of the present disclosure is applied to the online storage layer of the data processing system. The method is described below with reference to the accompanying drawings, taking a data processing apparatus as the execution subject by way of example. The data processing apparatus may be embodied as any of the forms of electronic device described above.
Fig. 2 is a schematic flow chart of a data processing method according to an embodiment of the present disclosure. As shown in fig. 2, a data processing method provided in an embodiment of the present disclosure includes the following steps 201 to 203.
In the embodiment of the disclosure, at least one first metadata table corresponds to at least one second metadata table one to one. That is, a second metadata table is correspondingly generated according to the change data of any one of the at least one first metadata table in the adjacent period.
In the disclosed embodiment, the first metadata table is used to store raw data. The original data can be video related data such as video duration, video playing amount, video praise amount, video forwarding amount, video type and the like, or advertisement putting related data such as advertisement type, advertisement product, advertisement putting proportion and the like, or sales related data such as selling time, selling channel, selling commodity amount, commodity price and the like.
It should be noted that, the data in the first metadata table is updated according to a preset period. The first metadata table may be a Hive table. Because the data volume of the change data of the first metadata table in the adjacent period (namely the current period and the previous period) is often smaller than that of the first metadata table in the current period, and the data of the current period can be obtained according to the change data and the original data of the previous period, the data can be synchronously stored by adopting a mode of storing the change data. It is understood that the preset period may be 10 minutes, 1 hour, 24 hours, one week or one month, and the period time length of the preset period is not limited in the embodiment of the present disclosure.
For example, assume that in the previous period the first metadata table recorded 10 users and their corresponding user information, and that purchase behavior data existed for all 10 users. After the user information and the purchase behavior data were synchronously stored, 3 of the users generated secondary purchase behavior data. Since the user information of these 3 users and the purchase behavior data of all 10 users were already recorded and stored in the previous period, only the secondary purchase behavior data needs to be synchronously stored.
In the embodiment of the present disclosure, in order to facilitate management of change data, change data of the first metadata table in adjacent cycles is integrated, and a second metadata table is generated, and the second metadata table is used for storing the change data. The amount of data in the second metadata table is normally much smaller than the amount of data in the first metadata table.
It will be appreciated that the change data may be an increment or a decrement relative to the previous cycle of data of the first metadata table.
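As a minimal sketch of how change data between adjacent periods might be computed, the following Python code compares two snapshots of a first metadata table. Representing a table as a dictionary keyed by row identifier, and marking removed rows with `None`, are assumptions made for illustration; the function names `change_data` and `apply_change` are hypothetical.

```python
from typing import Dict, Hashable, Optional

Row = Dict[str, object]
Snapshot = Dict[Hashable, Row]


def change_data(previous: Snapshot, current: Snapshot) -> Dict[Hashable, Optional[Row]]:
    """Return only the rows that differ between two adjacent periods.

    A value of None marks a row that was removed (a decrement);
    any other value is a new or updated row (an increment).
    """
    delta: Dict[Hashable, Optional[Row]] = {}
    for key, row in current.items():
        if previous.get(key) != row:
            delta[key] = row
    for key in previous:
        if key not in current:
            delta[key] = None
    return delta


def apply_change(previous: Snapshot, delta: Dict[Hashable, Optional[Row]]) -> Snapshot:
    """Rebuild the current period from the previous period plus the change data."""
    rebuilt = dict(previous)
    for key, row in delta.items():
        if row is None:
            rebuilt.pop(key, None)
        else:
            rebuilt[key] = row
    return rebuilt
```

The second function shows why storing only the change data is sufficient: the data of the current period can be recovered from the previous period and the delta.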
In an embodiment of the present disclosure, a correspondence relationship between the number of the at least one second metadata table and the number of the at least one third metadata table includes any one of: one-to-many relationship, many-to-one relationship, and many-to-many relationship, the at least one third metadata table corresponding one-to-one to the at least one preset data access requirement.
In this embodiment of the present disclosure, the second metadata table may be a Hive table having a plurality of columns of data. The second metadata table is split and recombined column-wise according to the preset data access requirements, so as to generate the at least one third metadata table.
Optionally, in this embodiment of the present disclosure, as shown in fig. 3, step 202 may also be implemented by following steps 301 to 303.
Step 303, the data processing device reconnects the extracted column data according to the data identifier, and sequentially stores the data corresponding to the connection result to each third metadata table in the at least one third metadata table.
In the embodiment of the present disclosure, one piece of column data corresponds to one data identifier. The data identifier is used to identify the column data and may be a random number, a user name, or an encrypted sequence generated from user information. It can be understood that column data in different columns of the same data source corresponds to the same data identifier, whereas column data in the same column of different data sources corresponds to different data identifiers.
In the embodiment of the present disclosure, each preset data access requirement includes a set of some of the column data in the second metadata table that the client needs to query. For example, if the second metadata table includes a video duration, a video playing amount, a video praise amount, a video forwarding amount and a video type, then one preset data access requirement may include the video duration and the video playing amount, another may include the video praise amount, the video forwarding amount and the video type, and a third may include the video duration, the video playing amount and the video praise amount. There may be multiple preset data access requirements, and different preset data access requirements may correspond to sets of column data that are not identical.
It should be noted that the data corresponding to the preset data access requirement may be column data in the second metadata table obtained according to data processing experience, may also be column data obtained by counting user input received by the client, and may also be column data randomly selected in the second metadata table, where an obtaining manner of the data corresponding to the preset data access requirement is not limited.
In the embodiment of the present disclosure, the data identifier is used as a dimension field, the column data in the second metadata table corresponding to a preset data access requirement is connected according to the data identifier, and the data corresponding to the connection result is stored in the third metadata table. The connection may be performed as a join operation.
Illustratively, as shown in FIG. 4, assume that the first metadata table includes Table A and Table B, where Table A includes column A1 and column A2 and Table B includes column B1 and column B2. Change data of the first metadata tables in adjacent periods is acquired, and a second metadata table A' and a second metadata table B' are generated, where Table A' includes column A1' and column A2', and Table B' includes column B1' and column B2'. The column data corresponding to the first preset data access requirement comprises column A2' and column B1', and the column data corresponding to the second preset data access requirement comprises column A2' and column B2'. Accordingly, the third metadata tables comprise Table 1 and Table 2: the column data of Table A' and of Table B' corresponding to the first preset data access requirement is fused to generate Table 1, and the column data of Table A' and of Table B' corresponding to the second preset data access requirement is fused to generate Table 2. Finally, the data in Table 1 is stored in the first storage device 1, and the data in Table 2 is stored in the first storage device 2.
The technical scheme provided by the disclosure at least brings the following beneficial effects: the column data in the second metadata table corresponding to the preset data access requirement is reconnected through the data identifier, so that the data in the third metadata table has an association relationship. Correspondingly, the third metadata table can be obtained faster, and the first metadata table can be synchronously stored faster.
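The FIG. 4 example can be sketched in Python as follows. This is only an illustration under stated assumptions: each row is assumed to carry its data identifier in a field named `id`, the join is assumed to be an inner join, the prime marks on the column names are dropped, and the helper names `extract_column` and `build_third_table` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]

# Second metadata tables A' and B' (change data); "id" is the data identifier.
table_a: List[Row] = [
    {"id": "k1", "A1": 10, "A2": 11},
    {"id": "k2", "A1": 20, "A2": 21},
]
table_b: List[Row] = [
    {"id": "k1", "B1": 100, "B2": 101},
    {"id": "k2", "B1": 200, "B2": 201},
]


def extract_column(table: List[Row], column: str) -> Dict[str, Any]:
    """Extract one column, keyed by the data identifier."""
    return {row["id"]: row[column] for row in table}


def build_third_table(requirement: List[Tuple[List[Row], str]]) -> List[Row]:
    """Join the requested columns on the data identifier (inner join)."""
    columns = {name: extract_column(table, name) for table, name in requirement}
    shared_ids = set.intersection(*(set(col) for col in columns.values()))
    return [
        {"id": i, **{name: col[i] for name, col in columns.items()}}
        for i in sorted(shared_ids)
    ]


# First requirement: columns A2' and B1'. Second requirement: A2' and B2'.
table_1 = build_third_table([(table_a, "A2"), (table_b, "B1")])
table_2 = build_third_table([(table_a, "A2"), (table_b, "B2")])
```

Each preset data access requirement yields exactly one third metadata table, which matches the one-to-one correspondence described above.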
In an embodiment of the disclosure, each first storage device is configured to store data in at least one third metadata table.
Optionally, in the data processing method, the first storage device is a key-value (KV) storage device. KV storage is a NoSQL (non-relational database) storage mode in which data is organized, indexed and stored as key-value pairs. KV storage is suited to business data that does not involve many data relationships or business relationships; it can effectively reduce the number of disk reads and writes, offers better read-write performance, and has lower data query latency. Therefore, when the first storage device is a KV storage device, it can have high read-write performance, so that the synchronously stored data can be accessed with low latency, high speed and high efficiency in response to client requests.
In an embodiment of the disclosure, each first storage device is configured to store data in at least one third metadata table. The correspondence between the third metadata table and the first storage device may be one-to-one or many-to-one. That is, data in one third metadata table is stored to any one of the first storage devices, or data in a plurality of third metadata tables is stored to any one of the first storage devices. The multiple first storage devices are adopted for synchronous storage, so that the problem of throughput bottleneck (limited throughput) generated in the synchronous storage process of a single data storage device or a single data storage cluster can be solved, and the availability of the data synchronously stored in the first storage devices can be improved.
In one example, based on the example shown in fig. 4, the data in table 1 is stored in the first storage device 1, and the data in table 2 is stored in the first storage device 2.
In another example, based on the example shown in fig. 4, the data in table 1 is stored in the first storage device 1, and the data in table 2 is also stored in the first storage device 1.
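The two placements above can be illustrated with a small in-memory stand-in for a KV device. The class `KVDevice`, the function `store_third_tables` and the key layout `<table>:<data identifier>` are assumptions made for this sketch; the disclosure does not specify a key format or a storage API.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class KVDevice:
    """In-memory stand-in for a key-value first storage device (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, Row] = {}

    def put(self, key: str, value: Row) -> None:
        self.store[key] = value

    def get(self, key: str) -> Row:
        return self.store[key]


def store_third_tables(
    tables: Dict[str, List[Row]], placement: Dict[str, KVDevice]
) -> None:
    """Write every row of each third metadata table to its assigned device."""
    for table_name, rows in tables.items():
        device = placement[table_name]
        for row in rows:
            # Key layout "<table>:<data identifier>" is an assumption.
            device.put(f"{table_name}:{row['id']}", row)


third_tables = {
    "table_1": [{"id": "k1", "A2": 11, "B1": 100}],
    "table_2": [{"id": "k1", "A2": 11, "B2": 101}],
}

# One-to-one: Table 1 on device 1, Table 2 on device 2.
device_1, device_2 = KVDevice("first_storage_1"), KVDevice("first_storage_2")
store_third_tables(third_tables, {"table_1": device_1, "table_2": device_2})

# Many-to-one: both tables on a single device.
shared_device = KVDevice("first_storage_1")
store_third_tables(
    third_tables, {"table_1": shared_device, "table_2": shared_device}
)
```

Because a whole third metadata table lands on one device, a client whose query matches that table's access requirement only has to contact that device.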
The technical scheme provided by the disclosure at least brings the following beneficial effects: at least one second metadata table is generated by acquiring change data of at least one first metadata table in adjacent periods; data corresponding to at least one preset data access requirement is extracted from the at least one second metadata table and stored into at least one third metadata table; and the data in the at least one third metadata table is stored to the plurality of first storage devices. Because only the change data is stored during synchronous storage, the amount of data synchronously stored per unit time is reduced. The second metadata table is split and recombined according to the preset data access requirements to generate the third metadata table, so that the third metadata table can meet the data access requirements of the client. Meanwhile, the change data is stored in a distributed manner across the plurality of first storage devices, so that throughput can be scaled horizontally, which increases the data storage speed per unit time, improves the speed and efficiency of synchronous storage, and reduces its time consumption. Moreover, all data required by the client can be obtained by accessing only the one first storage device corresponding to the third metadata table, so the client can access or obtain the target data while accessing as few first storage devices as possible, which improves the efficiency of accessing or obtaining the data.
Optionally, in this embodiment of the present disclosure, after the step 203, as shown in fig. 5, the data processing method further includes the following steps 501 and 502.
In an embodiment of the present disclosure, the data access instruction is used to request access to the target data, and the target first storage device is a first storage device storing the target data in the plurality of first storage devices. It should be noted that, the user sends a data access instruction through the client, and the data access instruction may be input by the user at the client, may be received by the client, and may also be generated according to a preset trigger condition.
It will be appreciated that the target data may be stored in any one of the first storage devices, or in several of them, in which case there may be a plurality of target first storage devices. The target first storage device may be determined by dynamic routing.
In the embodiment of the present disclosure, different first storage devices may be recorded according to the configuration file. The configuration file may be generated according to the data in the third metadata table and the corresponding first storage device in the process of performing step 203.
In the embodiment of the disclosure, according to the dynamic routing path, a connection is established with the target first storage device, so as to obtain the target data from the target first storage device.
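A minimal sketch of such a configuration-driven lookup is shown below. The configuration layout, the device names and the function `resolve_targets` are hypothetical; the disclosure only states that a configuration file records which first storage device holds which data.

```python
from typing import Dict, List

# Hypothetical routing configuration, written while the third metadata tables
# are stored: each table name maps to the devices that hold its data.
ROUTING_CONFIG: Dict[str, List[str]] = {
    "table_1": ["first_storage_1"],
    "table_2": ["first_storage_2", "first_storage_3"],
}


def resolve_targets(table_name: str, config: Dict[str, List[str]]) -> List[str]:
    """Return the target first storage devices for the requested table."""
    try:
        return list(config[table_name])
    except KeyError:
        raise LookupError(
            f"no first storage device recorded for {table_name!r}"
        ) from None


targets_for_table_1 = resolve_targets("table_1", ROUTING_CONFIG)
targets_for_table_2 = resolve_targets("table_2", ROUTING_CONFIG)
```

A request for `table_2` resolves to two candidate devices, which corresponds to the case where the target data is held by more than one first storage device.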
Further optionally, the first storage device for obtaining the target data may be configured according to a routing policy in combination with a load of the storage device. In the embodiment of the present disclosure, as shown in fig. 6, the step 502 further includes the following steps 601 and 602.
In an embodiment of the present disclosure, the cooperative first storage device is used to store all or part of the data in the target data. The data inclusion relationship is the inclusion relationship between the data stored in the cooperative first storage device and the target data. The at least one dynamic first storage device comprises: the cooperative first storage device, or the cooperative first storage device and the target first storage device.
It is understood that the data inclusion relationship specifically includes: the data stored in the cooperative first storage device contains all of the data in the target data; or the data stored in the cooperative first storage device contains part of the data in the target data.
It can also be understood that, in the case that the load state of the target first storage device is greater than the preset threshold, the target data cannot be obtained in time through the target first storage device alone, and the target data is therefore obtained through the at least one dynamic first storage device. The at least one dynamic first storage device comprises: the cooperative first storage device, or the cooperative first storage device and the target first storage device.
For example, if the data access instructions sent by the clients in region C are all used to access data in the synchronously stored data that meets the client access requirement, and the target first storage device is determined to be the first storage device D, then the first storage device D needs to respond to all the data access instructions sent by the clients in region C. This may cause its load state to exceed the preset threshold, that is, the first storage device D cannot respond to the data access instructions in time. Therefore, a cooperative first storage device may be determined, and the target data may be obtained from the target first storage device and the cooperative first storage device. Of course, there may be more than one cooperative first storage device.
In an embodiment of the present disclosure, the step 602 specifically includes: in the case where the cooperative first storage device stores all of the data in the target data, the data processing apparatus acquires the target data from the cooperative first storage device; or, in the case where the cooperative first storage device stores all of the data in the target data, the data processing apparatus acquires parts of the target data from the target first storage device and the cooperative first storage device respectively, according to the load state of the target first storage device; or, in the case where the cooperative first storage device stores part of the data in the target data, the data processing apparatus acquires parts of the target data from the cooperative first storage device and the target first storage device respectively.
In an example, when the cooperative first storage device stores all the data in the target data, since the target first storage device and the cooperative first storage device both store all the data in the target data, the data processing apparatus obtains a response speed of the target first storage device to the data access instruction according to a change of a load state of the target first storage device, obtains a response speed of the cooperative first storage device to the data access instruction, selects the first storage device with a fast response speed to the data access instruction, and obtains the target data.
In another example, in the case where all the data in the target data is stored in the cooperative first storage device, since the target first storage device and the cooperative first storage device both store all the data in the target data, the data processing apparatus determines an amount of data that can be processed by the target first storage device according to a change in a load state of the target first storage device, determines a part of the data in the target data that is acquired by the target first storage device according to the amount of data, and then acquires the remaining part of the data in the target data by the cooperative first storage device.
It should be noted that, in the case where the cooperative first storage device stores part of the data in the target data, the data processing apparatus acquires parts of the target data from the cooperative first storage device and the target first storage device respectively. In doing so, it may rely mainly on the target first storage device, acquiring as much of the target data as possible from it, and acquire from the cooperative first storage device only the part that cannot be acquired from the target first storage device.
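The split between the target and cooperative first storage devices might be planned as in the sketch below. It assumes that the load state of the target device has already been translated into a capacity expressed as a number of keys it can still serve; that translation, the function name `plan_reads` and the key names are all assumptions, since the disclosure does not say how the processable data amount is computed.

```python
from typing import List, Set, Tuple


def plan_reads(
    target_keys: List[str],
    target_capacity: int,
    cooperative_keys: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split the requested keys between the target and cooperative devices.

    target_capacity is a hypothetical number of keys the overloaded target
    device can still serve, derived from its load state.
    """
    only_target = sorted(k for k in target_keys if k not in cooperative_keys)
    shared = sorted(k for k in target_keys if k in cooperative_keys)
    # Keys held only by the target device must be read there regardless of load.
    spare = max(target_capacity - len(only_target), 0)
    from_target = only_target + shared[:spare]
    from_cooperative = shared[spare:]
    return from_target, from_cooperative


requested = ["k1", "k2", "k3", "k4"]

# Cooperative device holds all of the target data; the target can serve one key.
full_plan = plan_reads(requested, 1, {"k1", "k2", "k3", "k4"})

# Cooperative device holds only part of the target data (k3 and k4).
partial_plan = plan_reads(requested, 1, {"k3", "k4"})
```

In the first call the target device serves as much as its capacity allows and the cooperative device serves the remainder. In the second call the keys that exist only on the target device are always read from it.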
The technical scheme provided by the disclosure at least brings the following beneficial effects: the data source of the target data is determined according to the load state and the data inclusion relationship, which avoids the problem that the data access instruction cannot be responded to in time when the load state of the target first storage device is greater than the preset threshold, and improves the efficiency of data processing.
The technical scheme provided by the disclosure at least brings the following beneficial effects: in the process in which a first storage device responds to a data access instruction of the client, the cooperative first storage device is determined, and the target data is obtained from the cooperative first storage device, or from the cooperative first storage device and the target first storage device, according to the data inclusion relationship among the target first storage device, the cooperative first storage device and the target data. Therefore, the problem that the data access instruction cannot be responded to in time when the load state of the target first storage device is greater than the preset threshold can be solved, and the data processing efficiency can be improved.
The technical scheme provided by the disclosure can at least bring the following beneficial effects: in response to a data access instruction sent by a client, a target first storage device is determined, the target data is obtained from the target first storage device, and the target data is sent to the client. The client can thus obtain the target data in a timely and accurate manner, which improves the real-time performance and processing efficiency of operations such as accessing the target data and facilitates re-analysis or other processing of the target data. In addition, all data required by the client can be obtained by accessing only the one first storage device corresponding to the third metadata table, so the client can access or obtain the target data while accessing as few first storage devices as possible, which further improves the efficiency of accessing or obtaining the data.
Optionally, in this embodiment of the present disclosure, before the step 203, as shown in fig. 7, the data processing method further includes the following step 701.
Accordingly, in the embodiments of the present disclosure, the first storage device also includes timestamp information therein. The time stamp information is used to identify the generation time of the data in the third metadata table.
Further optionally, in this embodiment of the present disclosure, the data access instruction further includes a data update time corresponding to the target data. As shown in fig. 7, the step 502 can also be implemented by the following step 702 or step 703.
In the embodiment of the present disclosure, the generation time of the data in the third metadata table, that is, the generation time of the data stored in the first storage device is recorded by the time stamp information. In a case where the first abnormal data is stored in the target first storage device, in order to avoid that the acquired target data is the first abnormal data, the data update time corresponding to the target data may be compared with the time stamp information, and if the data update time is before the time stamp information corresponding to the first abnormal data, the data processing apparatus acquires the target data. And if the data updating time is the same as the timestamp information corresponding to the first abnormal data or the data updating time is behind the timestamp information corresponding to the first abnormal data, the data processing device acquires the abnormal data identification information corresponding to the first abnormal data and sends the abnormal data identification information to the client.
It will be appreciated that the first abnormal data is in the same column as the target data.
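The comparison between the data update time and the timestamp information can be sketched as follows. The function `respond`, the response dictionary and the identifier `anomaly-001` are hypothetical; only the before / same-or-after rule comes from the description above.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def respond(
    data_update_time: datetime,
    stored_value: Any,
    anomaly_timestamp: Optional[datetime],
    anomaly_id: Optional[str],
) -> Dict[str, Any]:
    """Return the target data, or the abnormal data identification information."""
    # No first abnormal data recorded, or the requested data predates it.
    if anomaly_timestamp is None or data_update_time < anomaly_timestamp:
        return {"status": "ok", "data": stored_value}
    # Update time equals or follows the timestamp of the first abnormal data.
    return {"status": "abnormal", "abnormal_data_id": anomaly_id}


anomaly_at = datetime(2022, 9, 1, 12, 0)

before = respond(datetime(2022, 9, 1, 11, 0), {"views": 7}, anomaly_at, "anomaly-001")
same = respond(datetime(2022, 9, 1, 12, 0), {"views": 7}, anomaly_at, "anomaly-001")
after = respond(datetime(2022, 9, 1, 13, 0), {"views": 7}, anomaly_at, "anomaly-001")
```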
The technical scheme provided by the disclosure at least brings the following beneficial effects: in the case that the preset data access requirement comprises a preset time identifier, timestamp information is added to the data in the third metadata table, and whether the data recorded in the target first storage device is first abnormal data is then judged according to the timestamp information. If the data update time is before the timestamp information corresponding to the first abnormal data, the target data corresponding to the data access instruction is normal data and can be acquired. If the data update time is the same as, or later than, the timestamp information corresponding to the first abnormal data, the target data corresponding to the data access instruction is determined to be the first abnormal data, and the abnormal data identification information corresponding to the first abnormal data is acquired and sent to the client. In this way, the accuracy of the target data is ensured, which in turn improves the accuracy of statistical conclusions drawn from the target data.
Optionally, in this embodiment of the present disclosure, as shown in fig. 8, the data processing method may further include the following steps 801 to 803.
In step 803, when the failure of the target first storage device is resolved, the data processing apparatus acquires the target data from the target first storage device and transmits the target data to the client.
In the embodiment of the present disclosure, in order to guard against exceptions arising during data production and synchronous storage, and against a lengthy data recovery process, as shown in fig. 9, corresponding to fig. 1 and 4, the data processing apparatus may back up the data in the at least one second metadata table to the second storage device as a full backup, so as to shorten the data synchronization period, limit the impact caused by abnormal data, and improve the robustness of the synchronous storage system.
It can be understood that the backup stored in the second storage device is not queried under normal conditions, so all of its resources can be used for synchronizing data, which greatly increases the data synchronization speed. If the data being synchronized is data used to recover from abnormal data, the data recovery speed is likewise greatly increased, and the impact of the abnormal data is contained quickly.
It should be noted that, the data in the at least one second metadata table is stored in the plurality of first storage devices, and at the same time, is backed up and stored in the second storage device. The second storage device is hot standby. That is, the plurality of first storage devices and the plurality of second storage devices operate together, and in the case where the first storage device fails, the hot standby (second storage device) immediately assumes the role of the failed device (target first storage device).
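The hot-standby behavior can be illustrated with the sketch below. The classes `Store` and `StorageUnavailable`, the function `fetch`, and the assumption that the standby can be queried with the same keys as the target device are all hypothetical simplifications; the disclosure does not describe the failure detection mechanism or the standby's data layout.

```python
from typing import Any, Dict, Tuple


class StorageUnavailable(Exception):
    """Raised by the stand-in store when the device has failed."""


class Store:
    """In-memory stand-in for a storage device (hypothetical)."""

    def __init__(self, name: str, data: Dict[str, Any]) -> None:
        self.name = name
        self.data = data
        self.failed = False

    def get(self, key: str) -> Any:
        if self.failed:
            raise StorageUnavailable(self.name)
        return self.data[key]


def fetch(key: str, target: Store, standby: Store) -> Tuple[str, Any]:
    """Read from the target device; fall back to the hot standby on failure."""
    try:
        return target.name, target.get(key)
    except StorageUnavailable:
        return standby.name, standby.get(key)


records = {"table_1:k1": {"A2": 11, "B1": 100}}
target_device = Store("first_storage_1", dict(records))
standby_device = Store("second_storage", dict(records))

target_device.failed = True
during_failure = fetch("table_1:k1", target_device, standby_device)

target_device.failed = False  # failure resolved, queries switch back
after_recovery = fetch("table_1:k1", target_device, standby_device)
```

Since every call tries the target device first, queries return to it as soon as the failure is resolved, without a separate switch-back step.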
The technical scheme provided by the disclosure at least brings the following beneficial effects: for backup storage to the second storage device, data query is not performed under normal conditions, if the target first storage device which provides query normally fails, the query can be dynamically routed to the second storage device, and if the failure of the target first storage device is relieved, the query is switched back to the target first storage device. Therefore, the stability of the synchronously stored data can be ensured, and the robustness of the method for synchronously storing the data is improved. In addition, in the data backup process, the data synchronous storage time is short, the occupied resources are few, and the storage performance is not influenced even if a large amount of data is stored.
Optionally, in the data processing method, when the second abnormal data in the first storage device meets the preset condition, the abnormal recovery data corresponding to the second abnormal data is searched in the second storage device according to the storage time of the second abnormal data and the current time; storing the exception recovery data as data in a third metadata table to a plurality of first storage devices; the preset condition is that the number of data columns corresponding to the second abnormal data is larger than the preset number.
In the embodiment of the present disclosure, when the number of data columns corresponding to the second abnormal data is greater than the preset number, which data in the second storage device needs to be synchronized to the first storage device is determined according to timeliness of the abnormal data and timeliness of the system for discovering the abnormality.
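One possible reading of this lookup is sketched below. It assumes that each backup row carries a `stored_at` time and that the recovery data is every backup row stored between the storage time of the second abnormal data and the current time; this window interpretation, the field name and the function `select_recovery_data` are assumptions, not details given in the disclosure.

```python
from datetime import datetime
from typing import Any, Dict, List

Row = Dict[str, Any]


def select_recovery_data(
    backup_rows: List[Row],
    abnormal_columns: List[str],
    preset_number: int,
    storage_time: datetime,
    current_time: datetime,
) -> List[Row]:
    """Pick backup rows to re-synchronize to the first storage devices."""
    # Preset condition: more abnormal data columns than the preset number.
    if len(abnormal_columns) <= preset_number:
        return []
    # Assumed window: from the abnormal data's storage time up to now.
    return [
        row for row in backup_rows
        if storage_time <= row["stored_at"] <= current_time
    ]


backup = [
    {"id": "k1", "stored_at": datetime(2022, 9, 1, 9, 0)},
    {"id": "k2", "stored_at": datetime(2022, 9, 1, 10, 0)},
    {"id": "k3", "stored_at": datetime(2022, 9, 1, 11, 0)},
]
abnormal_since = datetime(2022, 9, 1, 10, 0)
now = datetime(2022, 9, 1, 12, 0)

recovery = select_recovery_data(backup, ["A2", "B1", "B2"], 2, abnormal_since, now)
skipped = select_recovery_data(backup, ["A2", "B1", "B2"], 3, abnormal_since, now)
```

The selected rows would then be treated as data in a third metadata table and stored to the first storage devices through the same synchronous storage path.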
The technical scheme provided by the disclosure at least brings the following beneficial effects: in the case that the number of data columns corresponding to the second abnormal data is larger than the preset number, the abnormal recovery data corresponding to the second abnormal data is searched for in the second storage device according to the storage time of the second abnormal data and the current time, and the abnormal recovery data is then treated as data in a third metadata table. The data synchronous storage scheme is thereby reused, which can improve the utilization of the hardware resources occupied by the data synchronous storage scheme and improve the efficiency of data processing such as data storage and data sending.
The foregoing describes aspects of embodiments of the present disclosure primarily from a methodological perspective. The method has the advantages of short time consumption for data synchronous storage, low resource occupation and little influence on storage performance. It will be appreciated that the data processing apparatus, in order to carry out the above-described functions, comprises at least one of a hardware structure and a software module corresponding to each function. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The data processing apparatus according to the embodiments of the present disclosure may be divided into functional units according to the above method examples, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that, the division of the units in the embodiment of the present disclosure is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Fig. 10 is a schematic diagram illustrating a structure of a data processing apparatus according to an exemplary embodiment. Referring to fig. 10, a data processing apparatus provided in an embodiment of the present disclosure includes an obtaining unit 1001, a processing unit 1002, and a storage unit 1003.
An obtaining unit 1001, configured to obtain change data of at least one first metadata table in adjacent cycles, and generate at least one second metadata table, where the at least one first metadata table and the at least one second metadata table are in one-to-one correspondence; for example, as shown in fig. 2, the obtaining unit 1001 may be configured to perform step 201.
The processing unit 1002 is configured to extract data corresponding to at least one preset data access requirement from at least one second metadata table, and store the extracted data in at least one third metadata table, where a correspondence relationship between the number of the at least one second metadata table and the number of the at least one third metadata table includes any one of: one-to-many relationship, many-to-one relationship, and many-to-many relationship, at least one third metadata table corresponding to at least one preset data access requirement one to one; for example, as shown in fig. 2, processing unit 1002 may be configured to perform step 202.
A storage unit 1003, configured to store the data in the at least one third metadata table into a plurality of first storage devices, where each first storage device is configured to store the data in the at least one third metadata table. For example, as shown in fig. 2, the storage unit 1003 may be used to perform step 203.
Optionally, as shown in fig. 10, the data processing apparatus further includes: a determination unit 1004 and a feedback unit 1005.
A determining unit 1004, configured to determine, after the data in the at least one third metadata table is stored in the plurality of first storage devices, a target first storage device in response to a data access instruction sent by the client, where the data access instruction is used to request access to target data, and the target first storage device is a first storage device in the plurality of first storage devices, where the target data is stored; for example, as shown in fig. 5, the determining unit 1004 may be configured to perform step 501.
A feedback unit 1005, configured to acquire the target data from the target first storage device and send the target data to the client. For example, as shown in fig. 5, the feedback unit 1005 may be used to perform step 502.
Optionally, as shown in fig. 10, the processing unit 1002 is specifically configured to: acquiring each preset data access requirement in at least one preset data access requirement; sequentially extracting column data corresponding to each preset data access requirement from a second metadata table, wherein one column data corresponds to one data identifier; and reconnecting the extracted column data according to the data identification, and sequentially storing the data corresponding to the connection result into each third metadata table in at least one third metadata table. For example, as shown in fig. 3, processing unit 1002 may be configured to perform steps 301 to 303.
Optionally, as shown in fig. 10, the storage unit 1003 is further configured to add timestamp information to the data in the third metadata table corresponding to the preset data access requirement when the preset data access requirement includes a preset time identifier before the data in the at least one third metadata table is stored in the plurality of first storage devices; in the data processing apparatus, the data access instruction further includes a data update time corresponding to the target data; for example, as shown in fig. 7, the storage unit 1003 may be used to perform step 701.
The feedback unit 1005 is specifically configured to: under the condition that first abnormal data are stored in a target first storage device, if the data updating time is before the timestamp information corresponding to the first abnormal data, acquiring the target data from the target first storage device, and sending the target data to a client; under the condition that the first abnormal data is stored in the target first storage device, if the data updating time is the same as the timestamp information corresponding to the first abnormal data or the data updating time is behind the timestamp information corresponding to the first abnormal data, acquiring the abnormal data identification information corresponding to the first abnormal data, and sending the abnormal data identification information to the client. For example, as shown in fig. 7, a feedback unit 1005 may be used to perform step 702 and step 703.
Optionally, as shown in fig. 10, the data processing apparatus further includes:
the storage unit 1003 is further configured to backup and store data in the at least one second metadata table in a second storage device; for example, as shown in fig. 8, the storage unit 1003 may be used to perform step 801.
The feedback unit 1005 is further configured to, when the target first storage device fails, acquire the target data from the second storage device and send the target data to the client. For example, as shown in fig. 8, the feedback unit 1005 may be used to perform step 802.
The feedback unit 1005 is further configured to, when the failure of the target first storage device has been resolved, acquire the target data from the target first storage device and send the target data to the client. For example, as shown in fig. 8, the feedback unit 1005 may be used to perform step 803.
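As a non-limiting illustration, the switch between the target first storage device and the second storage device may be sketched as follows. The classes `Device` and `StorageUnavailable` and the function `fetch` are hypothetical; the sketch also assumes, for simplicity, that the same key resolves the target data on both devices.

```python
class StorageUnavailable(Exception):
    """Raised by a hypothetical device while it is in a failed state."""


class Device:
    """Minimal in-memory stand-in for a storage device (illustration only)."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.failed = False

    def read(self, key):
        if self.failed:
            raise StorageUnavailable(self.name)
        return self.data[key]


def fetch(key, target_first, second):
    """Read from the target first storage device; fall back to the backup on failure."""
    try:
        return target_first.read(key), target_first.name
    except StorageUnavailable:
        # The target first storage device has failed: serve from the second storage device.
        return second.read(key), second.name
```

Because `fetch` tries the target first storage device on every call, reads return to that device as soon as its failure has been resolved, without any separate switch-back step.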
Optionally, as shown in fig. 10, the storage unit 1003 is further configured to: when second abnormal data in a first storage device meets a preset condition, search the second storage device for abnormal recovery data corresponding to the second abnormal data according to the storage time of the second abnormal data and the current time; and store the abnormal recovery data, as data in a third metadata table, to the plurality of first storage devices. The preset condition is that the number of data columns corresponding to the second abnormal data is greater than a preset number.
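As a non-limiting illustration, the recovery of abnormal data may be sketched as follows. The disclosure does not specify how the storage time and the current time are used in the search; the sketch assumes they bound a time window over the backup rows. All function names, the `time` field, and the use of plain lists as first storage devices are hypothetical.

```python
from datetime import datetime


def needs_recovery(abnormal_columns, preset_number):
    """Preset condition: more abnormal data columns than the preset number."""
    return len(abnormal_columns) > preset_number


def find_recovery_rows(backup_rows, storage_time, current_time):
    """Assumed search: backup rows whose time lies in [storage_time, current_time]."""
    return [row for row in backup_rows if storage_time <= row["time"] <= current_time]


def recover(abnormal_columns, preset_number, backup_rows, storage_time, current_time, first_devices):
    """Copy recovery rows to every first storage device; return the number of rows copied."""
    if not needs_recovery(abnormal_columns, preset_number):
        return 0
    rows = find_recovery_rows(backup_rows, storage_time, current_time)
    for device in first_devices:
        device.extend(rows)  # each device is modelled as a plain list of rows
    return len(rows)
```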
Optionally, as shown in fig. 10, the determining unit 1004 is further configured to, when the load state of the target first storage device is greater than a preset threshold, determine a cooperative first storage device among the plurality of first storage devices in response to the data access instruction sent by the client, where the cooperative first storage device stores all or part of the target data. For example, as shown in fig. 6, the determining unit 1004 may be configured to perform step 601.
The feedback unit 1005 is further configured to acquire the target data from at least one dynamic first storage device according to a data inclusion relationship, and send the target data to the client. The data inclusion relationship is the inclusion relationship between the data stored in the cooperative first storage device and the target data. The at least one dynamic first storage device comprises the cooperative first storage device, or the cooperative first storage device and the target first storage device. For example, as shown in fig. 6, the feedback unit 1005 may be used to perform step 602.
Optionally, as shown in fig. 10, the feedback unit 1005 is specifically configured to: when the cooperative first storage device stores all of the target data, acquire the target data from the cooperative first storage device; alternatively, when the cooperative first storage device stores all of the target data, acquire part of the target data from each of the target first storage device and the cooperative first storage device according to the load state of the target first storage device; and when the cooperative first storage device stores only part of the target data, acquire the respective parts of the target data from the cooperative first storage device and the target first storage device.
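As a non-limiting illustration, the choice of dynamic first storage devices may be sketched as follows. The function `plan_reads`, the `split` flag, the expression of load as an integer percentage, and the rule that the target device keeps a share equal to its remaining headroom are all assumptions made for the example; the disclosure only states that the split depends on the load state.

```python
def plan_reads(target_keys, cooperative_keys, target_load, threshold, split=False):
    """Return which keys to read from the target and the cooperative first storage device.

    target_load and threshold are integer percentages (assumption).
    """
    wanted = list(target_keys)
    if target_load <= threshold:
        # Load is acceptable: no cooperative device is needed.
        return {"target": wanted, "cooperative": []}

    held = [k for k in wanted if k in cooperative_keys]
    if len(held) == len(wanted):
        if not split:
            # The cooperative device holds all target data: read everything from it.
            return {"target": [], "cooperative": wanted}
        # Alternative: share the read; the target keeps a share equal to its headroom.
        share = len(wanted) * max(0, 100 - target_load) // 100
        return {"target": wanted[:share], "cooperative": wanted[share:]}

    # The cooperative device holds only part: read that part from it, the rest from the target.
    missing = [k for k in wanted if k not in cooperative_keys]
    return {"target": missing, "cooperative": held}
```

For example, with four requested keys, a target load of 75 and the `split` flag set, this sketch reads one key from the target first storage device and three from the cooperative first storage device.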
With regard to the apparatus in the above-described embodiment, the specific manner in which each unit performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
Fig. 11 is a schematic structural diagram of an electronic device provided in the present disclosure, where the electronic device may be the data processing apparatus. As shown in fig. 11, the electronic device may include a processor 1101 and a memory 1102 for storing instructions executable by the processor 1101, wherein the processor 1101 is configured to execute the instructions to implement the data processing method in the above embodiment.
Additionally, the electronic device may include a communication bus 1103 and at least one communication interface 1104.
The processor 1101 may be a central processing unit (CPU), a micro-processing unit, an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to control the execution of programs in accordance with the disclosed aspects.
The communication bus 1103 is a signal path for transferring information between the above components.
The memory 1102 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, digital versatile disc, blu-ray disc, etc.), magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 1102 may be separate from the processor 1101 and coupled to it by the communication bus 1103. The memory 1102 may also be integrated with the processor 1101.
The memory 1102 is used for storing instructions for implementing aspects of the present disclosure, and is controlled by the processor 1101 for execution. The processor 1101 is configured to execute programs or instructions stored in the memory 1102 to implement the functions in the disclosed method.
As an example, in connection with fig. 10, the functions implemented by the acquisition unit 1001, the processing unit 1002, and the storage unit 1003 in the data processing apparatus are the same as those of the processor 1101 in fig. 11.
In a specific implementation, as an embodiment, the processor 1101 may include one or more CPUs, such as CPU0 and CPU1 in fig. 11.
In a specific implementation, as an embodiment, the electronic device may include a plurality of processors 1101, and each of the processors 1101 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. Processor 1101 herein may refer to one or more devices, circuits, and/or processing cores that process data, such as computer program instructions.
In a specific implementation, as an embodiment, the electronic device may also include an output device 1105 and an input device 1106. The output device 1105 is in communication with the processor 1101 and may display information in a variety of ways. For example, the output device 1105 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like. The input device 1106 is in communication with the processor 1101 and can accept user input in a variety of ways. For example, the input device 1106 may be a mouse, a keyboard, a touch screen device, a sensing device, or the like.
Those skilled in the art will appreciate that the configuration shown in fig. 11 does not constitute a limitation of the electronic device, and that the electronic device may include more or fewer components than those shown, combine certain components, or employ a different arrangement of components. The electronic device in fig. 11 may be a server, a client, or another device.
In addition, the present disclosure also provides a computer-readable storage medium, on which a program or instructions are stored, which, when executed by a processor, enable an electronic device to perform the data processing method provided in the above embodiment. Optionally, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In addition, the present disclosure also provides a computer program product comprising computer programs/instructions, the computer program product being stored in a non-volatile readable storage medium, the computer program product, when executed by at least one processor, causing an electronic device to perform the data processing method as provided in the above embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
1. A data processing method, comprising:
acquiring change data of at least one first metadata table in adjacent periods, and generating at least one second metadata table, wherein the at least one first metadata table is in one-to-one correspondence with the at least one second metadata table;
acquiring each preset data access requirement in at least one preset data access requirement;
sequentially extracting column data corresponding to each preset data access requirement from the at least one second metadata table, wherein one column data corresponds to one data identifier;
reconnecting the extracted column data according to the data identification, and sequentially storing data corresponding to the connection result to each third metadata table in the at least one third metadata table;
wherein the correspondence between the number of the at least one second metadata table and the number of the at least one third metadata table includes any one of: a one-to-many relationship, a many-to-one relationship, and a many-to-many relationship, the at least one third metadata table corresponding one-to-one to the at least one preset data access requirement;
and storing the data in the at least one third metadata table to a plurality of first storage devices, wherein each first storage device is used for storing the data in the at least one third metadata table.
2. The data processing method according to claim 1, wherein after storing the data in the at least one third metadata table to the plurality of first storage devices, the method further comprises:
responding to a data access instruction sent by a client, and determining a target first storage device, wherein the data access instruction is used for requesting to access target data, and the target first storage device is a first storage device which stores the target data in the plurality of first storage devices;
and acquiring the target data from the target first storage device, and sending the target data to the client.
3. The data processing method according to claim 2, wherein before storing the data in the at least one third metadata table to the plurality of first storage devices, the method further comprises:
under the condition that the preset data access requirement comprises a preset time identifier, adding timestamp information to data in a third metadata table corresponding to the preset data access requirement;
the data access instruction further comprises data updating time corresponding to the target data; the obtaining the target data from the target first storage device and sending the target data to the client includes:
under the condition that first abnormal data are stored in the target first storage device, if the data updating time is before the timestamp information corresponding to the first abnormal data, acquiring the target data from the target first storage device, and sending the target data to the client;
under the condition that first abnormal data are stored in the target first storage device, if the data updating time is the same as the timestamp information corresponding to the first abnormal data or the data updating time is after the timestamp information corresponding to the first abnormal data, acquiring abnormal data identification information corresponding to the first abnormal data, and sending the abnormal data identification information to the client.
4. The data processing method of claim 3, wherein the method further comprises:
storing the data in the at least one second metadata table to a second storage device in a backup mode;
under the condition that the target first storage equipment fails, acquiring the target data from the second storage equipment, and sending the target data to the client;
and under the condition that the fault of the target first storage device has been resolved, acquiring the target data from the target first storage device, and sending the target data to the client.
5. The data processing method of claim 4, wherein the method further comprises:
under the condition that second abnormal data in the first storage device meet preset conditions, searching abnormal recovery data corresponding to the second abnormal data in the second storage device according to the storage time of the second abnormal data and the current time;
storing the abnormal recovery data as data in the third metadata table to the plurality of first storage devices;
and the preset condition is that the number of data columns corresponding to the second abnormal data is greater than a preset number.
6. The data processing method of claim 2, wherein the method further comprises:
under the condition that the load state of the target first storage device is larger than a preset threshold value, determining a cooperative first storage device in the plurality of first storage devices in response to a data access instruction sent by a client, wherein the cooperative first storage device is used for storing all or part of data in the target data;
acquiring the target data from at least one dynamic first storage device according to a data inclusion relationship, and sending the target data to the client;
wherein the data inclusion relationship is the inclusion relationship between the data stored in the cooperative first storage device and the target data;
the at least one dynamic first storage device comprises: the cooperative first storage device, or the cooperative first storage device and the target first storage device.
7. The data processing method according to claim 6, wherein the obtaining the target data from at least one dynamic first storage device according to the data inclusion relationship comprises:
acquiring the target data from the cooperative first storage device under the condition that the cooperative first storage device stores all data in the target data;
under the condition that the cooperative first storage device stores all data in the target data, respectively acquiring partial target data from the target first storage device and the cooperative first storage device according to the load state of the target first storage device;
and when the cooperative first storage device stores part of the target data, respectively acquiring part of the target data from the cooperative first storage device and the target first storage device.
8. A data processing apparatus, comprising: an acquisition unit, a processing unit and a storage unit;
the acquisition unit is used for acquiring change data of at least one first metadata table in adjacent periods and generating at least one second metadata table, wherein the at least one first metadata table and the at least one second metadata table are in one-to-one correspondence;
the processing unit is used for acquiring each preset data access requirement in at least one preset data access requirement; sequentially extracting column data corresponding to each preset data access requirement from the at least one second metadata table, wherein one column data corresponds to one data identifier; reconnecting the extracted column data according to the data identification, and sequentially storing data corresponding to the connection result to each third metadata table in the at least one third metadata table; wherein the correspondence between the number of the at least one second metadata table and the number of the at least one third metadata table includes any one of: a one-to-many relationship, a many-to-one relationship, and a many-to-many relationship, the at least one third metadata table corresponding one-to-one to the at least one preset data access requirement;
the storage unit is configured to store the data in the at least one third metadata table to a plurality of first storage devices, where each first storage device is configured to store the data in the at least one third metadata table.
9. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data processing method of any one of claims 1-7.
10. A computer-readable storage medium storing instructions that, when executed by a processor, are capable of implementing the data processing method as claimed in any one of claims 1-7.
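As a non-limiting illustration that forms no part of the claims, the table-building steps recited in claim 1 may be sketched as follows. The sketch assumes that each metadata table is a mapping from a data identifier to a row, that the data identifier serves as the key on which extracted columns are reconnected, and that a preset data access requirement is a list of (table name, column name) pairs. The function names are hypothetical, and rows deleted between periods are ignored for brevity.

```python
def diff_tables(previous, current):
    """Change data between adjacent periods: rows that are new or modified."""
    return {rid: row for rid, row in current.items() if previous.get(rid) != row}


def extract_column(table, column):
    """Extract one column as a mapping from data identifier to value."""
    return {rid: row[column] for rid, row in table.items() if column in row}


def build_third_table(second_tables, requirement):
    """Reconnect the extracted columns on the data identifier for one access requirement."""
    joined = {}
    for table_name, column in requirement:
        for rid, value in extract_column(second_tables[table_name], column).items():
            joined.setdefault(rid, {})[column] = value
    return joined
```

Under these assumptions, one third metadata table is built per preset data access requirement, and a single requirement may draw columns from several second metadata tables, which corresponds to the many-to-one case.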
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211086860.2A CN115168366B (en) | 2022-09-07 | 2022-09-07 | Data processing method, data processing device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211086860.2A CN115168366B (en) | 2022-09-07 | 2022-09-07 | Data processing method, data processing device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115168366A CN115168366A (en) | 2022-10-11 |
CN115168366B CN115168366B (en) | 2023-01-20 |
Family
ID=83481836
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211086860.2A Active CN115168366B (en) | 2022-09-07 | 2022-09-07 | Data processing method, data processing device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115168366B (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134698A (en) * | 2019-04-15 | 2019-08-16 | 平安普惠企业管理有限公司 | Data managing method and Related product |
CN111966747A (en) * | 2020-07-23 | 2020-11-20 | 深圳市科脉技术股份有限公司 | Data synchronization method, system, terminal device and storage medium |
CN114064668A (en) * | 2020-08-07 | 2022-02-18 | 伊姆西Ip控股有限责任公司 | Method, electronic device and computer program product for storage management |
CN112115152B (en) * | 2020-09-15 | 2024-02-06 | 招商局金融科技有限公司 | Data increment updating and inquiring method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN115168366A (en) | 2022-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109739929B (en) | Data synchronization method, device and system | |
EP3832578A1 (en) | Electronic invoice identifier allocation method, and electronic ticket generating method, device and system | |
US8447757B1 (en) | Latency reduction techniques for partitioned processing | |
JP2019200580A (en) | Decentralized ledger system, decentralized ledger subsystem, and decentralized ledger node | |
US10242388B2 (en) | Systems and methods for efficiently selecting advertisements for scoring | |
CN113515545B (en) | Data query method, device, system, electronic equipment and storage medium | |
CN111552701B (en) | Method for determining data consistency in distributed cluster and distributed data system | |
CN112199427A (en) | Data processing method and system | |
CN106888245A (en) | A kind of data processing method, apparatus and system | |
CN110784498B (en) | Personalized data disaster tolerance method and device | |
CN110740155B (en) | Request processing method and device in distributed system | |
CN111343241B (en) | Graph data updating method, device and system | |
US20190251096A1 (en) | Synchronization of offline instances | |
CN114443908A (en) | Graph database construction method, system, terminal and storage medium | |
CN113468226A (en) | Service processing method, device, electronic equipment and storage medium | |
WO2017157111A1 (en) | Method, device and system for preventing memory data loss | |
US20180121532A1 (en) | Data table partitioning management method and apparatus | |
CN111935320A (en) | Data synchronization method, related device, equipment and storage medium | |
CN115168366B (en) | Data processing method, data processing device, electronic equipment and storage medium | |
CN117155930A (en) | Node determining method, task processing method and related devices of distributed system | |
US7260611B2 (en) | Multi-leader distributed system | |
CN116107801A (en) | Transaction processing method and related product | |
CN115982133A (en) | Data processing method and device | |
CN113872994B (en) | Organization architecture synchronization method, device, computer equipment and storage medium | |
CN113886500A (en) | Data processing method, device, server and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||