CN113434506B - Data management and retrieval method, device, computer equipment and readable storage medium - Google Patents
- Publication number
- CN113434506B (CN202110724252.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- index
- meta
- nodes
- metadata
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2264—Multidimensional index structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of big data, and discloses a data management and retrieval method, a device, computer equipment and a readable storage medium, wherein the method comprises the following steps: acquiring original data from an original database according to the extraction request; storing the original data into a graph database and storing the original data in a data node form, arranging metadata of the original data to obtain a metadata structure, and associating the metadata structure with the data node to obtain a data index; and receiving index keywords sent by the user side, traversing meta-nodes corresponding to the index keywords in the data index, setting the meta-nodes as index nodes, identifying data nodes directly associated and/or indirectly associated with the index nodes, extracting original data in the data nodes, and taking the original data as feedback data. The present invention also relates to blockchain techniques in which information may be stored in blockchain nodes. The invention not only avoids the problem of overlarge calculation power consumption and improves the acquisition efficiency of the feedback data, but also ensures the comprehensiveness of the feedback data so as to meet the retrieval requirement.
Description
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a method and apparatus for managing and retrieving data, a computer device, and a readable storage medium.
Background
Current databases fall into relational and non-relational databases. A non-relational database is a non-relational, distributed data storage system that generally does not guarantee the ACID properties. It stores data as key-value pairs without a fixed structure: each tuple can have different fields, and key-value pairs can be added to a tuple as needed, which can reduce time and space overhead. A relational database is a database that organizes data using the relational model, that is, a two-dimensional table model; a relational database is a data organization consisting of two-dimensional tables and the relations between them.
When a user needs to query specified data in a database and obtain corresponding feedback data, the inventor realized that, for a non-relational database, the search keyword sent by the user side must be matched against every primary key, and the value corresponding to the matched primary key is sent to the user side as feedback data; for a relational database, the metadata in the database must be traversed according to the search keyword sent by the user side to obtain the data corresponding to the search keyword, which is then used as feedback data. With either query mode, the server must consume a large amount of computing power to traverse all the data in the database, so query efficiency is low and the computing power consumption of the server is huge.
Disclosure of Invention
The invention aims to provide a data management and retrieval method, a device, computer equipment and a readable storage medium, to solve the following problems in the prior art: because high-quality search keywords are required, the feedback data obtained by the user side can hardly meet the user's retrieval requirement; and a large amount of computing power is consumed in traversing all data in the database, so that query efficiency is low and the computing power consumption of the server is huge.
In order to achieve the above object, the present invention provides a data management and retrieval method, including:
receiving an extraction request sent by a user terminal, and acquiring original data from an original database according to the extraction request;
storing the original data into a preset graph database in the form of data nodes, arranging the metadata of the original data to obtain a meta structure that reflects the association relations between the metadata in the form of meta nodes, and associating the meta structure with the data nodes to obtain a data index reflecting the association relation between the original data and the metadata; wherein the graph database is a non-relational database that stores relational information between entities by means of graph theory;
receiving index keywords sent by a user terminal, traversing the data index for the meta nodes corresponding to the index keywords, setting those meta nodes as index nodes, identifying the data nodes directly associated and/or indirectly associated with the index nodes, extracting the original data in the data nodes, and sending the original data to the user terminal as feedback data; wherein direct association means that the meta node and the data node are directly associated with each other, and indirect association means that the meta node and the data node are associated through other meta nodes or data nodes.
In the above solution, after the obtaining the original data from the original database according to the extraction request, the method further includes:
cleaning the original data to delete invalid data, missing values, and their metadata from the original data.
In the above solution, the arranging metadata of the original data to obtain a meta structure reflecting association relations between the metadata in a form of meta nodes includes:
establishing a dimension data tree reflecting the logical relation between meta-categories, and constructing a structure table taking the meta-categories as classification items according to the dimension data tree; wherein the meta category is category information for classifying metadata so that the metadata forms a hierarchical relationship;
extracting metadata of the original data in the graph database, and entering the metadata and its corresponding information in the original data into the classification item corresponding to the metadata in the structure table to obtain a dimension table;
and constructing meta nodes representing the meta categories and the meta data thereof according to the dimension table, and constructing association relations among the meta nodes according to the logic relations among the meta categories to obtain a meta structure.
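The three steps above (dimension data tree, structure table, dimension table, then meta nodes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tree contents, the function names, and the representation of a meta node as a plain dictionary are all hypothetical.

```python
from collections import defaultdict

# Hypothetical dimension data tree: child meta category -> parent meta category.
DIMENSION_TREE = {
    "subsystem": None,
    "cluster": "subsystem",
    "instance": "cluster",
}


def build_structure_table(tree):
    """Structure table: one empty classification item per meta category."""
    return {category: [] for category in tree}


def fill_dimension_table(structure_table, records):
    """Enter each (meta category, metadata, value) triple under its category."""
    table = {category: list(rows) for category, rows in structure_table.items()}
    for category, metadata, value in records:
        table[category].append((metadata, value))
    return table


def build_meta_structure(tree, dimension_table):
    """Create one meta node per category and link nodes along the tree."""
    nodes = {c: {"category": c, "metadata": dimension_table[c]} for c in tree}
    edges = defaultdict(set)
    for child, parent in tree.items():
        if parent is not None:
            # Associations are stored in both directions (an assumption).
            edges[parent].add(child)
            edges[child].add(parent)
    return nodes, dict(edges)
```

Here the logical relation between meta categories is assumed to be a simple parent/child hierarchy; a real dimension data tree could carry richer relations.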
In the above solution, the associating the meta structure with the data node to obtain a data index reflecting an association relationship between the original data and the meta data includes:
extracting metadata of the original data in the graph database and the corresponding information of the metadata in the original data, obtaining the meta node corresponding to the metadata from the meta structure, and setting the meta node as a target node;
and arranging the target nodes according to the logical relation in the dimension table to form a node chain, and associating the target nodes positioned at the tail ends of the node chain with the data nodes to obtain the data index.
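One possible reading of the node chain step is sketched below. It assumes the logical relation in the dimension table can be reduced to a root-to-leaf ordering of meta categories; the node identifiers, the `category_order` list, and both function names are hypothetical.

```python
def build_node_chain(category_order, target_nodes):
    """Order target nodes by the root-to-leaf order of their meta categories."""
    rank = {category: i for i, category in enumerate(category_order)}
    return sorted(target_nodes, key=lambda node: rank[node["category"]])


def link_chain_to_data(chain, data_node_id):
    """Return index edges: consecutive chain links plus tail -> data node."""
    if not chain:
        raise ValueError("a node chain needs at least one target node")
    ids = [node["id"] for node in chain]
    edges = list(zip(ids, ids[1:]))       # links between neighbouring target nodes
    edges.append((ids[-1], data_node_id))  # only the tail touches the data node
    return edges
```

Only the target node at the end of the chain is associated with the data node, so every other target node in the chain is indirectly associated with that data node.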
In the above solution, before the receiving the index keyword sent by the user side, the method further includes:
extracting the meta nodes in the data index;
extracting the meta categories in the meta nodes, setting a category input box for each meta category, constructing a category index page containing the meta categories and the category input boxes, and sending the category index page, which is used for inputting the index keywords, to the user side; or
extracting the metadata in the meta nodes, setting a metadata input box for the metadata, constructing a meta index page containing the metadata and the metadata input boxes, and sending the meta index page, which is used for inputting the index keywords, to the user side.
In the above solution, before the receiving the index keyword sent by the user side, the method further includes:
extracting meta nodes in the data index, extracting meta categories and meta data of the meta nodes, setting a selection frame for enabling and disabling the meta data for the meta data, constructing a search input frame associated with the selection frame, creating an optional index page with the meta categories, the meta data and the selection frame thereof and the search input frame associated with the selection frame, and sending the optional index page for inputting the index keywords to a user side.
In the above solution, after the obtaining the data index reflecting the association relationship between the original data and the metadata, the method further includes:
receiving index keywords, index destination words and index destination information sent by a user side, identifying meta nodes in the data index according to the index keywords and setting them as index nodes, identifying meta nodes or data nodes in the data index according to the index destination words and setting them as destination nodes, identifying, according to the index destination information, the number of destination nodes directly associated and/or indirectly associated with the index nodes, and sending the number to the user side;
after the sending the number to the user terminal, the method further includes:
uploading the number of destination nodes to the blockchain.
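The counting step can be illustrated with a breadth-first traversal from the index node. This is a sketch under assumptions: the adjacency dictionary, the `is_destination` predicate, and the `include_indirect` flag (standing in for the index destination information) are hypothetical names, and the blockchain upload is not modelled.

```python
from collections import deque


def count_destination_nodes(adjacency, index_node, is_destination,
                            include_indirect=True):
    """Count destination nodes reachable from index_node.

    Depth 1 corresponds to direct association; depth > 1 to indirect
    association through other meta nodes or data nodes.
    """
    seen = {index_node}
    queue = deque([(index_node, 0)])
    count = 0
    while queue:
        node, depth = queue.popleft()
        if depth > 0 and is_destination(node):
            count += 1
        # Without indirect association, expand only the index node itself.
        if depth == 0 or include_indirect:
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, depth + 1))
    return count
```

For example, with an index node for a cluster and a predicate that selects data nodes, the function would return how many data records hang off that cluster, either directly or through its instances.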
In order to achieve the above object, the present invention further provides a data management and retrieval device, including:
the data input module is used for receiving an extraction request sent by a user terminal and acquiring original data from an original database according to the extraction request;
the index construction module is used for storing the original data into a preset graph database in the form of data nodes, arranging the metadata of the original data to obtain a meta structure that reflects the association relations between the metadata in the form of meta nodes, and associating the meta structure with the data nodes to obtain a data index reflecting the association relation between the original data and the metadata; wherein the graph database is a non-relational database that stores relational information between entities by means of graph theory;
the retrieval feedback module is used for receiving index keywords sent by a user side, traversing the data index for the meta nodes corresponding to the index keywords, setting those meta nodes as index nodes, identifying the data nodes directly associated and/or indirectly associated with the index nodes, extracting the original data in the data nodes, and sending the original data to the user side as feedback data; wherein direct association means that the meta node and the data node are directly associated with each other, and indirect association means that the meta node and the data node are associated through other meta nodes or data nodes.
To achieve the above object, the present invention also provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the steps of the above data management and retrieval method are implemented when the processor of the computer device executes the computer program.
In order to achieve the above object, the present invention further provides a computer readable storage medium having a computer program stored thereon, the computer program stored on the readable storage medium implementing the steps of the above data management and retrieval method when executed by a processor.
According to the data management and retrieval method, the device, the computer equipment and the readable storage medium, the technical effect of pertinently extracting the original data in the original database is realized by acquiring the corresponding original data from the original database according to the extraction request;
the metadata of the original data is arranged to obtain a meta structure, and a data index reflecting the association relation between the original data and the metadata is constructed from the meta structure and the data nodes, so that it is only necessary to obtain the nodes corresponding to the index keywords from the data index and set them as index nodes in order to obtain the data nodes, whose original data is set as feedback information; the association relation between the original data and the metadata is thus quickly identified, achieving the technical effect of retrieving, through metadata, the original data associated with that metadata according to the requirements of the user side;
the metadata is identified based on the index keywords, and the original data corresponding to the metadata is used as feedback data, which avoids the excessive consumption of server computing power caused by traversing all metadata in the database and the corresponding original data, and improves the efficiency of obtaining the feedback data; moreover, because all original data associated with the index keywords is obtained based on those keywords, the feedback data related to the index keywords can be returned comprehensively without controlling the quality of the index keywords, which ensures the comprehensiveness of the feedback data so that the feedback data obtained by the user side can meet the retrieval requirement.
Drawings
FIG. 1 is a flowchart of a data management and retrieval method according to a first embodiment of the present invention;
FIG. 2 is a schematic view of an environment application of a data management and retrieval method according to a second embodiment of the present invention;
FIG. 3 is a flowchart illustrating a data management and retrieval method according to a second embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a program module of a third embodiment of a data management and retrieval device according to the present invention;
FIG. 5 is a schematic diagram of the hardware structure of the computer device in a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The data management and retrieval method, the device, the computer equipment and the readable storage medium are applicable to the technical field of big data processing, and are based on a data input module, an index construction module and a data retrieval module. According to the method, original data is obtained from an original database according to the received extraction request and stored into a preset graph database in the form of data nodes; the metadata of the original data is arranged to obtain a meta structure that reflects the association relations between the metadata in the form of meta nodes; and the meta structure is associated with the data nodes to obtain a data index reflecting the association relation between the original data and the metadata. Index keywords sent by the user terminal are then received, the data index is traversed for the meta nodes corresponding to the index keywords, those meta nodes are set as index nodes, the data nodes directly associated and/or indirectly associated with the index nodes are identified, and the original data in the data nodes is extracted and sent to the user terminal as feedback data.
Embodiment one:
Referring to FIG. 1, the data management and retrieval method of this embodiment includes:
S101: Receiving an extraction request sent by a user terminal, and acquiring original data from an original database according to the extraction request.
S103: Storing the original data into a preset graph database in the form of data nodes, arranging the metadata of the original data to obtain a meta structure that reflects the association relations between the metadata in the form of meta nodes, and associating the meta structure with the data nodes to obtain a data index reflecting the association relation between the original data and the metadata; wherein the graph database is a non-relational database that stores relational information between entities by means of graph theory.
S106: receiving index keywords sent by a user terminal, traversing meta nodes corresponding to the index keywords in the data indexes, setting the meta nodes as index nodes, identifying data nodes directly associated and/or indirectly associated with the index nodes, extracting original data in the data nodes, and sending the original data as feedback data to the user terminal; the direct association refers to a situation that the meta node and the data node directly have an association relationship, and the indirect association refers to a situation that the meta node and the data node have an association relationship formed by other meta nodes or data nodes.
In an exemplary embodiment, corresponding original data is obtained from the original database according to an extraction request sent by a user side, where the extraction request has a data name and a data type and is used for extracting original data corresponding to the data name and the data type in the original database, so that a technical effect of extracting the original data in the original database in a targeted manner is achieved.
The original data is stored in a preset graph database, the metadata of the original data in the graph database is arranged to obtain a meta structure, and a data index reflecting the association relation between the original data and the metadata is constructed from the meta structure and the data nodes. Because the data index reflects this association relation, it is only necessary to obtain the nodes corresponding to the index keywords from the data index and set them as index nodes in order to obtain the data nodes directly associated and/or indirectly associated with the index nodes, whose original data is set as feedback information. The association relation between the original data and the metadata is thus quickly identified, achieving the technical effect of retrieving, through metadata, the original data associated with that metadata according to the requirements of the user side.
The metadata is identified based on the index keywords, and the original data corresponding to the metadata is used as feedback data, which avoids the excessive consumption of server computing power caused by traversing all metadata in the database and the corresponding original data, and improves the efficiency of obtaining the feedback data. Moreover, because all original data associated with the index keywords is obtained based on those keywords, the feedback data related to the index keywords can be returned comprehensively without controlling the quality of the index keywords, which ensures the comprehensiveness of the feedback data so that the feedback data obtained by the user side can meet the retrieval requirement.
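The retrieval flow of this embodiment can be sketched with a small in-memory stand-in for the graph database. Everything here is an assumption made for illustration: the `DataIndex` class, its method names, and the exact-match lookup of the index keyword are hypothetical, and a real deployment would use an actual graph database and its query language.

```python
from collections import deque


class DataIndex:
    """Toy in-memory stand-in for a graph database of meta and data nodes."""

    def __init__(self):
        self.meta = {}      # meta node id -> metadata value
        self.data = {}      # data node id -> original data
        self.adjacent = {}  # node id -> set of associated node ids

    def add_meta(self, node_id, value):
        self.meta[node_id] = value
        self.adjacent.setdefault(node_id, set())

    def add_data(self, node_id, original):
        self.data[node_id] = original
        self.adjacent.setdefault(node_id, set())

    def associate(self, a, b):
        # Associations are treated as undirected (an assumption).
        self.adjacent[a].add(b)
        self.adjacent[b].add(a)

    def retrieve(self, keyword):
        """Return original data directly or indirectly associated with keyword."""
        index_nodes = [n for n, v in self.meta.items() if v == keyword]
        seen = set(index_nodes)
        queue = deque(index_nodes)
        feedback = []
        while queue:
            node = queue.popleft()
            if node in self.data:
                feedback.append(self.data[node])
            for neighbour in sorted(self.adjacent[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return feedback
```

Only the part of the graph connected to the index nodes is visited, which is the source of the claimed saving over scanning every record. Note that this sketch follows every association, so it returns all data in the connected component; a practical implementation would probably bound the depth or direction of the traversal.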
Embodiment two:
the present embodiment is a specific application scenario of the first embodiment, and by this embodiment, the method provided by the present invention can be more clearly and specifically described.
Next, the method provided in this embodiment is described in detail, taking as an example a server running the data management and retrieval method that stores the original data in the form of data nodes, arranges the metadata of the original data, obtains a data index reflecting the association relationship between the original data and the metadata, identifies the data nodes directly associated and/or indirectly associated with the index nodes, extracts the original data in those data nodes, and uses the extracted original data as feedback data. It should be noted that the present embodiment is only exemplary and does not limit the scope of protection of the embodiments of the present invention.
Fig. 2 schematically illustrates an environment application diagram of a data management and retrieval method according to a second embodiment of the present application.
In the exemplary embodiment, the server 2 where the data management and retrieval method is located is connected to the client 4 through the network 3; the server 2 may provide services through one or more networks 3, and the networks 3 may include various network devices such as routers, switches, multiplexers, hubs, modems, bridges, repeaters, firewalls, proxy devices, and/or the like. The network 3 may include physical links such as coaxial cable links, twisted pair cable links, fiber optic links, combinations thereof, and/or the like. The network 3 may include wireless links, such as cellular links, satellite links, Wi-Fi links, and/or the like; the client 4 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, or other computer devices.
Fig. 3 is a flowchart of a specific method of a data management and retrieval method according to an embodiment of the present invention, and the method specifically includes steps S201 to S207.
S201: Receiving an extraction request sent by a user terminal, and acquiring original data from an original database according to the extraction request.
In order to achieve targeted extraction of the original data in the original database, the step obtains corresponding original data from the original database according to an extraction request sent by a user side, wherein the extraction request is provided with a data name and a data type and is used for extracting the original data corresponding to the data name and the data type in the original database.
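A minimal sketch of such an extraction request is shown below. It assumes the original database can be viewed as a list of records that each carry a name and a type; the `ExtractionRequest` class and the `name` and `type` keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionRequest:
    data_name: str  # name of the data to extract
    data_type: str  # type of the data to extract


def extract(original_database, request):
    """Return only the records matching both the data name and the data type."""
    return [
        record
        for record in original_database
        if record.get("name") == request.data_name
        and record.get("type") == request.data_type
    ]
```

Records that lack either key simply do not match, so they are left in the original database rather than raising an error.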
Illustratively, the raw data is data information describing an entity, and descriptions of raw data by different systems or platforms typically employ metadata of different dimensions, where the dimensions may include: subsystem dimension, application runtime dimension, deployment logic entity dimension, cluster dimension, instance dimension, host dimension, storage dimension, DB entity dimension, oracle dimension, redis dimension, mysql/postgresql dimension.
The metadata corresponding to the metadata category subsystem comprises: a main key (uuid), a subsystem ID (idItmisSubSystem), an english name (enNameAbbr), an english name (enName), a chinese name (cnName), a system status (status), an importance level (important grade), whether outsourcing (isOutsourcing), a vendor (suppler), whether extranet access (iseinternet), a system architecture (frame), a development language (devLanguage), a service window (serviceWindow), and a subsystem profile (subsysDesc).
The metadata corresponding to the deployment logic entity in the meta category comprises: a primary key (uuid), a subsystem primary key (subsysId), an IAMS LE primary key (leId), a deployment subsystem (dssId), a subsystem code (subsymcode), a deployment entity code (leCode), a deployment entity name (leName), a security zone name (zonername), a LE type (typeName), a database type (dbType), a call type (invokeType), a state (stateName), a subsystem state (subsymtatus), a type (leType).
The metadata corresponding to the application running environment by the meta category comprises: a main key (uuid), an application english name (enName), an application chinese name (arenamercn), an environment type (environment), a deployment name (ARE name), a status (status), whether to apply standard (iscipline), and non-standardization (non-standardization).
The metadata corresponding to the meta category for the cluster comprises: a main key (uuid), cluster name (clusteriname), deployment unit name (depoyname), cluster software (clusterift), HA type (haType), status (status), availability zone (availabilityZone), network area (area), cluster VIP (VIP), VIP port (vipPort), IP type (ipType), DNS (DNS), DNS type (dnsType), https (Https), config (Config), user type (ownerType).
The metadata corresponding to the meta category as the instance comprises: a primary key (uuid), an instance name (instanceName), an instance port (port), a protocol (portocol), a middleware type (type).
The metadata corresponding to the metadata category as the host comprises: a home key (uuid), a subsystem standard name (hostsubsystemmname), an OS name (hostName), NAT IP (natIp), IP (ipAddress), NAS IP (nasIp), OS class (hostType), host area (middleware specific) (hostArea), local disk (GB) (localdskkb), and foreign IP (hostout IP).
The meta-category storing the corresponding meta-data includes: a primary key (uuid), a volume name (volumeName), a format (format), a path (path), a capacity (capability), a storage name (storage name), a storage type (storage type), a storage path (storage path), and whether to mount (ismount).
The metadata corresponding to the metadata category DB entity comprises: a primary key (uuid), a database entity name (entity name), an entity english name (englishDesc), an entity chinese name (chineseDesc), a database block size (blockSize), whether there is a backup (isBackup), whether there is GG (isGg), a database type (databaseType), a character set (characterSet), a data version (dataVersion).
Metadata corresponding to the metadata category Oracle comprises: a primary key (uuid), a database name (databaseName), an environment type (environment), a database unique name (uniqueName), a state (Status), an architecture type (architecture name), a storage type (storage type), a database block size (blockSize), db_ Domain (dbDomain), db id (dbid).
Metadata corresponding to the meta category of Redis comprises: a primary key (uuid), a database name (databaseName), an instance name (instanceName), an instance role (defaultresolution), an instance port (instancePort), an instance state (status), an environment type (environment), an HA type (haType), an instance character set (characterSet), VIP (serverVip), a high availability architecture (iskeepalive).
Metadata for mysql/postgresql metadata includes: a primary key (uuid), a database name (databaseName), an instance name (instanceName), an instance role (defaultRole), an instance port (instancePort), an instance state (status), an environment type (environment), an HA type (haType), an instance character set (characterSet), VIP (serverVip).
In this embodiment, the original database is a unified CMDB and application mapping (UCMDB) module (UCMDB commonly stands for Universal CMDB), which is a bottom-layer support module of the service availability center and is mainly used for managing and storing all configuration management information. The configuration information of each resource item can thus be stored permanently in the CMDB, and the accuracy of the data can be ensured through tool-based inspection, for example by discovering changes to the system configuration.
S202: Cleaning the original data to delete invalid data and its metadata, and missing values and their metadata, from the original data.
Invalid data and missing values would otherwise distort the data indexing operation: if the data matched to the metadata entered by the user side were invalid or null, original data containing invalid or missing values would be returned and retrieval accuracy would fall. This step therefore cleans the original data to delete the invalid data, the missing values and their metadata.
In a preferred embodiment, the step of performing data cleansing on the raw data includes:
S21: Identifying invalid values in the original data, querying the metadata corresponding to the invalid values, and deleting the invalid values and their metadata from the original data.
In this step, an invalid standard value is obtained and compared with the data values in the original data; any data value consistent with the invalid standard value is set as an invalid value, the metadata corresponding to the invalid value is obtained, and the invalid value and its metadata are deleted, so that invalid values do not affect the data query operation.
S22: Identifying missing values in the original data, querying the metadata corresponding to the missing values, and deleting the missing values and their metadata from the original data.
In this step, data values that are null in the original data are identified and set as missing values, the metadata corresponding to the missing values is obtained, and the missing values and their metadata are deleted, so that missing values do not affect the data query operation.
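As a non-limiting illustration, steps S21 and S22 can be sketched as follows. The record layout (a mapping from metadata key to data value), the function name `clean_record` and the particular invalid standard values are assumptions made for this example only; the source does not specify them.

```python
from typing import Any

# Hypothetical "invalid standard values"; the source does not list them.
INVALID_STANDARD_VALUES = {"N/A", "unknown", "-"}


def clean_record(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record without invalid or missing values.

    Dropping a key removes both the value and the metadata it belongs to.
    """
    cleaned: dict[str, Any] = {}
    for meta_key, value in record.items():
        # S22: a null (or empty) value is treated as a missing value.
        if value is None or value == "":
            continue
        # S21: a value equal to an invalid standard value is an invalid value.
        if isinstance(value, str) and value in INVALID_STANDARD_VALUES:
            continue
        cleaned[meta_key] = value
    return cleaned


raw = {"uuid": "h-001", "hostName": "app01", "natIp": None, "hostArea": "N/A"}
cleaned = clean_record(raw)
print(cleaned)
```

In this sketch `natIp` is removed as a missing value and `hostArea` as an invalid value, so only `uuid` and `hostName` remain.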
S203: Storing the original data in a preset graph database in the form of data nodes, arranging the metadata of the original data to obtain a meta structure that reflects, in the form of meta nodes, the association relationships between the metadata, and associating the meta structure with the data nodes to obtain a data index reflecting the association relationships between the original data and the metadata; wherein the graph database is a non-relational database that stores relationship information between entities using graph theory.
In order to identify the association relationships between the original data and the metadata, so that the original data associated with given metadata can be retrieved through that metadata as the user side requires, this step stores the original data in a preset graph database, arranges the metadata of the original data in the graph database to obtain a meta structure, and constructs, from the meta structure and the data nodes, a data index reflecting the association relationships between the original data and the metadata. Because the data index reflects these association relationships, it is only necessary to obtain the node corresponding to the index keyword from the data index and set it as the index node; the data nodes directly and/or indirectly associated with the index node can then be obtained, and the original data in those data nodes is set as feedback data and sent to the user side. Direct association means that two nodes have an association relationship with each other directly; indirect association means that two nodes are related through other nodes.
It should be noted that the graph database is a type of NoSQL (non-relational) database that uses graph theory to store relationship information between entities.
In a preferred embodiment, arranging metadata of the original data to obtain a meta structure reflecting an association relationship between the metadata in the form of a meta node includes:
S31: Establishing a dimension data tree reflecting the logical relationships between meta-categories, and constructing, according to the dimension data tree, a structure table that takes the meta-categories as classification items; wherein a meta-category is category information used to classify metadata so that the metadata forms a hierarchical relationship.
In this step, the dimension data tree is constructed by a developer at the user side. It is a tree-shaped data structure reflecting the logical relationships between the meta-categories, and it is entered into a preset relational database to obtain a structure table presented in tabular form. Because the meta-categories are built as required and the dimension data tree reflects the logical relationships among them, each piece of metadata can be arranged according to the user's needs, which makes it convenient to build a data index that fits the usage scenario of the user side.
Illustratively: the dimension data tree includes a first level, a second level, a third level, and a fourth level; the first-level meta-category is a subsystem category, the second-level meta-category is a logical entity category, the third-level meta-category comprises a database category, a middleware category and a load category, and the fourth-level meta-category is an instance category.
The metadata corresponding to the subsystem category comprises: English name, Chinese name, allocation proportion, application state and center classification;
the metadata corresponding to the logical entity category comprises: leCode, le description and network area;
the metadata corresponding to the database category comprises: database entity name, entity Chinese name, database type, database name, detailed version, database environment, database instance name, database domain name, database instance port, database VIP, and supervisor DA;
the metadata corresponding to the middleware category comprises: middleware type (type) and host region (middleware specific);
the metadata corresponding to the load category comprises: cluster name, deployment environment and environment type;
the metadata corresponding to the instance category comprises: application instance name, application instance state, application hostname, application hostip and hostenvironment.
S32: Extracting the metadata of the original data in the graph database, and entering the metadata and its corresponding information in the original data into the category item corresponding to the metadata in the structure table, to obtain a dimension table.
In this step, the original data is acquired from each data node in the graph database, the metadata in the original data is extracted, and the metadata is entered into the meta-category item of the structure table that corresponds to its category.
Illustratively, the metadata corresponding to the subsystem in the original data, together with its corresponding information, is entered into the subsystem item of the structure table; the metadata corresponding to the logical entity and its corresponding information are entered into the logical entity item; the metadata corresponding to the database and its corresponding information are entered into the database item; the metadata corresponding to the middleware and its corresponding information are entered into the middleware item; and the metadata corresponding to the load and its corresponding information are entered into the load item.
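As a non-limiting illustration, filling the structure table to obtain the dimension table (S32) might look like the following sketch. The in-memory `STRUCTURE_TABLE`, the field names and the function `build_dimension_table` are hypothetical stand-ins; the source stores the structure table in a relational database and does not prescribe a layout.

```python
from collections import defaultdict

# Hypothetical structure table: meta-category -> metadata fields filed under it.
STRUCTURE_TABLE = {
    "subsystem": ["chineseName"],
    "logicalEntity": ["leCode"],
    "database": ["entityName"],
    "middleware": ["type"],
    "load": ["clusterName"],
}


def build_dimension_table(records):
    """Enter each metadata field and its corresponding information
    under the category item that owns the field."""
    table = defaultdict(lambda: defaultdict(set))
    for record in records:
        for category, fields in STRUCTURE_TABLE.items():
            for field in fields:
                if field in record:
                    table[category][field].add(record[field])
    return table


records = [
    {"chineseName": "system 1", "leCode": "entity 2", "type": "middleware 2"},
    {"chineseName": "system 2", "leCode": "entity 2"},
]
dimension_table = build_dimension_table(records)
print(sorted(dimension_table["subsystem"]["chineseName"]))
```

Each category item ends up holding the distinct values observed for its metadata, which is the information later used to create one meta node per value.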
S33: Constructing, according to the dimension table, meta nodes representing the meta-categories and their metadata, and constructing the association relationships among the meta nodes according to the logical relationships among the meta-categories, to obtain a meta structure.
In this step, the meta-category items in the dimension table are obtained, meta nodes representing the metadata under each meta-category item are constructed, and the logical relationships among the meta nodes are constructed according to the logical relationships among the meta-categories in the dimension table, thereby forming the meta structure.
Illustratively, the metadata under the subsystem, logical entity, database, middleware, load and instance categories is subdivided according to its corresponding information in the original data; for ease of understanding, the corresponding information is numbered 1, 2, 3. For example, if the subsystem category in the dimension table has the metadata "Chinese name" with corresponding information system 1, system 2 and system 3, then the meta nodes "subsystem category: Chinese name: system 1", "subsystem category: Chinese name: system 2" and "subsystem category: Chinese name: system 3" are constructed. Further, when the meta structure is constructed, the meta-category serves as a primary node, the metadata under the meta-category serves as a secondary node, and the corresponding information of the metadata serves as an associated connection line of the secondary node, which connects to the primary node of the next level or to a data node. For example, the subsystem category is a primary node, the Chinese name is a secondary node, and system 1, system 2 and system 3 are the associated connection lines of the secondary node, used to connect to the primary node of the logical entity category. By analogy, the resulting meta structure is: {subsystem category: Chinese name: system 1, system 2, system 3} - {logical entity category: leCode: entity 1, entity 2, entity 3} - {database category: database entity name: database 1, database 2, database 3; middleware category: middleware type: middleware 1, middleware 2; load category: cluster name: load 1, load 2} - {instance category: application instance name: instance 1, instance 2}.
In a preferred embodiment, said associating said meta-structure with said data node to obtain a data index reflecting an association relationship between said original data and said meta-data comprises:
S34: Extracting the metadata of the original data in the graph database and the corresponding information of the metadata in the original data, obtaining the meta node corresponding to the metadata from the meta structure, and setting the meta node as a target node.
In this step, the metadata of the original data is acquired, and the meta nodes corresponding to the metadata are acquired in turn from the graph database.
Illustratively, assume that for a piece of original data the corresponding information of the Chinese name metadata in the subsystem category is system 1, of the leCode metadata in the logical entity category is entity 2, of the database entity name metadata in the database category is database 1, of the middleware type metadata in the middleware category is middleware 2, of the cluster name metadata in the load category is load 1, and of the application instance name metadata in the instance category is instance 2.
The following target nodes are then obtained. The first-level target node is meta node 1: primary node subsystem category, secondary node Chinese name, associated connection line system 1;
the second-level target node is meta node 2: primary node logical entity category, secondary node leCode, associated connection line entity 2;
the third-level target node is meta node 3, which comprises: primary node database category, secondary node database entity name, associated connection line database 1; and
primary node middleware category, secondary node middleware type, associated connection line middleware 2; and
primary node load category, secondary node cluster name, associated connection line load 1;
the fourth-level target node is meta node 4: primary node instance category, secondary node application instance name, associated connection line instance 2.
S35: Arranging the target nodes according to the logical relationships in the dimension table to form a node chain, and associating the target node at the end of the node chain with the data node to obtain the data index.
Illustratively, continuing the above example, meta node 1, meta node 2, meta node 3 and meta node 4 are connected in sequence through their associated connection lines, and the associated connection line of meta node 4 is connected to the data node; the target nodes are thereby associated with the data node to form the data index of that data node.
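As a non-limiting illustration, steps S34 and S35 can be sketched with a plain adjacency map in place of a graph database. The tuple node encoding, `CATEGORY_ORDER` and `build_data_index` are assumptions of this example, and the third level is simplified to the database category only (the middleware and load categories are omitted for brevity).

```python
from collections import defaultdict

# Logical order of meta-categories, standing in for the dimension data tree.
CATEGORY_ORDER = ["subsystem", "logicalEntity", "database", "instance"]


def build_data_index(records):
    """Return an adjacency map that chains meta nodes down to data nodes."""
    edges = defaultdict(set)
    for data_id, record in records.items():
        # S34: one target (meta) node per category present in the record.
        chain = [
            ("meta", category, record[category])
            for category in CATEGORY_ORDER
            if category in record
        ]
        # S35: connect consecutive target nodes to form the node chain ...
        for parent, child in zip(chain, chain[1:]):
            edges[parent].add(child)
        # ... and associate the node at the end of the chain with the data node.
        if chain:
            edges[chain[-1]].add(("data", data_id))
    return edges


records = {
    "row-1": {
        "subsystem": "system 1",
        "logicalEntity": "entity 2",
        "database": "database 1",
        "instance": "instance 2",
    },
}
index = build_data_index(records)
print(index[("meta", "instance", "instance 2")])
```

Because meta nodes are keyed by category and value, records that share "system 1" reuse the same meta node, so the chains of different data nodes merge into one index.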
S204: Extracting the meta nodes in the data index; and
extracting the meta-categories in the meta nodes, setting a category input box for each meta-category, constructing a category index page containing the meta-categories and the category input boxes, and sending the category index page, which is used for entering the index keyword, to the user side; or
extracting the metadata in the meta nodes, setting a meta input box for the metadata, constructing a meta index page containing the metadata and the meta input boxes, and sending the meta index page, which is used for entering the index keyword, to the user side.
In this step, providing the category index page to the user side allows the user side to enter the index keyword in a category input box; the meta node corresponding to the index keyword is then obtained according to the index keyword and the meta-category (for example, the subsystem category) corresponding to that category input box, where the index keyword is consistent with or matches the meta-category in the meta node. This improves the efficiency of identifying index nodes.
Likewise, providing the meta index page to the user side allows the user side to enter the index keyword in a meta input box; the meta node corresponding to the index keyword is then obtained according to the index keyword and the metadata (for example, Chinese name) corresponding to that meta input box, where the index keyword is consistent with or matches the metadata in the meta node. This improves the efficiency of identifying index nodes.
S205: Extracting the meta nodes in the data index, extracting the meta-categories and metadata of the meta nodes, setting for each piece of metadata a selection box for enabling and disabling it, constructing a search input box associated with the selection boxes, creating a selectable index page containing the meta-categories, the metadata and their selection boxes, and the search input box associated with the selection boxes, and sending the selectable index page, which is used for entering the index keyword, to the user side.
In this step, providing the selectable index page to the user side allows the user side to enter the index keyword in the search input box and to enable some metadata while disabling the rest; the meta nodes corresponding to the enabled metadata are then traversed according to the index keyword to obtain the index node corresponding to the index keyword, which improves the efficiency of identifying index nodes.
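As a non-limiting illustration, the effect of the selection boxes on the traversal can be sketched as a filter over meta nodes. The flat list `META_NODES`, its field names and the function `find_index_nodes` are hypothetical; exact string equality is used here as one possible reading of "consistent with or matching".

```python
# Hypothetical flattened view of the meta nodes in the data index.
META_NODES = [
    {"category": "subsystem", "metadata": "chineseName", "value": "system 1"},
    {"category": "subsystem", "metadata": "englishName", "value": "system 1"},
    {"category": "database", "metadata": "databaseName", "value": "orders"},
]


def find_index_nodes(keyword, enabled):
    """Traverse only meta nodes whose metadata is enabled on the page."""
    return [
        node
        for node in META_NODES
        if node["metadata"] in enabled and node["value"] == keyword
    ]


# The user enables "chineseName" and "databaseName" and disables "englishName".
hits = find_index_nodes("system 1", enabled={"chineseName", "databaseName"})
print(hits)
```

Disabling "englishName" means the second meta node is never compared with the keyword, which is where the reduction in traversal work comes from.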
S206: receiving index keywords sent by a user terminal, traversing meta nodes corresponding to the index keywords in the data indexes, setting the meta nodes as index nodes, identifying data nodes directly associated and/or indirectly associated with the index nodes, extracting original data in the data nodes, and sending the original data as feedback data to the user terminal; the direct association refers to a situation that the meta node and the data node directly have an association relationship, and the indirect association refers to a situation that the meta node and the data node have an association relationship formed by other meta nodes or data nodes.
In this step, the metadata is identified based on the index keyword and the original data corresponding to that metadata is used as feedback data. This avoids the excessive consumption of server computing power that would result from traversing all the metadata in the database and its corresponding original data, and improves the efficiency of obtaining the feedback data. The feedback data related to the index keyword can also be returned comprehensively without controlling the quality of the index keyword, which ensures the comprehensiveness of the feedback data so that the feedback data obtained by the user side meets the retrieval requirement.
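As a non-limiting illustration, the retrieval in S206 can be sketched as a breadth-first traversal starting from the index node. The string node identifiers, the sample `DATA_INDEX` and `RAW_DATA`, and the function `retrieve` are assumptions of this example; a graph database would normally perform the traversal through its own query language.

```python
from collections import deque

# Hypothetical data index: node -> directly associated nodes.
DATA_INDEX = {
    "meta:system 1": ["meta:entity 2"],
    "meta:entity 2": ["meta:database 1"],
    "meta:database 1": ["data:row-1", "data:row-2"],
    "meta:system 2": ["data:row-3"],
}
# Hypothetical original data held by the data nodes.
RAW_DATA = {
    "data:row-1": {"hostName": "app01"},
    "data:row-2": {"hostName": "app02"},
    "data:row-3": {"hostName": "app03"},
}


def retrieve(keyword):
    """Collect original data from data nodes directly or indirectly
    associated with the meta node that matches the keyword."""
    index_node = f"meta:{keyword}"
    if index_node not in DATA_INDEX:
        return []
    seen = {index_node}
    queue = deque([index_node])
    feedback = []
    while queue:
        node = queue.popleft()
        for neighbour in DATA_INDEX.get(node, []):
            if neighbour in seen:
                continue
            seen.add(neighbour)
            queue.append(neighbour)
            if neighbour in RAW_DATA:
                feedback.append(RAW_DATA[neighbour])
    return feedback


print(retrieve("system 1"))
```

Only nodes reachable from the index node are visited, so "row-3" under "system 2" is never touched when the keyword is "system 1".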
S207: Receiving an index keyword, an index target word and index destination information sent by the user side, identifying the meta node in the data index according to the index keyword and setting it as an index node, identifying the meta nodes or data nodes in the data index according to the index target word and setting them as target nodes, identifying, according to the index destination information, the number of target nodes directly and/or indirectly associated with the index node, and sending the number to the user side.
In this step, the index keyword is the object the user side wishes to retrieve, the index target word identifies the meta nodes or data nodes associated with that object, and the index destination information requests their number. The user side can thus intuitively obtain, as required, the distribution of metadata across the meta-categories, the distribution of original data across the metadata, the distribution of original data across the meta-categories, and so on, in the graph database, which extends the range of application of the graph database and its data index.
Illustratively, suppose the received request is "how many pieces of original data exist in system 1", in which the index keyword is "system 1", the index target word is "original data" and the index destination information is "how many". The meta node "subsystem category - Chinese name - system 1" in the data index is then set as the index node, the number of data nodes directly and indirectly associated with the index node is calculated according to the index destination information, and the number is sent to the user side.
If the received request is "how many middleware 2 does system 1 have", the index keyword is "system 1", the index target word is "middleware 2" and the index destination information is "how many". The meta node "subsystem category - Chinese name - system 1" in the data index is then set as the index node, the number of "middleware category: middleware type: middleware 2" meta nodes directly and indirectly associated with the index node is calculated according to the index destination information, and the number is sent to the user side.
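As a non-limiting illustration, the counting query of S207 can be sketched as follows. The node identifiers, the labels in `NODE_LABELS` (including labelling every data node "original data"), the edges and the function `count_targets` are assumptions of this example, and only the "how many" form of index destination information is handled.

```python
from collections import deque

# Hypothetical labelled nodes: node id -> label used for matching.
NODE_LABELS = {
    "n0": "system 1",
    "n1": "entity 1",
    "n2": "entity 2",
    "n3": "middleware 2",
    "n4": "middleware 2",
    "n5": "middleware 1",
    "d1": "original data",
    "d2": "original data",
    "d3": "original data",
}
# Hypothetical associations between nodes.
EDGES = {
    "n0": ["n1", "n2"],
    "n1": ["n3"],
    "n2": ["n4", "n5"],
    "n3": ["d1"],
    "n4": ["d2"],
    "n5": ["d3"],
}


def count_targets(keyword, target_word):
    """Count nodes labelled target_word that are directly or indirectly
    associated with the node(s) labelled keyword."""
    starts = [node for node, label in NODE_LABELS.items() if label == keyword]
    seen = set(starts)
    queue = deque(starts)
    count = 0
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
            if NODE_LABELS[nxt] == target_word:
                count += 1
    return count


print(count_targets("system 1", "original data"))
print(count_targets("system 1", "middleware 2"))
```

With this sample graph the first call counts the three data nodes under system 1 and the second counts the two "middleware 2" meta nodes, mirroring the two example requests above.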
In this embodiment, receiving the index keyword, the index target word and the index destination information sent by the user side includes:
sending a search index page, or the selectable index page, or the category index page, or the meta index page to the user side, and receiving the index keyword sent by the user side through the search index page, or the selectable index page, or the category index page, or the meta index page; the search index page has an index input box in which the user side enters the index keyword;
sending an index target page to the user side, and receiving the index target word sent by the user side through the index target page; the index target page has a target input box in which the user side enters the index target word; and
sending an index destination page to the user side, and receiving the index destination information sent by the user side through the index destination page; the index destination page has a destination input box in which the user side enters the index destination information.
In a preferred embodiment, the receiving the index keyword and the index target word sent by the user terminal includes:
S71: sending a search index page, or the selectable index page, or the category index page, or the meta index page to a user terminal, and receiving index information sent by the user terminal through the search index page, or the selectable index page, or the category index page, or the meta index page; the search index page is provided with an index input box, and is used for inputting the index information by the user side;
S72: identifying subject information in the index information through a preset natural language model, and setting the subject information as the index keyword;
S73: identifying object information in the index information through the natural language model, and setting the object information as the index target word;
S74: identifying predicate information in the index information through the natural language model, and setting the predicate information as the index destination information.
It should be noted that the natural language model is a natural language processing (NLP) algorithm that includes a syntactic analysis algorithm for determining the syntactic structure of a sentence or the dependency relationships between the words in the sentence. Syntactic analysis is divided into two types, syntactic structure analysis and dependency relationship analysis: analysis aimed at obtaining the syntactic structure of the whole sentence is called complete syntactic analysis, analysis aimed at obtaining local components is called local analysis, and dependency relationship analysis is referred to as dependency analysis. The syntactic analysis algorithm identifies the syntactic structure of the index information and the dependency relationships among its words, and thereby identifies the subject information, the predicate information and the object information. In this embodiment, PCFG (probabilistic context-free grammar), lexicalized PCFG (a PCFG conditioned on lexical items), or transition-based parsing (which assembles a syntax tree through a sequence of greedy decision actions) may be employed as the syntactic analysis algorithm.
Preferably, after sending the number to the user side, the method further includes:
uploading the number of target nodes to a blockchain.
It should be noted that corresponding digest information is obtained from the number of target nodes; specifically, the digest information is obtained by hashing the number of target nodes, for example with the SHA-256 algorithm. Uploading the digest information to the blockchain ensures its security and its fairness and transparency to the user. The user device may download the digest information from the blockchain to verify whether the number of target nodes has been tampered with. The blockchain referred to in this example is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, each data block containing a batch of network transaction information that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
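As a non-limiting illustration, the digest computation and the later tamper check can be sketched with Python's standard `hashlib` module. Encoding the number as its decimal string in UTF-8 is an assumption of this example, as are the function names; the source does not specify the serialization, and the upload to and download from the blockchain are outside the sketch.

```python
import hashlib


def digest_of_count(count: int) -> str:
    """Hash the number of target nodes with SHA-256 and return the hex digest."""
    return hashlib.sha256(str(count).encode("utf-8")).hexdigest()


def is_untampered(count: int, stored_digest: str) -> bool:
    """Recompute the digest and compare it with the one read back from the chain."""
    return digest_of_count(count) == stored_digest


stored = digest_of_count(3)          # digest that would be uploaded
print(is_untampered(3, stored))      # the reported number matches the digest
print(is_untampered(4, stored))      # an altered number no longer matches
```

The verifying party only needs the reported number and the digest from the chain; any change to the number produces a different digest.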
Embodiment III:
referring to fig. 4, a data management and retrieval device 1 of the present embodiment includes:
the data input module 11 is configured to receive an extraction request sent by a user terminal, and acquire original data from an original database according to the extraction request;
the index construction module 13 is configured to store the original data in a preset graphic database and store the original data in the form of data nodes, arrange metadata of the original data to obtain a meta structure reflecting an association relationship between the metadata in the form of the meta nodes, and associate the meta structure with the data nodes to obtain a data index reflecting the association relationship between the original data and the metadata; wherein, the graph database is a non-relational database for storing the relational information among entities through graph theory;
the data retrieval module 16 is configured to receive an index keyword sent by a user terminal, traverse a meta node corresponding to the index keyword in the data index, set the meta node as an index node, identify a data node directly associated and/or indirectly associated with the index node, extract original data in the data node, and send the original data as feedback data to the user terminal; the direct association refers to a situation that the meta node and the data node directly have an association relationship, and the indirect association refers to a situation that the meta node and the data node have an association relationship formed by other meta nodes or data nodes.
Optionally, the data management and retrieval device 1 further includes:
the data cleaning module 12 is configured to perform data cleaning on the original data, and is configured to delete invalid data and metadata thereof, and missing values and metadata thereof in the original data.
Optionally, the data cleansing module 12 further includes:
an invalid cleaning unit 121, configured to identify an invalid value in the original data and query metadata corresponding to the invalid value, and delete the invalid value of the original data and the metadata thereof;
the missing-value cleaning unit 122 is configured to identify a missing value in the original data, query the metadata corresponding to the missing value, and delete the missing value and its metadata from the original data.
Optionally, the index building module 13 further includes:
a structural member unit 131, configured to establish a dimension data tree reflecting a logical relationship between meta-categories, and construct a structural table using the meta-categories as classification items according to the dimension data tree; wherein the meta category is category information for classifying metadata so that the metadata forms a hierarchical relationship;
a dimension construction unit 132, configured to extract metadata of original data in the graphic database, and enter the metadata and corresponding information thereof in the original data into a category item corresponding to the metadata in the structure table to obtain a dimension table;
A meta structure construction unit 133, configured to construct meta nodes representing the meta categories and metadata thereof according to the dimension table, and construct association relations between the meta nodes according to the logical relations between the meta categories to obtain a meta structure;
a node identifying unit 134, configured to extract metadata of original data in the graphic database and corresponding information of the metadata in the original data, obtain a meta node corresponding to the metadata from the meta structure, and set the meta node as a target node;
and the index construction unit 135 is configured to arrange the target nodes according to a logical relationship in the dimension table to form a node chain, and associate the target nodes located at the end of the node chain with the data nodes to obtain the data index.
Optionally, the data management and retrieval device 1 further includes:
a node page construction module 14, configured to extract the meta nodes in the data index; and
to extract the meta-categories in the meta nodes, set a category input box for each meta-category, construct a category index page containing the meta-categories and the category input boxes, and send the category index page, which is used for entering the index keyword, to the user side; or
to extract the metadata in the meta nodes, set a meta input box for the metadata, construct a meta index page containing the metadata and the meta input boxes, and send the meta index page, which is used for entering the index keyword, to the user side.
Optionally, the data management and retrieval device 1 further includes:
the selectable page construction module 15 is configured to extract the meta nodes in the data index, extract the meta-categories and metadata of the meta nodes, set for the metadata a selection box for enabling and disabling the metadata, construct a search input box associated with the selection box, create a selectable index page containing the meta-categories, the metadata and its selection box, and the search input box associated with the selection box, and send the selectable index page, which is used for entering the index keyword, to the user side.
Optionally, the data management and retrieval device 1 further includes:
the number retrieval module 17 is configured to receive an index keyword, an index target word, and index destination information sent by the user side, identify the meta node in the data index according to the index keyword and set it as an index node, identify the meta nodes or data nodes in the data index according to the index target word and set them as target nodes, identify, according to the index destination information, the number of target nodes directly and/or indirectly associated with the index node, and send the number to the user side.
Optionally, the number retrieval module 17 further includes:
the page sending unit 171 is configured to send a search index page, or the selectable index page, or the category index page, or the meta index page to a user side, and receive index information sent by the user side through the search index page, or the selectable index page, or the category index page, or the meta index page; the search index page is provided with an index input box, and is used for inputting the index information by the user side;
a keyword recognition unit 172 for recognizing subject information in the index information through a preset natural language model and setting the subject information as the index keyword;
a target recognition unit 173 for recognizing object information in the index information by the natural language model and setting the object information as the index target word;
an index destination unit 174 for identifying predicate information in the index information by the natural language model, and setting the predicate information as the index destination information.
The technical scheme is applied to the field of big data processing. Original data is obtained from an original database according to a received extraction request and stored in a preset graph database in the form of data nodes; the metadata of the original data is arranged to obtain a meta structure reflecting, in the form of meta nodes, the association relationships between the metadata; and the meta structure is associated with the data nodes to obtain a data index reflecting the association relationships between the original data and the metadata. An index keyword sent by the user side is received, the meta node corresponding to the index keyword is found by traversing the data index and set as the index node, and the data nodes directly and/or indirectly associated with the index node are identified, so that a tree-table style of data query is realized; the original data in those data nodes is then extracted and sent to the user side as feedback data.
Embodiment four:
In order to achieve the above objective, the present invention further provides a computer device 5. The components of the data management and retrieval device of the third embodiment may be distributed across different computer devices, and the computer device 5 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack-mounted server, a blade server, or a tower server (including a stand-alone server or a server cluster formed by a plurality of application servers), etc. The computer device of the present embodiment includes at least, but is not limited to, a memory 51 and a processor 52, which may be communicatively coupled to each other via a system bus, as shown in fig. 5. It should be noted that fig. 5 shows only a computer device with these components; not all of the illustrated components are required, and more or fewer components may be implemented instead.
In the present embodiment, the memory 51 (i.e., readable storage medium) includes a flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 51 may be an internal storage unit of a computer device, such as a hard disk or memory of the computer device. In other embodiments, the memory 51 may also be an external storage device of a computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like. Of course, the memory 51 may also include both internal storage units of the computer device and external storage devices. In this embodiment, the memory 51 is typically used to store an operating system installed in a computer device and various application software, such as program codes of the data management and retrieval device of the third embodiment. Further, the memory 51 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 52 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 52 is typically used to control the overall operation of the computer device. In this embodiment, the processor 52 is configured to execute the program code stored in the memory 51 or process data, for example, execute the data management and retrieval device, so as to implement the data management and retrieval methods of the first and second embodiments.
Embodiment five:
to achieve the above object, the present invention also provides a computer-readable storage medium such as a flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by the processor 52, performs the corresponding functions. The computer readable storage medium of the present embodiment is for storing a computer program for implementing the data management and retrieval method, and when executed by the processor 52, implements the data management and retrieval methods of the first and second embodiments.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by means of hardware, although in many cases the former is the preferred implementation.
The foregoing describes only preferred embodiments of the present invention and does not limit its scope; any equivalent structure or equivalent process derived from the disclosure herein, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.
Claims (9)
1. A method of data management and retrieval, comprising:
receiving an extraction request sent by a user terminal, and acquiring original data from an original database according to the extraction request;
identifying invalid values in the original data, inquiring metadata corresponding to the invalid values, and deleting the invalid values and their metadata from the original data;
identifying a missing value in the original data, inquiring metadata corresponding to the missing value, and deleting the missing value and its metadata from the original data;
storing the original data into a preset graph database in the form of data nodes, arranging the metadata of the original data to obtain a meta structure reflecting the association relation between the metadata in the form of meta nodes, and associating the meta structure with the data nodes to obtain a data index reflecting the association relation between the original data and the metadata; wherein the graph database is a non-relational database that stores relational information among entities by means of graph theory;
receiving index keywords sent by a user terminal, traversing meta nodes corresponding to the index keywords in the data index, setting the meta nodes as index nodes, identifying data nodes directly associated and/or indirectly associated with the index nodes, extracting original data in the data nodes, and sending the original data as feedback data to the user terminal; the direct association refers to the case in which the meta node and the data node have an association relationship with each other directly, and the indirect association refers to the case in which the association relationship between the meta node and the data node is formed through other meta nodes or data nodes;
sending a search index page, or an optional index page, or a category index page, or a meta index page to a user side, and receiving index information sent by the user side through the search index page, or the optional index page, or the category index page, or the meta index page; the search index page is provided with an index input box, and is used for inputting the index information by the user side;
identifying subject information in the index information through a preset natural language model, and setting the subject information as the index keyword;
identifying object information in the index information through the natural language model, and setting the object information as the index target word;
identifying predicate information in the index information through the natural language model, and setting the predicate information as the index destination information;
and identifying meta nodes in the data index according to the index keyword, setting the meta nodes as index nodes, identifying meta nodes or data nodes in the data index according to the index target word, setting the meta nodes or the data nodes as destination nodes, identifying the number of destination nodes directly associated and/or indirectly associated with the index nodes according to the index destination information, and sending the number to the user side.
2. The data management and retrieval method according to claim 1, wherein said arranging metadata of said original data to obtain a meta structure reflecting an association relationship between each of said metadata in the form of a meta node includes:
establishing a dimension data tree reflecting the logical relation between meta-categories, and constructing a structure table taking the meta-categories as classification items according to the dimension data tree; wherein the meta category is category information for classifying metadata so that the metadata forms a hierarchical relationship;
extracting metadata of the original data in the graph database, and entering the metadata and its corresponding information in the original data into the category item corresponding to the metadata in the structure table to obtain a dimension table;
and constructing meta nodes representing the meta categories and the meta data thereof according to the dimension table, and constructing association relations among the meta nodes according to the logic relations among the meta categories to obtain a meta structure.
3. The method for managing and retrieving data according to claim 2, wherein associating the meta structure with the data node to obtain a data index reflecting an association relationship between the original data and the meta data comprises:
extracting metadata of the original data in the graph database and the corresponding information of the metadata in the original data, obtaining the meta node corresponding to the metadata from the meta structure, and setting the meta node as a target node;
and arranging the target nodes according to the logical relation in the dimension table to form a node chain, and associating the target nodes positioned at the tail ends of the node chain with the data nodes to obtain the data index.
4. The method for managing and retrieving data according to claim 1, wherein before receiving the index keywords sent by the user terminal, the method further comprises:
extracting meta nodes in the data index;
extracting meta-categories in the meta nodes, setting category input boxes for the meta-categories, constructing a category index page with the meta-categories and the category input boxes, and sending the category index page for inputting the index keywords to a user side; or
extracting metadata in the meta nodes, setting a metadata input box for the metadata, constructing a meta index page with the metadata and the metadata input box, and sending the meta index page for inputting the index keywords to a user side.
5. The method for managing and retrieving data according to claim 1, wherein before receiving the index keywords sent by the user terminal, the method further comprises:
extracting meta nodes in the data index, extracting the meta categories and metadata of the meta nodes, setting for the metadata a selection frame for enabling or disabling that metadata, constructing a search input frame associated with the selection frame, creating an optional index page with the meta categories, the metadata and its selection frame, and the search input frame associated with the selection frame, and sending the optional index page for inputting the index keywords to a user side.
6. The data management and retrieval method according to claim 3, wherein after said sending the number to the user terminal, the method further comprises:
the number of target nodes is uploaded into the blockchain.
7. A data management and retrieval device, comprising:
the data input module is used for receiving an extraction request sent by a user terminal and acquiring original data from an original database according to the extraction request;
an invalid cleaning unit, configured to identify an invalid value in the original data and query metadata corresponding to the invalid value, and delete the invalid value of the original data and the metadata thereof;
the missing cleaning unit is used for identifying a missing value in the original data, inquiring metadata corresponding to the missing value, and deleting the missing value and its metadata from the original data;
the index construction module is used for storing the original data into a preset graph database in the form of data nodes, arranging metadata of the original data to obtain a meta structure reflecting the association relation between the metadata in the form of meta nodes, and associating the meta structure with the data nodes to obtain a data index reflecting the association relation between the original data and the metadata; wherein the graph database is a non-relational database that stores relational information among entities by means of graph theory;
the retrieval feedback module is used for receiving index keywords sent by a user side, traversing meta nodes corresponding to the index keywords in the data index, setting the meta nodes as index nodes, identifying data nodes directly associated and/or indirectly associated with the index nodes, extracting original data in the data nodes, and sending the original data to the user side as feedback data; the direct association refers to the case in which the meta node and the data node have an association relationship with each other directly, and the indirect association refers to the case in which the association relationship between the meta node and the data node is formed through other meta nodes or data nodes;
the quantity retrieval module is used for receiving index keywords, index target words and index destination information sent by a user side, identifying meta nodes in the data index according to the index keywords, setting the meta nodes as index nodes, identifying meta nodes or data nodes in the data index according to the index target words, setting the meta nodes or the data nodes as destination nodes, identifying the quantity of destination nodes directly associated and/or indirectly associated with the index nodes according to the index destination information, and sending the quantity to the user side;
the quantity retrieval module further includes:
the page sending unit is used for sending a search index page, or an optional index page, or a category index page, or a meta index page to the user side, and receiving index information sent by the user side through the search index page, or the optional index page, or the category index page, or the meta index page; the search index page is provided with an index input box, and is used for inputting the index information by the user side;
the key identification unit is used for identifying subject information in the index information through a preset natural language model and setting the subject information as the index key word;
a target recognition unit configured to recognize object information in the index information through the natural language model, and set the object information as the index target word;
and the index destination unit is used for identifying predicate information in the index information through the natural language model and setting the predicate information as the index destination information.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor of the computer device implements the steps of the data management and retrieval method of any one of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium having a computer program stored thereon, characterized in that the computer program stored on the readable storage medium, when executed by a processor, implements the steps of the data management and retrieval method of any of claims 1 to 6.
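As an illustration of the node-chain construction recited in claims 2 and 3, the following sketch orders one record's meta nodes by the depth of their meta category in a dimension data tree and associates the tail of the chain with the data node. It is a sketch under stated assumptions, not the claimed implementation: the tree is modeled as a child-to-parent dictionary, meta nodes as `(category, value)` tuples, and the names `category_depth`, `build_node_chain`, and `link_record` are hypothetical.

```python
def category_depth(tree, category):
    """Depth of a meta category in the dimension data tree (root = 0)."""
    depth = 0
    while tree[category] is not None:
        category = tree[category]
        depth += 1
    return depth


def build_node_chain(tree, metadata):
    """Order one record's (category, value) meta nodes from root category to leaf."""
    return sorted(metadata.items(), key=lambda item: category_depth(tree, item[0]))


def link_record(tree, metadata, data_node_id, edges):
    """Add chain edges to `edges` and associate the chain's tail with the data node."""
    chain = build_node_chain(tree, metadata)
    if not chain:
        raise ValueError("a record needs at least one metadata entry")
    for parent, child in zip(chain, chain[1:]):
        edges.add((parent, child))
    edges.add((chain[-1], data_node_id))
    return chain


# Hypothetical dimension data tree: each meta category maps to its parent category.
tree = {"country": None, "province": "country", "city": "province"}
edges = set()
chain = link_record(
    tree,
    {"city": "Shenzhen", "country": "China", "province": "Guangdong"},
    "d1",
    edges,
)
```

Attaching only the tail meta node to the data node means each record carries a single direct association, while broader categories reach it indirectly through the chain.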
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110724252.9A CN113434506B (en) | 2021-06-29 | 2021-06-29 | Data management and retrieval method, device, computer equipment and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110724252.9A CN113434506B (en) | 2021-06-29 | 2021-06-29 | Data management and retrieval method, device, computer equipment and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113434506A (en) | 2021-09-24 |
CN113434506B (en) | 2023-05-16 |
Family
ID=77757489
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110724252.9A Active CN113434506B (en) | 2021-06-29 | 2021-06-29 | Data management and retrieval method, device, computer equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113434506B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114168075B (en) * | 2021-11-29 | 2024-05-14 | 华中科技大学 | Method, equipment and system for improving load access performance based on data relevance |
CN115168661B (en) * | 2022-08-31 | 2022-12-02 | 深圳市一号互联科技有限公司 | Native graph data processing method, device, equipment and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110046062A (en) * | 2019-03-07 | 2019-07-23 | 佳都新太科技股份有限公司 | Distributed data processing method and system |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101433859B1 (en) * | 2007-10-12 | 2014-08-27 | 삼성전자주식회사 | Nonvolatile memory system and method managing file data thereof |
CN106682986A (en) * | 2016-12-27 | 2017-05-17 | 南京搜文信息技术有限公司 | Construction method of complex financial transaction network activity map based on big data |
CN111291152A (en) * | 2018-12-07 | 2020-06-16 | 北大方正集团有限公司 | Case document recommendation method, device, equipment and storage medium |
CN111008198B (en) * | 2019-11-22 | 2023-05-16 | 广联达科技股份有限公司 | Service data acquisition method and device, storage medium and electronic equipment |
CN111949831B (en) * | 2020-08-10 | 2023-08-08 | 中国工商银行股份有限公司 | Graphic database establishing method and device and readable storage medium |
CN111782824B (en) * | 2020-08-14 | 2024-04-19 | 中国工商银行股份有限公司 | Information query method, device, system and medium |
Also Published As
Publication number | Publication date |
---|---|
CN113434506A (en) | 2021-09-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US9558196B2 (en) | Automatic correlation of dynamic system events within computing devices | |
US20120124063A1 (en) | Method and system for specifying, preparing and using parameterized database queries | |
US10073876B2 (en) | Bloom filter index for device discovery | |
CN107391502B (en) | Time interval data query method and device and index construction method and device | |
US11249975B2 (en) | Data archiving method and system using hybrid storage of data | |
KR20160124744A (en) | Systems and methods for hosting an in-memory database | |
AU2019349429B2 (en) | Translation of tenant identifiers | |
CN113434506B (en) | Data management and retrieval method, device, computer equipment and readable storage medium | |
CN111400393B (en) | Data processing method and device based on multi-application platform and storage medium | |
CN117171108B (en) | Virtual model mapping method and system | |
CN114139040A (en) | Data storage and query method, device, equipment and readable storage medium | |
CN110807028B (en) | Method, apparatus and computer program product for managing a storage system | |
CN110162412B (en) | Method and device for performing data operation on client | |
CN105843809B (en) | Data processing method and device | |
CN115705313A (en) | Data processing method, device, equipment and computer readable storage medium | |
CN113721856A (en) | Digital community management data storage system | |
AU2019350694B2 (en) | Identification of records for post-cloning tenant identifier translation | |
US20150269086A1 (en) | Storage System and Storage Method | |
US11138075B2 (en) | Method, apparatus, and computer program product for generating searchable index for a backup of a virtual machine | |
CN110968267A (en) | Data management method, device, server and system | |
US20240303073A1 (en) | Software recognition using tree-structured pattern matching rules for software asset management | |
CN117931740A (en) | Catalog metadata operation method, device, electronic equipment and readable storage medium | |
CN115658652A (en) | Offline data migration method and device, readable storage medium and equipment | |
JP2015204057A (en) | Data processor, data processing method, and program | |
CN114661829A (en) | Data analysis system and method based on K-MEANS clustering algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |