CN115687438A - Data asset operation method and device, electronic equipment and storage medium - Google Patents

Data asset operation method and device, electronic equipment and storage medium

Info

Publication number
CN115687438A
Authority
CN
China
Prior art keywords
data
target
candidate
requirement
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110823573.4A
Other languages
Chinese (zh)
Inventor
胡皓月
杨文峰
杨涛
尹腾飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202110823573.4A priority Critical patent/CN115687438A/en
Publication of CN115687438A publication Critical patent/CN115687438A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a data asset operation method and device, electronic equipment, and a storage medium. The method comprises: acquiring a plurality of business requirements; for each business requirement, screening out a target data group corresponding to that business requirement from a data asset management model, the target data group comprising first table-level metadata and first field-level metadata in the data asset management model; determining the matching degree between each business requirement and the corresponding target data group; determining, among the plurality of business requirements, those whose occurrence frequency and corresponding matching degree satisfy a specific condition as asset optimization requirements; and optimizing the data asset management model based on the asset optimization requirements. Because the target data group is screened from the data asset management model according to the business requirements, a data demander can quickly find suitable data; and because the data asset management model can be dynamically optimized according to the business requirements, the value of the data assets is improved.

Description

Data asset operation method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of big data, and relates to but is not limited to a method and a device for operating data assets, electronic equipment and a storage medium.
Background
With the popularization of mobile communication, the number of public mobile communication base stations has multiplied, and the human communication activity carried by these base stations generates a large amount of terminal signaling data. Because this data associates the position of a person with the position of a base station, it naturally carries the three elements "person-place-time". It is an important data asset of mobile operators and may be referred to as mobile location data for short. Compared with spatio-temporal location data from other sources, this data asset has the advantages of long-distance coverage and all-weather acquisition. However, because the data volume is large, the value of mobile location data has not been sufficiently mined.
Disclosure of Invention
In view of this, embodiments of the present application provide an operation method and apparatus for a data asset, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present application provides an operation method for a data asset, where the method includes: acquiring a plurality of service requirements; for each service requirement, screening out a target data group corresponding to the service requirement from a data asset management model, the target data group comprising first table-level metadata and first field-level metadata in the data asset management model; determining the matching degree between each service requirement and the corresponding target data group; determining, among the plurality of service requirements, those whose occurrence frequency and corresponding matching degree satisfy a specific condition as asset optimization requirements; and optimizing the data asset management model based on the asset optimization requirements.
In a second aspect, an embodiment of the present application provides an apparatus for operating a data asset, including: an acquisition module, used for acquiring a plurality of service requirements; a screening module, used for screening out, for each service requirement, a target data group corresponding to the service requirement from the data asset management model, the target data group comprising first table-level metadata and first field-level metadata in the data asset management model; a first determining module, used for determining the matching degree between each service requirement and the corresponding target data group; a second determining module, used for determining, among the plurality of service requirements, those whose occurrence frequency and corresponding matching degree satisfy a specific condition as asset optimization requirements; and an optimization module, used for optimizing the data asset management model based on the asset optimization requirements.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program executable on the processor, and the processor executes the computer program to implement steps in an operation method of a data asset according to any of the embodiments of the present application.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the method for operating a data asset according to any of the embodiments of the present application.
In the embodiment of the application, the target data group can be screened out from the data asset management model according to the service requirements, so that a data demander can be helped to quickly find out the data suitable for use; according to the business requirements, the data asset management model can be dynamically optimized, and the value of the data asset can be improved.
Drawings
Fig. 1 is a schematic flowchart of a method for operating a data asset according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for building a data asset management model according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the data table modeling of a data asset management model according to an embodiment of the present application;
Fig. 4 is a flowchart of a cyclic operation management method for data assets according to an embodiment of the present application;
Fig. 5 is a flowchart of a detailed implementation of the cyclic operation management method shown in Fig. 4;
Fig. 6 is a schematic structural diagram of an operation device for a data asset according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an apparatus for building a data asset management model according to an embodiment of the present application;
Fig. 8 is a schematic diagram of a hardware entity of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solution of the present application is further elaborated below with reference to the drawings and the embodiments.
Fig. 1 is a schematic flowchart of an operation method of a data asset according to an embodiment of the present application, and as shown in fig. 1, the method includes:
Step 102: acquiring a plurality of service requirements;
The service requirement may also be referred to as a data application requirement or a data requirement, or simply a requirement, and may be a data reading and writing requirement of a data demander. For example, the service requirement may be the daily pedestrian volume and the monthly peak value for each scenic spot in Xi'an in every month of 2020; the service requirement may also be the preferred shopping period within one day of male customers in a certain mall in Yanta District, Xi'an.
Step 104: aiming at each business demand, screening out a target data group corresponding to the business demand from a data asset management model; the target data group comprises first table level metadata and first field level metadata in the data asset management model;
the data asset management model can be a data model used for managing and maintaining data assets, and is a base of a database; the data assets are data resources which are owned or controlled by individuals or enterprises, can bring future economic benefits to the enterprises and are recorded in a physical or electronic mode; the data resource may be metadata of terminal signaling data; the data asset management model is used for managing metadata of the terminal signaling data through a data table; the data table may be a logical table.
The data asset management model may include a data warehouse layer, a data mart layer, and a data application layer. The data warehouse layer may store preprocessed and well-organized mobile location detail data; the data mart layer may store summarized label data that is obtained by processing the data of the data warehouse layer and can be called directly by applications; the data application layer may store data obtained by processing the data of the data warehouse layer and/or the data mart layer, and may store intermediate calculation process tables for commonly invoked requirements.
The terminal signaling data may be communication data between the terminal user and the transmitting base station or the micro station; the terminal signaling data may also be referred to as mobile location data; the metadata of the terminal signaling data may be data for describing the terminal signaling data, and is mainly used for describing information of data attributes of the terminal signaling data, and is used for supporting functions such as indicating a storage location, history data, resource searching, file recording and the like; when data asset management is performed, the types of metadata of the terminal signaling data which can be related are various, such as technical metadata and service metadata of the terminal signaling data, and the data asset management model can manage the service metadata of the terminal signaling data through a data table.
The data warehouse layer, the data mart layer, and the data application layer comprise a plurality of data tables; each data table comprises table-level metadata and field-level metadata. The table-level metadata may include the hierarchy level to which the table belongs, the business scenario, the data timeliness, and the standard delay (which may also be referred to as data delay). The field-level metadata may include a category, a range, and a granularity. The category may be a three-element keyword (for example, person, place, or time), a label attribute, or other; the range may be a region range, a time span (also referred to as a time range), a person range (divided by number attribution or by some type of attribute), or a label value (for example, an occupation label including label values such as students, officers, and teachers); the granularity may be location granularity, time granularity, or person granularity.
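The two levels of metadata described above can be pictured as a small record structure. The following is a minimal sketch only, assuming Python dataclasses; the class names `FieldMeta` and `TableMeta`, the attribute names, and the example table are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldMeta:
    """Field-level metadata: category, range, and granularity (hypothetical layout)."""
    name: str
    category: str     # "person", "place", "time", "label", or "other"
    range: str        # region range, time span, person range, or label values
    granularity: str  # location, time, or person granularity


@dataclass
class TableMeta:
    """Table-level metadata plus the field-level metadata of one data table."""
    name: str
    layer: str          # "warehouse", "mart", or "application"
    scenes: List[int]   # business scenarios the table serves
    timeliness: str     # "real-time" or "offline"
    delay_minutes: int  # standard delay (data delay)
    fields: List[FieldMeta] = field(default_factory=list)


# Illustrative table only; the names and values are made up for this sketch.
example = TableMeta(
    name="scenic_spot_daily_flow",
    layer="mart",
    scenes=[1],
    timeliness="offline",
    delay_minutes=24 * 60,
    fields=[
        FieldMeta("district", "place", "one city", "district/county"),
        FieldMeta("stat_date", "time", "2019", "day"),
    ],
)
```

Keeping table-level and field-level attributes in separate records mirrors the two-stage screening described later, where tables are first filtered on table-level metadata and then on field-level metadata.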
Step 106: determining the matching degree between each service requirement and the corresponding target data group;
referring to step 104, a target data set may be screened from the data asset management model according to each of the service requirements, so that a matching degree between each of the service requirements and the corresponding target data set may be determined, and the occurrence frequency of each of the service requirements may be counted.
Step 108: determining, among the plurality of business requirements, those whose occurrence frequency and corresponding matching degree satisfy a specific condition as asset optimization requirements;
It should be noted that, in an embodiment, at least one service requirement whose occurrence frequency and corresponding matching degree satisfy a specific condition may be determined as an asset optimization requirement.
In another embodiment, the service requirements can be classified according to the similarity between them, and the occurrence frequency of each class of service requirements (that is, the occurrence frequency of similar service requirements) can be determined. At least one class of service requirements is then determined, according to the occurrence frequency of similar service requirements and the matching degree between the similar service requirements and the corresponding target data group, as at least one asset optimization requirement; each class of service requirements may include at least one service requirement. Suppose a service requirement item is $D_i$, its corresponding target data group is $G_a$, the matching degree between $D_i$ and $G_a$ is $M_i$, and the occurrence frequency of $D_i$ is $m$. Suppose further that $F_i$ is a service requirement of the same class as $D_i$ and the occurrence frequency of $F_i$ is $n$; $G_a$ may then be taken as the target data group corresponding to $F_i$. If the matching degree between $F_i$ and $G_a$ is $N_i$, the occurrence frequency of this class of service requirements can be determined as $m + n$, and the matching degree between this class of service requirements and the corresponding target data group as $M_i + N_i$. At least one asset optimization requirement can then be determined according to the occurrence frequency of the service requirements and the corresponding matching degree.
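The class-level aggregation in this embodiment can be sketched as follows. This is an illustrative sketch, not the implementation of the application: the function name `aggregate_by_class`, the class identifiers, and the numbers are assumptions, and the only behaviour taken from the text is that frequencies and matching degrees of requirements in the same class are summed.

```python
from collections import defaultdict


def aggregate_by_class(requirements):
    """Sum frequency and matching degree per requirement class.

    `requirements` is an iterable of (class_id, frequency, matching_degree)
    tuples; the tuple layout is hypothetical.
    """
    totals = defaultdict(lambda: [0, 0])
    for class_id, frequency, matching in requirements:
        totals[class_id][0] += frequency  # m + n
        totals[class_id][1] += matching   # M_i + N_i
    return {cid: (freq, match) for cid, (freq, match) in totals.items()}


# D_i (frequency 3, matching degree 4) and F_i (frequency 2, matching degree 5)
# belong to the same class "c1"; "c2" holds one unrelated requirement.
stats = aggregate_by_class([("c1", 3, 4), ("c1", 2, 5), ("c2", 1, 6)])
print(stats["c1"])  # (5, 9)
```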
Step 110: optimizing the data asset management model based on the asset optimization requirements.
In the embodiment of the application, the target data group can be screened out from the data asset management model according to the service requirements, so that a data demander can be helped to quickly find out the data suitable for use; the asset optimization requirement can be determined according to the business requirement, the data asset management model can be dynamically optimized, and the value of the data asset can be improved.
The embodiment of the application also provides an operation method of the data assets, which comprises the following steps:
Step S202: acquiring a plurality of service requirements;
For ease of understanding, three example requirements are listed. Service requirement 1: calculating the daily pedestrian volume and the monthly peak value for each scenic spot in Huqiu District, Suzhou, in every month of 2019. Service requirement 2: tracing the escape route of a certain lawbreaker within China. Service requirement 3: finding the preferred shopping period within one day of female customers in a certain mall in Chaoyang District, Beijing, in July. Each business requirement includes target table-level metadata and target field-level metadata.
Step S204: screening a first candidate data table from the data asset management model according to the target table level metadata of each service requirement;
The target table-level metadata comprises the atomic scene (that is, the target business scenario) to which the business requirement belongs and the timeliness requirement of the business requirement; the atomic scene may also be referred to as a business scenario. The timeliness requirement includes a timeliness category parameter (that is, the target data timeliness, indicating whether the data is real-time or offline) and a target data delay. The table-level metadata in the data asset management model is queried according to the atomic scene of the business requirement and the timeliness category parameter, and all data tables meeting the conditions are screened out as first candidate data tables.
In one embodiment, the first candidate data table may be screened from the data asset management model according to the target hierarchy level, the target business scenario, and the target data timeliness of each service requirement.
The required atomic scenes can be selected according to the business requirement; each business requirement may be composed of one or more atomic scenes. The atomic scenes may include: scene 1: given the area and time range, find people (groups); scene 2: given the people (groups) and time range, query the location/area; scene 3: given the people (groups) and area range, query the time (period topical report); scene 4: look up label attributes of a person (gender, age, occupation, places of daily living); scene 5: look up label attributes of an area (location); scene 6: look up the "person-place relationship" label attribute for a time period. For the three requirements above, the atomic scenes selected are as follows: service requirement 1 corresponds to scene 1; service requirement 2 corresponds to scene 2; service requirement 3 corresponds to a combination of scenes 1 and 4.
Then, according to the business requirement, the timeliness requirement of each requirement on the data is selected. The timeliness requirement covers two aspects: first, whether the data is processed in real time or in offline batches, that is, whether the timeliness category is real-time or offline; second, the maximum allowed data delay. In one embodiment, service requirement 2 requires real-time data to serve instant queries that are in progress, expects the data delay to be as small as possible, and allows a maximum delay of 10 minutes. Business requirement 1 and business requirement 3 mainly compute statistics over historical data and only need to call offline data. Service requirement 3 concerns only one month of data and the data application is one-off; the time taken for the data to reach the calling application can be coordinated by the two parties through communication when the data application is connected and authorized. For business requirement 1, a statistical report of the daily peak traffic of the previous month needs to be generated every month. After a month ends, the report is needed at some day of the following month, which reflects the data delay requirement; assuming that, from a management perspective, the report is expected before the middle of the month, the maximum allowed data delay may be 15 days.
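The first-stage screening on table-level metadata can be sketched as a simple filter. This is a minimal sketch under stated assumptions: the dictionary keys, table names, and the rule that a table qualifies when it serves at least one of the requested atomic scenes are all hypothetical; the text only states that tables are screened by atomic scene and timeliness category.

```python
def screen_first_candidates(tables, scenes, timeliness):
    """Keep tables whose table-level metadata fits the requirement.

    A table is kept when it serves at least one requested atomic scene
    (an assumption of this sketch) and its timeliness category matches.
    """
    wanted = set(scenes)
    return [
        t for t in tables
        if wanted & set(t["scenes"]) and t["timeliness"] == timeliness
    ]


# Hypothetical catalogue of table-level metadata.
tables = [
    {"name": "t_flow_daily", "scenes": [1], "timeliness": "offline"},
    {"name": "t_track_rt", "scenes": [2], "timeliness": "real-time"},
    {"name": "t_person_labels", "scenes": [4], "timeliness": "offline"},
]

# Service requirement 3 combines scenes 1 and 4 and uses offline data.
first_req3 = [t["name"] for t in screen_first_candidates(tables, [1, 4], "offline")]
# Service requirement 2 corresponds to scene 2 and needs real-time data.
first_req2 = [t["name"] for t in screen_first_candidates(tables, [2], "real-time")]
print(first_req3)  # ['t_flow_daily', 't_person_labels']
print(first_req2)  # ['t_track_rt']
```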
Step S206: screening a second candidate data table from the first candidate data table according to the target field level metadata of each service requirement, and screening candidate metadata from the second candidate data table;
The target field-level metadata of a business requirement consists of the requirement parameters of that business requirement, which may include a category (also referred to as a target category), a range (also referred to as a target range), and a granularity (also referred to as a target granularity). The candidate metadata is the metadata in the second candidate data table that is related to the target field-level metadata. For each requirement parameter in the service requirement, the metadata attributes of each field are matched, and the tables that satisfy the requirement parameters are screened out of the first candidate data tables; after the relevant data elements are selected and the irrelevant data elements are deleted, these tables serve as the second candidate data tables. When matching data elements, range-type metadata must satisfy that the required range is less than or equal to the data range, and granularity-type metadata must satisfy that the required granularity is greater than or equal to (that is, at least as coarse as) the data granularity.
In one embodiment, a second candidate data table may be screened from the first candidate data table according to a target category, a target range, and a target granularity of each of the service demands, and candidate metadata may be screened from the second candidate data table; the candidate metadata is metadata in the second candidate data table related to the target category, target range, and target granularity.
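The two matching rules (required range within the data range, required granularity no finer than the data granularity) can be illustrated for a time field. This is a sketch under assumptions: the function names, the dictionary layout, and the representation of a range as a `(start, end)` pair of dates are hypothetical; the ordering of time granularities follows the options listed in the description (year, month, day, hour, 15 minutes).

```python
from datetime import date

# Time granularities ordered from finest to coarsest.
TIME_GRANULARITY = ["15 minutes", "hour", "day", "month", "year"]


def granularity_ok(required, available, order=TIME_GRANULARITY):
    """The required granularity must be at least as coarse as the data granularity."""
    return order.index(required) >= order.index(available)


def range_ok(required, available):
    """The required (start, end) range must lie within the data range."""
    return available[0] <= required[0] and required[1] <= available[1]


def field_matches(required, available):
    """Check category, range, and granularity of one field (hypothetical layout)."""
    return (
        required["category"] == available["category"]
        and range_ok(required["range"], available["range"])
        and granularity_ok(required["granularity"], available["granularity"])
    )


need = {"category": "time", "range": (date(2019, 1, 1), date(2019, 12, 31)),
        "granularity": "day"}
hourly = {"category": "time", "range": (date(2018, 1, 1), date(2020, 12, 31)),
          "granularity": "hour"}
monthly = {"category": "time", "range": (date(2018, 1, 1), date(2020, 12, 31)),
           "granularity": "month"}

print(field_matches(need, hourly))   # True: hourly data can be rolled up to days
print(field_matches(need, monthly))  # False: monthly data cannot be split into days
```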
In the case where the category is a place, the range may be a region range, and the granularity may be a region granularity; the minimum area unit required may be selected as the area granularity, for example: base station/cell, custom area, province, city, district/county; the maximum range of regions that need to be covered may be selected according to the region granularity, for example: national, province, city, district/county, etc.
Where the category is time, the range may be a time range and the granularity may be a time granularity. The minimum time unit required may be selected as the time granularity; the time granularities offered to the user depend on the actual table information in the data warehouse, and the selectable time granularities are: year, month, day, hour, and 15 minutes. The maximum time span (that is, the time range) covered by the requirement may be selected according to the time granularity, for example: January 2020 to September 2020.
In the case that the category is a person, the range may be a person range, and the granularity may be a person granularity. The smallest person unit required may be selected as the person granularity, for example: person, team, or organization; the range of people covered by the requirement may be selected according to the person granularity, for example: all mobile users in Jiangsu province.
In addition, for a certain label requirement (service requirement of the label class), a label class with a matched requirement can be selected from the label library in the data mart layer, or the label class which is not in the label library can be filled in a supplementary mode. The range corresponding to the tag requirement is a tag range, and for each tag category, the corresponding required tag values are listed, for example: for the age group tags, the range of tags involved in the demand is limited to only 60-80 years old. Table 1 is a table of demand parameters for business requirements 1 through 3, and the configurable demand parameters are shown in table 1:
TABLE 1
Requirement parameter | Business requirement 1 | Business requirement 2 | Business requirement 3
Region granularity | District/county | Base station cell | Base station cell
Region range | Huqiu District | Nationwide | A mall in Chaoyang District
Time granularity | Day | Hour | Hour
Time range | Year 2019 | 7 x 24 hours | July, 24 hours a day
Person granularity | Person | Person | Person
Person range | Unlimited | May include a list of key persons | Unlimited
Label categories | None | None | Gender, group
Label range | None | None | Female; mall customers
In summary, according to the target field-level metadata of the service requirement (i.e. the target category, the target range, and the target granularity), a second candidate data table may be screened from the first candidate data table, candidate metadata may be screened from the second candidate data table, and irrelevant metadata may be deleted from the second candidate data table.
Step S208: arranging and combining the second candidate data table to obtain a plurality of candidate data groups;
The second candidate data tables, after the irrelevant data elements have been deleted, can be freely arranged and combined to obtain a plurality of candidate data groups, such that all the table-level metadata and field-level metadata in each candidate data group can satisfy the service requirement; the same second candidate data table may appear in more than one candidate data group.
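One way to picture the grouping step is to enumerate combinations of second candidate data tables and keep those that jointly cover the required fields. This is a sketch only: the application does not specify how the groups are enumerated, so the use of unordered combinations, the coverage test, the function name `candidate_groups`, and the table contents are all assumptions.

```python
from itertools import combinations


def candidate_groups(tables, required_fields):
    """Enumerate table combinations whose fields jointly cover the requirement.

    `tables` maps a table name to the set of field categories it provides
    (hypothetical layout). Order within a group is ignored in this sketch.
    """
    required = set(required_fields)
    groups = []
    for size in range(1, len(tables) + 1):
        for combo in combinations(sorted(tables), size):
            covered = set().union(*(tables[name] for name in combo))
            if required <= covered:
                groups.append(combo)
    return groups


tables = {
    "Table_1": {"place", "time"},
    "Table_2": {"person"},
    "Table_3": {"place", "time", "person"},
}
groups = candidate_groups(tables, ["place", "time", "person"])
print(groups[0])    # ('Table_3',)
print(len(groups))  # 5
```

Note that `Table_3` appears in several of the resulting groups, which matches the remark that the same second candidate data table may appear in more than one candidate data group.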
Step S210: determining a target data set from the plurality of candidate data sets according to the data delay and the data hierarchy of the second candidate data table of each of the plurality of candidate data sets;
wherein the data hierarchy may be one of a data warehouse tier, a data mart tier, and a data application tier.
Step S212: determining the matching degree between each business requirement and the corresponding target data group;
the matching degree may be the number of field-level metadata that is consistent between the target field-level metadata in the service requirement and the field-level metadata in each target data table of the target data group.
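The matching degree defined above is a count, which can be sketched directly. The representation of a field as a `(category, range, granularity)` tuple and the reading of "consistent" as exact equality are assumptions of this sketch, as is the function name `matching_degree`.

```python
def matching_degree(required_fields, group_tables):
    """Count required field-level metadata items that also occur in the group.

    Each field is a (category, range, granularity) tuple; "consistent" is
    taken to mean exact equality, which is an assumption of this sketch.
    """
    available = set()
    for table_fields in group_tables:
        available.update(table_fields)
    return sum(1 for f in set(required_fields) if f in available)


required = [
    ("place", "district A", "district/county"),
    ("time", "2019", "day"),
    ("person", "unlimited", "person"),
]
target_group = [
    [("place", "district A", "district/county"), ("time", "2019", "hour")],
    [("person", "unlimited", "person")],
]
print(matching_degree(required, target_group))  # 2
```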
Step S214: determining, among the plurality of business requirements, those whose occurrence frequency and corresponding matching degree satisfy a specific condition as asset optimization requirements;
Step S216: optimizing the data asset management model based on the asset optimization requirements.
In the embodiment of the application, as the terminal signaling data is from the field of communication operation, the field format of the terminal signaling data has certain professional uniqueness, and the frequency interval for generating the data is rich in practical significance. The data table is screened according to the target table level metadata in the service requirement, and then the data elements are matched according to the target field level metadata, so that the data in the screened target data group is more accurate and meets the data requirement of a user.
The embodiment of the application also provides an operation method of the data assets, which comprises the following steps:
step S302: acquiring a plurality of service requirements;
step S304: screening a first candidate data table from the data asset management model according to the target table level metadata of each service requirement;
step S306: screening a second candidate data table from the first candidate data table according to the target field level metadata of each service requirement, and screening candidate metadata from the second candidate data table;
step S308: the second candidate data table is arranged and combined to obtain a plurality of candidate data groups;
wherein it is assumed that the plurality of candidate data groups includes a candidate data group $G_a$, and $G_a$ includes second candidate data tables $Table_1$ to $Table_5$; then $G_a \in \{Table_1, Table_2, Table_3, Table_4, Table_5\}$.
Step S310: determining a data level corresponding to each second candidate data table in each candidate data group;
step S312: determining a first recommendation index according to the number of second candidate data tables in each data level and the level coefficient of the corresponding data level;
wherein, assume that $I_{a1}$ represents the first recommendation index, $Table_{L_x}$ represents the tables in the candidate data group whose level coefficient is $L_x$, and $\mathrm{Count}(Table_{L_x})$ represents the number of tables in the candidate data group whose level coefficient is $L_x$; the first recommendation index may then be expressed by equation (1):
$$I_{a1} = \sum_{x} L_x \cdot \mathrm{Count}(Table_{L_x}) \quad (1)$$
step S314: determining a second recommendation index according to the data time delay of each second candidate data table in each candidate data group;
wherein, assume that $I_{a2}$ represents the second recommendation index and $Delay_j$ represents the data delay of the $j$-th second candidate data table in the candidate data group; the second recommendation index can then be represented by equation (2):
$$I_{a2} = \sum_{j} Delay_j \quad (2)$$
step S316: determining a target data set from the plurality of candidate data sets according to each of the first recommendation indexes and the corresponding second recommendation index;
wherein, assume that $I_a$ represents the sum of the first recommendation index and the corresponding second recommendation index; then $I_a$ can be expressed by equation (3):
$$I_a = I_{a1} + I_{a2} \quad (3)$$
step S318: determining the matching degree between each business requirement and the corresponding target data group;
Step S320: determining, among the plurality of business requirements, those whose occurrence frequency and corresponding matching degree satisfy a specific condition as asset optimization requirements;
Step S322: optimizing the data asset management model based on the asset optimization requirements.
In the embodiment of the application, the target data group is determined from the candidate data groups according to each first recommendation index and the corresponding second recommendation index, so that the determination of the target data group can be more accurate.
The embodiment of the application also provides an operation method of the data assets, which comprises the following steps:
step S402: acquiring a plurality of service requirements;
step S404: screening a first candidate data table from the data asset management model according to the target table level metadata of each service requirement;
step S406: screening a second candidate data table from the first candidate data table according to the target field level metadata of each service requirement, and screening candidate metadata from the second candidate data table;
step S408: arranging and combining the second candidate data table to obtain a plurality of candidate data groups;
step S410: determining a data level corresponding to each second candidate data table in each candidate data group;
step S412: determining a first recommendation index according to the number of second candidate data tables in each data level and the level coefficient of the corresponding data level;
step S414: determining a second recommendation index according to the data time delay of each second candidate data table in each candidate data group;
step S416: determining the candidate data group with the minimum sum of the first recommendation index and the second recommendation index as a target data group;
Two factors, data delay and data level, may be considered when selecting the target data group from the candidate data groups: the shorter the data delay and the closer the data level is to the upper-layer application, the more strongly the group is recommended. Therefore, referring to formula (3), the candidate data group with the smallest $I_a$ can be determined as the target data group.
Suppose the candidate data group $G_a$ consists of the tables $\{\mathrm{Table}_1, \mathrm{Table}_2, \mathrm{Table}_3, \mathrm{Table}_4, \mathrm{Table}_5\}$, where $\mathrm{Table}_1$ and $\mathrm{Table}_2$ are tables of the data mart layer, $\mathrm{Table}_3$ is a table of the data warehouse layer, $\mathrm{Table}_4$ and $\mathrm{Table}_5$ are tables of the data application layer, and the data delays of $\mathrm{Table}_1$ to $\mathrm{Table}_5$ are $\mathrm{Delay}_1$ to $\mathrm{Delay}_5$ in turn. That is, there is 1 table at the data warehouse layer and there are 2 tables each at the data mart layer and the data application layer. The recommendation index $I_a$ of the candidate data group $G_a$ can then be expressed by formula (4):
$$I_a = L_1 \cdot 1 + L_2 \cdot 2 + L_3 \cdot 2 + \mathrm{Delay}_1 + \mathrm{Delay}_2 + \mathrm{Delay}_3 + \mathrm{Delay}_4 + \mathrm{Delay}_5 \tag{4}$$
When $I_a$ is the minimum, $G_a$ can be selected as the target recommended data group.
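The selection rule of formula (4) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the application: the level coefficients in `LEVEL_COEFF`, the table names and the delay values are hypothetical, and enumerating candidate groups with `itertools.combinations` is just one possible reading of "arranging and combining" in step S408.

```python
from collections import Counter
from itertools import combinations

# Hypothetical level coefficients; a smaller value is assumed to mark a layer
# closer to the upper-layer application (values are not from the application).
LEVEL_COEFF = {"warehouse": 3.0, "mart": 2.0, "application": 1.0}


def recommendation_index(group):
    """I_a = sum(level coefficient * table count per level) + sum(table delays)."""
    counts = Counter(table["level"] for table in group)
    first_index = sum(LEVEL_COEFF[level] * n for level, n in counts.items())
    second_index = sum(table["delay"] for table in group)
    return first_index + second_index


def pick_target_group(tables, group_size):
    """Enumerate candidate groups and return the one with the smallest I_a."""
    return min(combinations(tables, group_size), key=recommendation_index)


# Hypothetical second candidate data tables.
tables = [
    {"name": "t_wh", "level": "warehouse", "delay": 5.0},
    {"name": "t_mart", "level": "mart", "delay": 2.0},
    {"name": "t_app", "level": "application", "delay": 1.0},
]
target = pick_target_group(tables, group_size=2)
print([t["name"] for t in target])  # ['t_mart', 't_app']
```

With these assumed values the three possible pairs score 12, 10 and 6, so the pair made of the data mart table and the data application table is selected.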
Step S418: determining the matching degree between each business requirement and the corresponding target data group;
step S420: determining the service requirement with the highest occurrence frequency and the lowest matching degree in the service requirements as an asset optimization requirement;
The higher the occurrence frequency of a business requirement or of similar business requirements, the more urgent it is to build an intermediate process table that meets them; the lower the matching degree between a business requirement and its target data group, the less the tables in the existing MID (data mart layer) or APP (data application layer) satisfy that requirement. The business requirement with the highest occurrence frequency and the lowest matching degree can therefore be determined as the asset optimization requirement.
In addition, the occurrence frequencies of the plurality of business requirements may all be low, or their matching degrees may all be high. In that case, even the business requirement with the highest occurrence frequency and the lowest matching degree does not urgently need an intermediate process table, and the tables in the existing MID (data mart layer) or APP (data application layer) already meet it well, so selecting it as the asset optimization requirement contributes little to optimizing the data asset management model. Therefore, it may first be determined whether the plurality of business requirements includes at least one business requirement whose occurrence frequency is greater than a frequency threshold or whose matching degree is smaller than a matching degree threshold; if so, the at least one business requirement is determined as a target business requirement, and the target business requirement with the highest occurrence frequency and the lowest matching degree is determined as the asset optimization requirement.
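The threshold check and the selection described for step S420 might look like the following Python sketch. The field names `frequency` and `matching`, the sample requirements and the threshold values are hypothetical; the tie-breaking rule (frequency first, then matching degree) is an assumption, since the application does not state how the two criteria are combined.

```python
def pick_optimization_requirement(requirements, freq_threshold, match_threshold):
    """Return the most urgent requirement, or None when none is urgent enough."""
    # Keep requirements that are frequent enough OR poorly matched.
    targets = [
        r for r in requirements
        if r["frequency"] > freq_threshold or r["matching"] < match_threshold
    ]
    if not targets:
        return None
    # Assumption: rank by highest frequency first, then by lowest matching degree.
    return max(targets, key=lambda r: (r["frequency"], -r["matching"]))


# Hypothetical business requirements with their statistics.
requirements = [
    {"id": "hourly_flow", "frequency": 12, "matching": 0.9},
    {"id": "youth_flow", "frequency": 12, "matching": 0.4},
    {"id": "rare_report", "frequency": 1, "matching": 0.95},
]
chosen = pick_optimization_requirement(requirements, freq_threshold=10, match_threshold=0.5)
print(chosen["id"])  # youth_flow
```

Returning `None` when no requirement passes either threshold mirrors the case in which optimizing the model would contribute little.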
Step S422: establishing a corresponding first data table based on the asset optimization requirement;
wherein the first data table may be established according to target table level metadata and target field level metadata in the asset optimization requirements.
Step S424: adding a first data table to the data asset management model.
In the embodiment of the application, two factors, namely data delay and data hierarchy, are considered, and the candidate data group which is short in delay and close to the upper application in the data hierarchy is preferentially recommended, so that on one hand, non-detailed data (namely data of a data mart layer or a data application layer) can be recommended to a user as far as possible, the user can be prevented from accessing sensitive detailed data of a data warehouse layer, and data safety is guaranteed; on the other hand, the reading efficiency of the data can be improved; in addition, the business requirement with the highest frequency of occurrence and the lowest matching degree is determined as the asset optimization requirement, and the first data table is established and added into the data asset management model based on the asset optimization requirement, so that the matching degree between the business requirement with urgent requirements and the data in the data asset management model can be improved.
The embodiment of the application also provides an operation method of the data assets, which comprises the following steps:
step S502: acquiring a plurality of service requirements;
step S504: screening a first candidate data table from the data asset management model according to the target table level metadata of each service requirement;
step S506: screening a second candidate data table from the first candidate data table according to the target field level metadata of each service requirement, and screening candidate metadata from the second candidate data table;
step S508: the second candidate data table is arranged and combined to obtain a plurality of candidate data groups;
step S510: determining a data level corresponding to each second candidate data table in each candidate data set;
step S512: determining a first recommendation index according to the number of second candidate data tables in each data level and the level coefficient of the corresponding data level;
step S514: determining a second recommendation index according to the data time delay of each second candidate data table in each candidate data group;
step S516: sorting the plurality of candidate data sets according to each first recommendation index and the corresponding second recommendation index;
wherein the plurality of candidate data groups may be sorted according to the sum of the first recommendation index and the second recommendation index, for example in ascending order of $I_a$.
Step S518: determining a plurality of first data groups from the plurality of candidate data groups according to the sorting result;
wherein the several candidate data groups whose $I_a$ ranks at the top may be selected from the plurality of candidate data groups as the first data groups.
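Steps S516 and S518 amount to a top-k selection on the recommendation index. A minimal Python sketch, assuming each candidate data group has already been paired with its index value (the group names and values below are hypothetical):

```python
import heapq


def top_candidate_groups(indexed_groups, k):
    """Return the k groups with the smallest recommendation index, in ascending order."""
    return heapq.nsmallest(k, indexed_groups, key=lambda item: item[1])


# Hypothetical (group name, recommendation index) pairs.
indexed_groups = [("G1", 12.0), ("G2", 6.0), ("G3", 10.0), ("G4", 7.5)]
first_groups = top_candidate_groups(indexed_groups, k=2)
print(first_groups)  # [('G2', 6.0), ('G4', 7.5)]
```

The returned groups would then be the first data groups offered to the user in step S520.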
Step S520: in response to the received selection instruction, determining a target data group from the plurality of first data groups;
wherein the first data groups may be presented to the user for selection; the user selects a target data group from the first data groups, and upon receiving the selection operation the processor generates a selection instruction to determine the target data group from the first data groups.
Step S522: determining the matching degree between each business requirement and the corresponding target data group;
step S524: determining the priority of the corresponding service requirement according to the occurrence frequency of each service requirement in a plurality of service requirements and the corresponding matching degree;
the higher the occurrence frequency and the lower the corresponding matching degree, the higher the priority of the corresponding service requirement may be.
Step S526: determining the service requirement with the priority meeting the specific condition as an asset optimization requirement;
the business requirements can be sequenced from high to low according to the priority, and a plurality of business requirements with the top priority are determined as asset optimization requirements; in addition, the service demands can be sorted according to the sequence of the priority levels from high to low, several service demands with the top priority levels are determined as target service demands, and the target service demands are determined as asset optimization demands under the condition that the occurrence frequency of the target service demands is greater than a frequency threshold value or the matching degree is smaller than a matching degree threshold value.
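The two variants of step S526 can be sketched in Python as below. The application does not give a formula for the priority, so the scoring rule `frequency * (1 - matching)` is purely an assumption that grows with occurrence frequency and shrinks with matching degree (taken to lie in [0, 1]); the sample data and thresholds are also hypothetical.

```python
def priority(requirement):
    # Assumed scoring rule, not taken from the application.
    return requirement["frequency"] * (1.0 - requirement["matching"])


def select_optimization_requirements(requirements, top_n,
                                     freq_threshold=None, match_threshold=None):
    """Rank by priority, keep the top_n, and optionally apply the threshold check."""
    ranked = sorted(requirements, key=priority, reverse=True)[:top_n]
    if freq_threshold is None or match_threshold is None:
        return ranked  # first variant: top-ranked requirements only
    # Second variant: additionally require a high frequency or a low matching degree.
    return [r for r in ranked
            if r["frequency"] > freq_threshold or r["matching"] < match_threshold]


# Hypothetical business requirements.
reqs = [
    {"id": "a", "frequency": 10, "matching": 0.2},
    {"id": "b", "frequency": 20, "matching": 0.9},
    {"id": "c", "frequency": 5, "matching": 0.5},
]
print([r["id"] for r in select_optimization_requirements(reqs, top_n=2)])  # ['a', 'c']
```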
Step S528: updating first table level metadata and first field level metadata of a corresponding target data table in the data asset management model based on the target table level metadata and the target field level metadata of each of the at least one asset optimization requirement.
In the embodiment of the application, the candidate data sets can be ranked according to the first recommendation index and the second recommendation index, and the candidate data sets ranked in the front are provided for the user to select, so that the autonomy and the flexibility of the user for selecting the target data set can be improved; the priority of the business requirements is determined according to the occurrence frequency and the matching degree of the business requirements, the multiple asset optimization requirements are determined according to the priority sequence, and the data asset management model is updated according to the metadata in the multiple asset optimization requirements, so that the diversity and the flexibility of the determination of the asset optimization requirements are improved, and the flexibility of the updating mode of the data asset management model is also improved.
Fig. 2 is a schematic flowchart of a method for establishing a data asset management model according to an embodiment of the present application, and as shown in fig. 2, the method includes:
step 202: respectively establishing data tables stored in a data warehouse layer, a data mart layer and a data application layer according to the three elements of the terminal signaling data; the three elements comprise user identification information, user position information and time points;
the data warehouse layer is used for storing detailed data in metadata of the preprocessed terminal signaling data; the detailed data matches the user position information with the user identification information and the time point, or matches the user identification information with the user position information and the time point; the data mart layer is used for storing tag data in the metadata of the terminal signaling data processed by the data warehouse layer; the tag data is divided into user identification information, user position information and time points according to themes; the data application layer is used for storing a commonly used intermediate calculation process table in the metadata of the terminal signaling data processed by the data warehouse layer or the data mart layer; the data table may be used to characterize logical relationships between data, and the data table may be a logical table.
The user identification information may also be referred to as a person or a group of people, and may be various user IDs (Identity identifiers), the user location information may also be referred to as a location, and may be various region codes, such as codes of provinces, cities, cells, and the like, and the time point may also be referred to as time, and may also be a time period, such as a holiday code (e.g., a code corresponding to a spring festival), a report type code, and the like.
Step 204: establishing table-level metadata and field-level metadata corresponding to each data table according to the three elements of the terminal signaling data;
step 206: establishing the data asset management model comprising the data warehouse layer, the data mart layer, and the data application layer.
In the embodiment of the application, the data asset management model is established from data tables organized by layer and by category, so the metadata of the terminal signaling data can be managed and maintained more efficiently; and because the tables are logical tables that represent the logical relationships among the data, data demanders can understand the data quickly.
The embodiment of the application also provides a method for establishing a data asset management model, which comprises the following steps:
step S602: in the data warehouse layer, establishing a plurality of basic real-time data tables according to the user identification information and the user position information of the terminal signaling data; establishing an offline fact data table according to the user identification information, the user position information and the time point of the terminal signaling data; establishing other data tables related to the basic real-time data table and the offline fact data table according to the basic real-time data table and the offline fact data table;
step S604: in a data mart layer, summarizing data in a plurality of basic real-time data tables, the offline fact data tables and other data tables in the data warehouse layer, and establishing a summarized label data table comprising a plurality of label data;
step S606: in a data application layer, processing data in a plurality of basic real-time data tables, the offline fact data tables and other data tables in the data warehouse layer and/or summary label data tables in the data mart layer, and establishing a summary granularity data table containing a plurality of granularity data;
step S608: establishing table-level metadata and field-level metadata corresponding to each data table according to the three elements of the terminal signaling data;
the table-level metadata comprises a belonged level, a service scene, data timeliness and data time delay; the field level metadata includes a category, a range, and a granularity.
Step S610: establishing the data asset management model comprising the data warehouse layer, the data mart layer, and the data application layer.
In the embodiment of the application, the summary granularity data table is placed into the data application layer, the summary label data table is placed into the data mart layer, and the basic real-time data table and the offline fact data table are placed into the data warehouse layer, so that data demanders can be helped to more conveniently call intermediate layer data from the data application layer, repeated development cost is reduced, and data processing time is shortened.
The embodiment of the application also provides a method for establishing a data asset management model, which comprises the following steps:
step S702: selecting at least one service scene from pre-established service scenes as the service scene of each data table according to the retrieval relationship among the user identification information, the user position information and the time point of the terminal signaling data;
The retrieval relationship may be that user location information is retrieved according to user identification information and a time point, or that user identification information is retrieved according to user location information and a time point. Data tables may be modeled in the data warehouse layer, the data mart layer and the data application layer respectively; before the data tables are modeled, the business scenarios may be modeled, in which six possible atomic data application scenarios (also referred to as business scenarios) are enumerated according to the three elements "human-ground-time", namely scenarios 1 to 6 in step S204.
Step S704: respectively establishing data tables stored in a data warehouse layer, a data mart layer and a data application layer according to the three elements of the terminal signaling data;
FIG. 3 is a schematic diagram of data table modeling of a data asset management model according to an embodiment of the present application. Referring to FIG. 3, in data table modeling, considering the fast-retrieval requirements of the common scenarios [person + time: place] and [place + time: person], the data warehouse layer 301 generates two basic real-time tables, retrieved by user and by location respectively: a user trajectory sequence table 3011 and a base station (cell) slice table 3012; the data mart layer 302 mainly stores various lightly summarized tag tables keyed by person or place; the data application layer 303 stores summary tables of various granularities that most applications are likely to call, so that applications can call them directly.
Step S706: establishing a belonged level of each data table according to a belonged level of user identification information, a belonged level of user position information and a belonged level of a time point of the terminal signaling data; the affiliated level is one of the data warehouse level, the data mart level, and the data application level;
step S708: selecting at least one service scene from the service scenes as the service scene of each data table according to the retrieval relation among the user identification information, the user position information and the time point of the terminal signaling data;
the service scenario may be one scenario or a combination of scenarios 1 to 6.
Step S710: establishing data timeliness of each data table according to the category of the time point of the terminal signaling data; the data timeliness is either offline table or real-time table;
wherein the category of the time point includes real-time and offline, indicating whether the data is processed in real time with results output immediately or processed offline with results output in batches; the data delay (Delay) may record the average processing time (also referred to as the average processing duration) from the raw data to the current table.
Step S712: establishing data time delay of each data table according to the maximum value of the time point of the terminal signaling data; the data time delay is the average processing time from the original data to the corresponding data table; the maximum value of the time point may represent a maximum value allowed by the data delay.
Step S714: establishing a person type, a place type, a time type and a label attribute of each data table according to the type of the user identification information of the terminal signaling data, the type of the user position information and the type of the time point;
the person category of the data table may be determined according to the category to which the user identification information belongs, the location category of the data table may be determined according to the category to which the user location information belongs, and the time category of the data table may be determined according to the category to which the time point belongs.
Step S716: establishing a person range, a place range, a time range and a label value of each data table according to the affiliated range of the user identification information, the affiliated range of the user position information and the affiliated range of the time point of the terminal signaling data;
The affiliated range of the user identification information may be unlimited or a list of key personnel; the affiliated range of the user location information may be a certain city in the country, area XX, or area YY; and the affiliated range of the time point may be the whole year of 2019, 7 days, and the like.
Step S718: and establishing the figure granularity, the place granularity and the time granularity of each data table according to the minimum unit of the user identification information, the minimum unit of the user position information and the minimum unit of the time point of the terminal signaling data.
The person granularity may be a person, the location granularity may be a base station cell, a district/county, etc., and the time granularity may be a day, an hour, etc.
It should be noted that each field of each table corresponds to one data element, and the permission granularity for opening data can be controlled down to the field level, so the metadata information of each field is very important to data users. Given the business features of mobile location data assets, the business metadata (also referred to as field-level metadata) specified for a field may include: category, range, and granularity. The category may be a three-element keyword (for example, person, place, or time), a tag attribute, other, and so on; the range may be a region range, a time span (also referred to as a time range), a person range (divided by number attribution or by some type of attribute), or a tag value (for example, an occupation tag with tag values such as student, officer, and teacher); the granularity may be location granularity, time granularity, or person granularity. Taking as an example a location signaling zipper table covering all male users within Jiangsu province for the whole month of June 2020, the configurable field-level business metadata is shown in Table 2:
TABLE 2
[Table 2 is provided as images in the original publication: Figure BDA0003172766430000141 and Figure BDA0003172766430000151]
Referring to Table 2: imsi (International Mobile Subscriber Identity) is the SIM (Subscriber Identity Module) card number; MSISDN is the number that a calling user dials to call a mobile user in a GSM (Global System for Mobile Communications) PLMN (Public Land Mobile Network); it plays the same role as a fixed-network PSTN (Public Switched Telephone Network) number and is the number that uniquely identifies the mobile user in the public switched telephone network numbering plan, that is, the mobile phone number of the called user; From_count is the country code; From_Source is the place of registration of the mobile phone number, given by the first 7 digits of a valid mobile phone number; Longitude denotes the longitude; Latitude denotes the latitude; TAC (Tracking Area Code) is the TAC of the current cell; Cell ID (Cell Identity) is the ECI (E-UTRAN Cell Identity, the unique E-UTRAN cell identifier) of the current cell, and the user's location is determined by identifying which cell in the network carries the user's call and translating this information into latitude and longitude; the geohash code encodes the longitude and latitude with the geohash algorithm, reducing two dimensions to one and partitioning locations; provinceCode denotes the province code; City denotes the city of the base station; Timestamp denotes the time at which the signaling was generated; Day denotes the day; PROCEDURE TYPE/Event ID (Event Identity) denotes the procedure type or event code; Gen denotes signaling traceability.
Step S720: establishing the data asset management model comprising the data warehouse layer, the data mart layer, and the data application layer.
In the embodiment of the application, table-level metadata and field-level metadata corresponding to each data table are established according to the three elements of the terminal signaling data, so that the metadata of the terminal signaling data can be managed and maintained more efficiently.
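The table-level and field-level metadata described above could be represented, for instance, by the following Python data classes. This is a sketch only: the class and attribute names are hypothetical, and the sample values (including the delay and the per-field settings) are illustrative and not taken from Table 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldMetadata:
    category: str      # e.g. "person", "place", "time", "tag attribute", "other"
    value_range: str   # e.g. a region range, time span, person range or tag value
    granularity: str   # e.g. "person", "base station cell", "hour"


@dataclass
class TableMetadata:
    level: str            # "warehouse", "mart" or "application"
    scenarios: List[int]  # atomic business scenarios (1 to 6) the table serves
    timeliness: str       # "offline" or "real-time"
    delay_seconds: float  # average processing time from raw data to this table
    fields: Dict[str, FieldMetadata] = field(default_factory=dict)


# Hypothetical metadata for a location signaling zipper table.
zipper_meta = TableMetadata(
    level="warehouse",
    scenarios=[1, 2, 3],
    timeliness="offline",
    delay_seconds=3600.0,
    fields={
        "imsi": FieldMetadata("person", "male users", "person"),
        "city": FieldMetadata("place", "Jiangsu province", "city"),
        "day": FieldMetadata("time", "June 2020", "day"),
    },
)
print(zipper_meta.fields["city"].value_range)  # Jiangsu province
```

Keeping the field-level entries inside the table-level record is one way to support screening first by table and then by field, as in the steps that filter candidate data tables.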
In the related art, although some methods for managing mobile location data have been proposed, there are technical disadvantages as follows:
first, the above approach is limited to functional optimization at the database level, not data management at the data asset level.
The methods are designs for time-series and spatio-temporal databases; they optimize functions at the database level and focus on one application requirement or one class of requirements, rather than on how a data asset manager can maximize the value of data assets used across multiple application scenarios.
Second, the difficulty that data users have in understanding the data, which stems from the particular nature of mobile location data, is not addressed.
Since the mobile location data comes from the communication operation field, the field format has a certain professional uniqueness, and the frequency interval of the data generation is rich in practical significance. The method focuses on the storage optimization of general spatio-temporal data at a physical level, but lacks the knowledge display of data connotation at a logical level, and cannot effectively help data users to understand the data and rapidly develop and use the data.
Moreover, the above method cannot solve the problem of designing and using a large amount of intermediate layer data in data asset management.
The methods do not ease the data asset manager's difficulty in designing hierarchical data management: they cannot help the manager decide which intermediate process data should exist only as logical tables and which should be persisted to disk, nor can they help data users choose the layer of data that meets the business requirement efficiently while minimizing data delay.
Finally, the above method cannot guide dynamic adjustment of data asset management policies according to data usage requirements.
The method cannot help a data asset manager to dynamically and scientifically adjust the data management strategy according to the requirement of upper-layer application on data use, and the data value is continuously improved.
To solve the above problems, an embodiment of the present application provides a mobile location data asset operation method based on a dynamic feedback mechanism. The method establishes a data asset management model characterized by mobile location data, together with a model for collecting, evaluating and feeding back data application requirements for optimization (also called the requirement evaluation and data recommendation model), and focuses on management aimed at maximizing the application value of data assets. The method is built around the three key features "human-ground-time" of mobile location signaling data and emphasizes a virtuous "construction-service-feedback" cycle: the data asset management model is first constructed on the basic features of location signaling; the asset operation process emphasizes service guided by business requirements; and the model is optimized through continuous evaluation of, and feedback on, the matching degree between new business requirements and asset data. In this way the goal of maximizing asset value is approached step by step while more data users are served, benefiting both society and the asset manager.
The embodiment of the application provides a mobile location data asset operation method based on a dynamic feedback mechanism. The dynamic feedback mechanism is a mechanism of interaction between the data asset management model and data application requirements (also called business requirements). Under this mechanism, the data asset management model can be continuously optimized in response to the growing data application requirements, reducing management cost and increasing the utilization value of the data; data demanders can also quickly understand the data from the characteristics of mobile location data and use the data recommendation service in a self-service manner by following the guided requirement collection process.
In the embodiment of the application, a mobile position data scene is oriented, and an effective data asset management model is built, so that the data requirement is automatically evaluated, and an optimal data sharing scheme is recommended; through the collection of numerous data requirements, an asset management model is continuously optimized, and a virtuous circle operation mechanism of 'construction-service-feedback' is formed.
FIG. 4 is a flowchart illustrating a method for managing a round-robin operation of data assets according to an embodiment of the present application; referring to fig. 4, the loop operation mechanism includes steps 401 to 403, which are closely related by "person-ground-time" service metadata according to the mobile location data feature, and fig. 5 is a detailed flowchart of the loop operation management method shown in fig. 4.
Step 401: building a data asset management model of the mobile position data;
wherein, referring to fig. 5, the step 401 may include steps 5011 to 5013:
step 5011: the data assets are hierarchically structured.
According to the three elements "human-ground-time" of mobile location data, a three-layer data asset management model is constructed, comprising a data warehouse layer, a data mart layer and a data application layer. The data warehouse layer stores the preprocessed, organized detailed mobile location data, and its main components are two basic data structures, [person + time: place] and [place + time: person]. [person + time: place] indicates that a place can be obtained with the person and time as primary keys, that is, the place is matched by person and time; [place + time: person] indicates that a person can be obtained with the place and time as primary keys, that is, the person is matched by place and time. The data of the data mart layer is generated by processing the data of the data warehouse layer; it is summarized tag data that users can call directly and is divided into three themes: person, place and time. The data of the data application layer is generated by processing the data of the data warehouse layer and/or the data mart layer, and the data application layer stores intermediate calculation process tables for commonly used calling requirements.
For the data warehouse layer, enumeration shows that any two of the three elements of mobile location data can be chosen as primary keys to obtain the third. However, [person + place: time], in which the time is matched by person and place, is far inferior to the other two models in retrieval and data compressibility and has few application scenarios, so it is not suitable as basic data and is not recommended for the data warehouse layer. In summary, all basic data of the form [person + time: place] or [place + time: person] can be placed in the data warehouse layer regardless of which database is used for storage; the database may be Hive, HBase, Redis (Remote Dictionary Server), or the like.
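The two basic structures of the data warehouse layer can be pictured as two key-value indexes built from the same records. The Python sketch below uses in-memory dictionaries purely for illustration; the record layout, identifiers and time-slot format are hypothetical, and a real deployment would use a store such as those named above.

```python
from collections import defaultdict


def build_indexes(records):
    """Build both basic structures from (person, time_slot, place) records."""
    by_person_time = {}                # [person + time: place]
    by_place_time = defaultdict(set)   # [place + time: person]
    for person, time_slot, place in records:
        by_person_time[(person, time_slot)] = place
        by_place_time[(place, time_slot)].add(person)
    return by_person_time, by_place_time


# Hypothetical preprocessed signaling records.
records = [
    ("u1", "2020-06-01T08", "cellA"),
    ("u2", "2020-06-01T08", "cellA"),
    ("u1", "2020-06-01T09", "cellB"),
]
by_person_time, by_place_time = build_indexes(records)
print(by_person_time[("u1", "2020-06-01T09")])            # cellB
print(sorted(by_place_time[("cellA", "2020-06-01T08")]))  # ['u1', 'u2']
```

The first index answers "where was this person at this time", the second "who was at this place at this time", which correspond to the two retrieval directions kept in the data warehouse layer.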
For the data mart layer, the person-themed mart data may be [person number: person attribute], where the "person number" is a key field, such as various user IDs, and the person attributes may be a number of tag fields, such as gender, age, occupation, frequent location, and so on; the mart data themed on location/area may be [location/area number: location/area attribute], where the "location/area number" is a key field, such as various types of area codes, and the location/area attributes may be a number of tag fields, such as comfort index, old activity area, road condition index, and so on; the time-themed mart data may be [time period: time period attribute] for various report scenarios, where the "time period" is a key field, such as a holiday code (e.g. a code corresponding to the Spring Festival) or a report type code, and the time period attributes may be a number of tag fields, such as a population migration index, demographics, and so on.
The data application layer is enriched progressively as the data assets are opened up; it holds the temporary intermediate process tables that are retained to serve the calls of certain data applications. As data assets are used more, the processing logic of many requirements tends to converge, and some requirements can even use an existing intermediate process table directly without calling the basic data, which makes development and calling much cheaper. For this reason, some of the temporary intermediate process tables may also be opened in the data directory for applications to use and are assigned to the data application layer.
Step 5012: and constructing a data model framework.
The construction of the data model framework may comprise business scenario modeling and data table modeling. Business scenario modeling enumerates six possible atomic data application scenarios (also referred to as business scenarios) according to the three elements "human-ground-time". In data table modeling, considering the fast-retrieval requirements of the common scenarios [person + time: place] and [place + time: person], the data warehouse layer generates two basic real-time tables, retrieved by user and by location respectively: a user trajectory sequence table and a base station (cell) slice table; the data mart layer mainly stores various lightly summarized tag tables keyed by person or place; the data application layer stores summary tables of various granularities that most applications are likely to call, so that applications can call them directly.
For business scenario modeling, the business scenarios may include: Scene 1: given an area and a time range, query the people (crowd); Scene 2: given the people (crowd) and a time range, query the location/area; Scene 3: given the people (crowd) and an area range, query the time (periodic thematic report); Scene 4: query tag attributes of a person (gender, age, occupation, habitual residence); Scene 5: query tag attributes of an area (location); Scene 6: query the "people-ground relationship" tag attributes of a time period.
Any requirement can be satisfied by one atomic scene or by a combination of several. For example, scene 1 suffices to obtain the hourly foot traffic of a certain business circle; obtaining the hourly number of young people in that business circle requires combining scene 1 and scene 4.
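As a minimal sketch of how the six atomic scenes and their combinations might be represented, the following Python fragment maps the two example requirements above to scene sets. The enum member names, the requirement keys and the helper `scenes_for` are hypothetical and are introduced only for illustration; they are not part of the described method.

```python
from enum import IntEnum


class AtomicScene(IntEnum):
    """The six atomic scenes; member names are illustrative assumptions."""
    PEOPLE_BY_AREA_AND_TIME = 1      # scene 1
    LOCATION_BY_PEOPLE_AND_TIME = 2  # scene 2
    TIME_BY_PEOPLE_AND_AREA = 3      # scene 3
    PERSON_TAGS = 4                  # scene 4
    AREA_TAGS = 5                    # scene 5
    PERIOD_TAGS = 6                  # scene 6


# Hypothetical requirement names mapped to the scenes they combine.
REQUIREMENT_SCENES = {
    "hourly_foot_traffic_of_business_circle": {
        AtomicScene.PEOPLE_BY_AREA_AND_TIME,
    },
    "hourly_young_people_in_business_circle": {
        AtomicScene.PEOPLE_BY_AREA_AND_TIME,
        AtomicScene.PERSON_TAGS,
    },
}


def scenes_for(requirement: str) -> list:
    """Return the atomic scenes of a requirement in ascending scene number."""
    return sorted(REQUIREMENT_SCENES[requirement])


if __name__ == "__main__":
    print(scenes_for("hourly_young_people_in_business_circle"))
```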
For data table modeling, following the data hierarchy design, the data warehouse layer includes real-time fact tables, offline fact tables, and related dimension tables. To meet the real-time requirement for efficient retrieval over the three elements, two basic real-time tables are designed, retrieved by user and by location respectively: the user trajectory sequence table and the base station (cell) slicing table, which satisfy the fast retrieval requirements of the [person + time: location] and [location + time: person] scenarios. The offline fact table supports retrieval over all of [person + time + location] and meets the requirements of scenes 1, 2 and 3. Because the time windows used for preprocessing the original data (such as deduplication of repeated data and handling of base station ping-pong handover) differ in size, the offline fact table occupies less storage space than the real-time fact table and can retain data for a longer time. As shown in fig. 3, the offline fact table may be the position zipper table 3013; the related dimension tables may be the base station parameter dimension table 3014 and the custom area base station dimension table 315 associated with the real-time fact table and the offline fact table.
The data mart layer mainly stores various lightly summarized tag tables keyed by person or place. The data mart layer is derived from the basic data of the data warehouse layer through model computation or association with other types of data sources. Each lightly summarized tag table may involve one or more attribute tags. The data mart layer is therefore also a mart that aggregates various types of tag data.
The data application layer stores various granularities of summary tables that most applications may call to facilitate direct application calls, such as summarization from a regional level or summarization from a temporal granularity.
Step 5013: constructing a metadata specification.
The metadata specification includes a table-level metadata model and a field-level metadata model. For the data tables in the data model framework, the metadata items of each data table are specified and configured, and the business metadata of the data element fields is specified.
It should be noted that the table-level metadata in the specification may include: the hierarchy to which the table belongs, the business scenario, the data timeliness, and the standard delay.
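For illustration only, the table-level metadata items above, together with the field-level items named later in this description (category, range, granularity), could be held in simple record types such as the following. All class names, attribute names, the delay unit (minutes) and the sample values are assumptions made for this sketch and are not prescribed by the specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Layer(Enum):
    PDW = "data warehouse layer"
    MID = "data mart layer"
    APP = "data application layer"


class Timeliness(Enum):
    REAL_TIME = "real-time table"
    OFFLINE = "offline table"


@dataclass
class FieldMeta:
    """Field-level metadata: category, range and granularity."""
    name: str
    category: str     # "person", "place", "time", "tag" or "other"
    scope: str        # range, e.g. crowd range, area range, time span
    granularity: str  # e.g. "user", "cell", "second"


@dataclass
class TableMeta:
    """Table-level metadata: hierarchy, scenes, timeliness, standard delay."""
    name: str
    layer: Layer
    scenes: List[int]            # atomic scenes 1..6 served by the table
    timeliness: Timeliness
    standard_delay_min: float    # assumed unit: minutes
    fields: List[FieldMeta] = field(default_factory=list)


# Hypothetical entry for the user trajectory sequence table
# ([person + time: location], i.e. atomic scene 2).
track = TableMeta(
    name="user_trajectory_sequence",
    layer=Layer.PDW,
    scenes=[2],
    timeliness=Timeliness.REAL_TIME,
    standard_delay_min=5.0,
    fields=[
        FieldMeta("user_id", "person", "all users", "user"),
        FieldMeta("event_time", "time", "last 7 days", "second"),
        FieldMeta("cell_id", "place", "whole city", "cell"),
    ],
)
```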
Step 402: collecting and evaluating data requirements;
To realize the vision of maximizing data services, the data manager needs to collect the various data requirements accurately. Based on the established data asset management model, data requirements are collected first, and recommended data are given after the requirements are evaluated.
Step 402 may include the following steps 5021 to 5024:
Step 5021: selecting an atomic scene;
the atomic scene to which the business requirement belongs can be collected;
Step 5022: determining the data timeliness;
the timeliness requirement of the business requirement can be collected;
Step 5023: configuring requirement parameters;
the requirement parameters of the business requirement can be collected;
Step 5024: evaluating the requirement and recommending data;
an evaluation and recommendation model takes the data requirement as input, automatically generates candidate data groups through data table screening and data element matching, and finally gives the recommended candidate open data (also called the target data group) by combining and ranking the candidates with respect to two factors: data delay and data hierarchy.
Step 5024 may include steps 50241 to 50244:
Step 50241: screening data tables;
Step 50242: matching data elements;
Step 50243: generating candidate data groups;
In step 50243, the screened tables can be freely combined to generate candidate data groups, such that all the tables and fields in each candidate data group together meet the target requirement. One table may appear in several candidate data groups. All possible candidate data groups are enumerated and denoted $G_1, \dots, G_n$, where $n$ is the number of candidate data groups. For $a \in [1, n]$, each candidate data group can be represented as $G_a \subseteq \{Table_1, Table_2, Table_3, Table_4, Table_5\}$. Any candidate data group can thus be provided to the data user to meet the target requirement; which one to recommend still has to be decided.
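Steps 50241 to 50243 can be sketched as follows. This is a simplified illustration under stated assumptions: the catalogue contents are invented, table screening is reduced to a filter on atomic scene and timeliness, and a combination is accepted as a candidate data group when the fields of its tables jointly cover the required fields. The coverage criterion and the function names `screen_tables` and `candidate_groups` are assumptions of this sketch, not the claimed matching procedure.

```python
from itertools import combinations

# Hypothetical catalogue:
# table name -> (atomic scenes served, timeliness, fields offered)
CATALOGUE = {
    "Table_1": ({1, 4}, "offline", {"person", "time"}),
    "Table_2": ({1}, "offline", {"place"}),
    "Table_3": ({1}, "offline", {"person", "place", "time"}),
    "Table_4": ({1}, "real-time", {"person", "place", "time"}),
}


def screen_tables(scene, timeliness):
    """Step 50241 (simplified): keep tables matching scene and timeliness."""
    return sorted(
        name
        for name, (scenes, table_timeliness, _) in CATALOGUE.items()
        if scene in scenes and table_timeliness == timeliness
    )


def candidate_groups(tables, required_fields):
    """Steps 50242-50243 (simplified): enumerate every combination of the
    screened tables whose fields together cover the required fields."""
    groups = []
    for size in range(1, len(tables) + 1):
        for combo in combinations(tables, size):
            covered = set().union(*(CATALOGUE[t][2] for t in combo))
            if required_fields <= covered:
                groups.append(combo)
    return groups


screened = screen_tables(scene=1, timeliness="offline")
groups = candidate_groups(screened, {"person", "place", "time"})
```

Note that the same table (for example `Table_3`) appears in several of the resulting groups, which is consistent with the description above.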
Step 50244: selecting a target recommended data group;
Two factors, data delay and data hierarchy, can be considered when selecting the target recommended data group from the candidate data groups: the shorter the data delay and the closer the data hierarchy of the data is to the upper-layer application, the higher the recommendation priority. Accordingly, level coefficients $L_1$, $L_2$ and $L_3$ are configured for the three data hierarchies PDW (data warehouse layer), MID (data mart layer) and APP (data application layer), respectively. For $G_a$, its recommendation index $I_a$ is the sum of the level coefficients and the data delays of all tables in the candidate data group, and can be expressed by equation (3):
$$I_a = \sum_{x=1}^{3} L_x \cdot N_x + \sum_{j=1}^{m} Delay_j \qquad (3)$$
where $N_x$ denotes the number of tables in the candidate data group whose level coefficient is $L_x$; $\sum_{x=1}^{3} L_x \cdot N_x$ is the cumulative sum of the products obtained by multiplying the number of tables at each of the three data hierarchies in the candidate data group by the level coefficient of the corresponding hierarchy; $Delay_j$ is the data delay of the $j$-th table and $m$ is the number of tables in the candidate data group, so that $\sum_{j=1}^{m} Delay_j$ is the cumulative sum of the data delays of all tables in the candidate data group.
Suppose the candidate data group is $G_a = \{Table_1, Table_2, Table_3, Table_4, Table_5\}$, where $Table_1$ and $Table_2$ are tables of the data mart layer, $Table_3$ is a table of the data warehouse layer, $Table_4$ and $Table_5$ are tables of the data application layer, and the data delays of $Table_1$ to $Table_5$ are $Delay_1$ to $Delay_5$ in turn. That is, the number of tables at the data warehouse layer is 1, and the number of tables at each of the data mart layer and the data application layer is 2. The recommendation index $I_a$ of the candidate data group $G_a$ can then be expressed by equation (4):
$$I_a = L_1 \cdot 1 + L_2 \cdot 2 + L_3 \cdot 2 + Delay_1 + Delay_2 + Delay_3 + Delay_4 + Delay_5 \qquad (4)$$
The candidate data group $G_a$ with the smallest $I_a$ can be selected as the target recommended data group. Alternatively, the candidate data groups can be sorted by $I_a$ in ascending order and the first few taken as recommended data groups for the user's reference, from which the user selects the target recommended data group. To steer users toward non-detailed data as far as possible, the value of the coefficient $L_1$ can generally be set large; if data of the $L_1$ layer is still requested, a sensitive-data security approval mechanism can be started.
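The recommendation index and the ascending ranking can be sketched as below. The coefficient values, the delay values and the helper names are hypothetical; the description only states that $L_1$ is generally set large. Because the index adds level coefficients and delays directly, this sketch assumes the coefficients have been chosen on a scale comparable to the delay unit.

```python
# Hypothetical level coefficients: L1 (PDW) is set largest so that
# detailed warehouse data is recommended last.
LEVEL_COEFF = {"PDW": 3.0, "MID": 2.0, "APP": 1.0}


def recommendation_index(group):
    """I_a = sum of level coefficients + sum of delays over all tables.

    `group` is a list of (layer, delay) pairs, one pair per table.
    """
    level_term = sum(LEVEL_COEFF[layer] for layer, _ in group)
    delay_term = sum(delay for _, delay in group)
    return level_term + delay_term


def rank_groups(groups):
    """Return group names sorted by ascending recommendation index."""
    return sorted(groups, key=lambda name: recommendation_index(groups[name]))


# G_a mirrors the worked example: two MID tables, one PDW table,
# two APP tables (delay values are invented).
candidates = {
    "G_a": [("MID", 2), ("MID", 2), ("PDW", 4), ("APP", 1), ("APP", 1)],
    "G_b": [("APP", 1)],
}

ranking = rank_groups(candidates)
target_group = ranking[0]  # the group with the smallest index
```

With these assumed values, `G_a` evaluates to 3.0·1 + 2.0·2 + 1.0·2 + 10 = 19.0, following the same structure as equation (4).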
Step 403: feedback and optimization of the data asset management model;
Considering that data will serve numerous business requirements rather than a single one, the data manager can manage data from the data asset perspective. Step 403 may include the following steps 5031 to 5033:
Step 5031: counting the matching degree of similar requirements;
The occurrence frequency of each requirement can be collected, and the matching degree between requirements and assets can be counted. Business requirements can be classified by the similarity of their requirement parameters, and the occurrence frequency of each class determined; the more frequently a class of business requirements occurs, the more urgent it is to build an intermediate process table that meets it. The matching degree between each business requirement and its target recommended data group can also be determined; the larger the gap in matching degree, the less the existing tables in the MID (data mart layer) or APP (data application layer) satisfy the business requirement.
Suppose a business requirement item is $D_i$ and the target recommended data group selected for it is $G_a$. Each table in $G_a$ can then be traversed, and the granularity and range attributes of the three key data elements in each table compared with the business requirement item; each consistent item counts as 1, and the final total $M_i$ is taken as the matching degree between $D_i$ and $G_a$. When a new business requirement item is the same as or similar to $D_i$, the two are regarded as the same class of business requirement item, and the matching degree between the new item and $G_a$ is also accumulated into $M_i$.
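A minimal sketch of the counting rule follows, assuming that a requirement item and each table describe the three key data elements as nested dictionaries with `granularity` and `range` entries. This data layout, the sample values and the class key used for accumulation are assumptions made for illustration.

```python
from collections import defaultdict

ELEMENTS = ("person", "place", "time")
ATTRIBUTES = ("granularity", "range")


def matching_degree(requirement, group):
    """Count 1 for every (element, attribute) of every table in the group
    that is consistent with the requirement item."""
    count = 0
    for table in group:
        for element in ELEMENTS:
            for attribute in ATTRIBUTES:
                wanted = requirement.get(element, {}).get(attribute)
                offered = table.get(element, {}).get(attribute)
                if wanted is not None and wanted == offered:
                    count += 1
    return count


# Hypothetical requirement item D_i.
d_i = {
    "person": {"granularity": "user", "range": "all users"},
    "place": {"granularity": "business circle", "range": "city A"},
    "time": {"granularity": "hour", "range": "last 7 days"},
}

# Hypothetical tables of the selected group G_a.
table_1 = {
    "person": {"granularity": "user", "range": "all users"},
    "place": {"granularity": "cell", "range": "city A"},
    "time": {"granularity": "hour", "range": "last 30 days"},
}
table_2 = {
    "place": {"granularity": "business circle", "range": "city A"},
    "time": {"granularity": "day", "range": "last 7 days"},
}
g_a = [table_1, table_2]

# M accumulates the matching degree per requirement class, so a repeated
# or similar requirement item adds to the same total.
M = defaultdict(int)
M["class_of_D_i"] += matching_degree(d_i, g_a)
M["class_of_D_i"] += matching_degree(d_i, g_a)  # a second, identical item
```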
Step 5032: according to the matching degree and the occurrence frequency of the service requirements, ordering the asset optimization items;
whereinWhen a certain type of service requirements are counted and the matching degree of the service requirements with the table in the existing MID (data mart layer) or APP layer (data application layer) is low and the frequency of the service requirements is high, the existing data table needs to be optimized in the MID or APP layer, or a table directly meeting the service requirements needs to be constructed in a supplementary mode. In the daily data asset construction process, the reference can be made to the basis M i The construction of the data tables of the MID layer and the APP layer is carried out from large to small, the construction content comprises the optimization of the existing tables in the MID layer and the APP layer, the updating of data metadata (such as table level metadata or field level metadata) and the observation of whether the overall demand matching degree condition is improved or not; or optimizing the data table model, and additionally introducing new table construction until the data assets reach the optimal state meeting all data requirements for calling.
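One possible way to rank asset optimization items by frequency and matching degree is sketched below. The thresholds standing in for the "specific condition", the sort order (most frequent first, then lowest matching degree) and all names and values are assumptions of this sketch; the description leaves the exact condition open.

```python
from typing import List, NamedTuple


class RequirementClass(NamedTuple):
    name: str
    frequency: int   # how often this class of requirement occurred
    matching: float  # matching degree against the existing MID/APP tables


def rank_optimization_items(
    classes: List[RequirementClass],
    min_frequency: int = 1,
    max_matching: float = float("inf"),
) -> List[str]:
    """Keep classes that occur often enough and match poorly enough, then
    order them: highest frequency first, lowest matching degree first."""
    selected = [
        c for c in classes
        if c.frequency >= min_frequency and c.matching <= max_matching
    ]
    selected.sort(key=lambda c: (-c.frequency, c.matching))
    return [c.name for c in selected]


classes = [
    RequirementClass("A", frequency=12, matching=2.0),
    RequirementClass("B", frequency=12, matching=5.0),
    RequirementClass("C", frequency=3, matching=1.0),
    RequirementClass("D", frequency=20, matching=9.0),
]

all_ranked = rank_optimization_items(classes)
urgent = rank_optimization_items(classes, min_frequency=5, max_matching=6.0)
```

Here `urgent` keeps only the classes that are both frequent and poorly matched, which corresponds to the case the paragraph above singles out for optimization.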
Step 5033: comprehensively considering the asset adjustment scheme and its impact.
Before the optimization construction of the data asset management model, the influence of the optimization adjustment scheme on the whole data asset management model needs to be evaluated, and the influence of each optimization adjustment on machine resources and related supporting facilities needs to be evaluated.
When a data table design is added or changed, the scope of impact can be deduced through table-level or record-level lineage analysis. Because data on any production line may affect online services, lineage analysis of the metadata is required before the change in order to evaluate the scope of the data change; changes that may affect online services should be made with caution or scheduled for a suitable time.
To evaluate the impact of adjusting the data asset design on machine resources and related supporting facilities, an intelligent operation and maintenance monitoring system is generally used to statistically analyze monitoring data such as the daily resource consumption of the data assets, so as to roughly estimate the computing and storage resources needed for a standard unit of data volume. On this basis, the resource expansion or contraction that an adjustment of data management may bring is estimated. Projects that require additional resource capacity need a reasonable budget application; projects that need little expansion but yield clearly increased benefit after the adjustment are more easily supported by enterprise finance.
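Table-level lineage analysis can be illustrated with a breadth-first traversal over a lineage graph that maps each table to the tables derived from it. The graph contents and table names below are hypothetical (only the position zipper table is named in this description), and the traversal is a generic technique rather than the specific analysis used by the embodiment.

```python
from collections import deque

# Hypothetical lineage: upstream table -> tables derived directly from it.
LINEAGE = {
    "position_zipper": ["person_tag_summary", "area_tag_summary"],
    "person_tag_summary": ["hourly_crowd_report"],
    "area_tag_summary": ["hourly_crowd_report"],
    "hourly_crowd_report": [],
}


def impacted_tables(changed, lineage):
    """Return every downstream table reachable from `changed`,
    in breadth-first order, without duplicates."""
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impact = impacted_tables("position_zipper", LINEAGE)
```

A change whose impact list contains tables that feed online services would then be handled with caution or scheduled for a suitable time, as described above.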
In the embodiment of the application, the asset optimization items are ranked according to the matching degree and occurrence frequency of the business requirements, so that the data assets can cover the widest range of requirements and offer the most convenient data positioning and use, the closest data element matching, the minimum data delay, the best data quality, and so on.
Through continuous requirement collection and feedback, the data asset management model gradually delivers its maximum value to the asset manager and to more data users.
The embodiment of the application provides a data asset management model formed by combining a data hierarchical design, a data model framework and a metadata specification, which accord with the characteristics of mobile position data.
The key points of the data hierarchical design include: the two basic data structures [person + time: location] and [location + time: person] serve as the main components of the data warehouse layer; tag data built around the three themes of person, place and time, using the "human-ground-time" three-element characteristic of mobile position data, serves as the data mart layer; and the data application layer is designed to store the intermediate calculation process tables retained for common call requirements.
The key points of the data model framework include: the six atomic data application scenes designed from the "human-ground-time" three elements of mobile position data serve as the business scenario model. The data table model is centered on the user trajectory sequence table and the base station (cell) slicing table; around them are the various lightly summarized tag tables keyed by person or place; and on the outermost ring are the summary tables of various granularities that most applications may call, to facilitate direct calls by applications.
The key points of the metadata specification include: the table-level metadata configuration items are the affiliated hierarchy, the business scenario, the data timeliness and the standard delay. The field-level metadata configuration items include the category (the three-element keywords of person, place and time; tag attributes; others), the range (area range, time span, crowd range), and the granularity (place granularity, time granularity, crowd granularity).
The embodiment of the application provides a requirement collection and evaluation model that follows a dynamic feedback mechanism. The data recommendation range can be narrowed quickly using the atomic scene of the business requirement and the timeliness category parameter; candidate data groups are generated by combining the tables that satisfy the conditions and pass data element matching; and the candidate data groups are finally ranked according to the data delay and the data hierarchy.
The embodiment of the application provides an asset management model optimization method for mobile position data. Each requirement parameter is matched against the granularity and range attributes of the three key data elements, a matching degree comparison table between the data use requirements (namely the business requirements) and the current state of asset management is established, and, after the adjustment schemes and their impact are considered comprehensively, the data asset management model is optimized first for the requirements with a low matching degree and a high occurrence frequency.
It should be noted that the embodiment of the present application is data management at a data asset level, and not function optimization at a database level. The embodiment of the application can solve the problem that a data asset manager maximizes the value of the data asset used in multi-scenario application, and does not focus on solving the problem of one or a class of application requirements.
The embodiment of the application can help data demanders to efficiently understand and use data aiming at the uniqueness of mobile position data. Because the mobile position data comes from the communication operation field, the field format of the mobile position data has certain professional uniqueness, and the frequency interval of the data generation is rich in practical significance. The design of metadata through a data asset management model helps a data demander to quickly understand data; the data demand collection and evaluation are provided, the candidate data are automatically recommended according to the demand, and a data demander can be helped to quickly find the data suitable for use.
According to the embodiment of the application, the value of retaining intermediate-layer data can be made visible and the data value conversion rate improved. The embodiment addresses the difficulty a data asset manager faces in designing hierarchical data management: it can help the manager decide which intermediate process data should exist as logical tables and which need to be persisted to disk, and it gives data demanders of the same type directly retrievable public intermediate-layer data, reducing repeated development cost and shortening data processing time.
According to the embodiment of the application, the data asset management strategy can be adjusted dynamically under the guidance of data use requirements. A virtuous-circle operation mechanism of "construction-service-feedback" helps the data asset manager adjust the data management strategy dynamically and systematically according to the data use requirements of upper-layer applications, so that the data value is continuously improved.
Based on the foregoing embodiments, an embodiment of the present application provides an apparatus for operating a data asset, where the apparatus includes modules that can be implemented by a processor in an electronic device; of course, the implementation can also be realized through a specific logic circuit; in implementation, the processor may be a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like.
Fig. 6 is a schematic structural diagram of an operation apparatus of a data asset according to an embodiment of the present application, and as shown in fig. 6, the apparatus 600 includes an obtaining module 601, a screening module 602, a first determining module 603, a second determining module 604, and an optimizing module 605, where:
an obtaining module 601, configured to obtain multiple service requirements;
a screening module 602, configured to screen, for each service requirement, a target data group corresponding to the service requirement from the data asset management model; the target data group comprises first table level metadata and first field level metadata in the data asset management model;
a first determining module 603, configured to determine a matching degree between each service requirement and the corresponding target data group;
a second determining module 604, configured to determine, as an asset optimization requirement, each business requirement among the plurality of business requirements whose occurrence frequency and corresponding matching degree meet a specific condition;
an optimization module 605 configured to optimize the data asset management model based on the asset optimization requirement.
In one embodiment, the data asset management model includes a data warehouse layer, a data mart layer, and a data application layer; the data warehouse layer, the data mart layer and the data application layer comprise a plurality of data tables; each data table comprises table level metadata and field level metadata.
In one embodiment, the business requirements include target table-level metadata and target field-level metadata; the screening module 602 includes: a first screening submodule, configured to screen a first candidate data table from the data asset management model according to the target table-level metadata of each business requirement; a second screening submodule, configured to screen a second candidate data table from the first candidate data table according to the target field-level metadata of each business requirement, and to screen candidate metadata from the second candidate data table; a combination submodule, configured to permute and combine the second candidate data tables to obtain a plurality of candidate data groups; and a third screening submodule, configured to determine a target data group from the plurality of candidate data groups according to the data delay and the data hierarchy of the second candidate data tables of each candidate data group.
In one embodiment, the third screening submodule includes: a first determining unit, configured to determine the data hierarchy corresponding to each second candidate data table in each candidate data group; a second determining unit, configured to determine a first recommendation index according to the number of second candidate data tables at each data hierarchy and the level coefficient of the corresponding data hierarchy; a third determining unit, configured to determine a second recommendation index according to the data delay of each second candidate data table in each candidate data group; and a screening unit, configured to determine a target data group from the plurality of candidate data groups according to each first recommendation index and the corresponding second recommendation index.
In one embodiment, the screening unit includes: a first determining subunit, configured to determine the candidate data group with the minimum sum of the first recommendation index and the second recommendation index as the target data group.
In one embodiment, the screening unit includes: the sorting subunit is configured to sort the multiple candidate data sets according to each of the first recommendation indexes and the corresponding second recommendation index; a first screening subunit, configured to determine, according to a sorting result, a plurality of first data groups from the plurality of candidate data groups; and the second screening subunit is used for responding to the received selection instruction and determining a target data group from the plurality of first data groups.
In one embodiment, the second determining module 604 includes: and the first determining submodule is used for determining the service requirement with the highest frequency and the lowest matching degree in the plurality of service requirements as the asset optimization requirement.
In one embodiment, the second determining module 604 includes: the second determining submodule is used for determining the priority of the corresponding service requirement according to the occurrence frequency and the corresponding matching degree of each service requirement in the plurality of service requirements; and the third determining submodule is used for determining the service requirement with the priority meeting the specific condition as the asset optimization requirement.
In one embodiment, the business requirement includes target table-level metadata and target field-level metadata; the optimization module 605 includes: an establishing submodule, configured to establish a corresponding first data table based on each asset optimization requirement in the at least one asset optimization requirement; and an adding submodule, configured to add the at least one first data table into the data asset management model.
In one embodiment, the optimization module 605 includes: an updating submodule, configured to update the table-level metadata and the field-level metadata of the corresponding target data table in the data asset management model based on the target table-level metadata and the target field-level metadata of each asset optimization requirement in the at least one asset optimization requirement.
In one embodiment, the apparatus further comprises: the first establishing module is used for respectively establishing data tables stored in the data warehouse layer, the data mart layer and the data application layer; the second establishing module is used for establishing table-level metadata and field-level metadata corresponding to each data table; a third establishing module for establishing the data asset management model including the data warehouse layer, the data mart layer, and the data application layer.
Fig. 7 is a schematic structural diagram of a device for building a data asset management model according to an embodiment of the present application, and as shown in fig. 7, the device 700 includes a first building module 701, a second building module 702, and a third building module 703, where:
a first establishing module 701, configured to generate data tables stored in a data warehouse layer, a data mart layer, and a data application layer according to the three elements of the terminal signaling data; the three elements comprise user identification information, user position information and time points;
a second establishing module 702, configured to establish, according to the three elements of the terminal signaling data, table-level metadata and field-level metadata corresponding to each data table, where the data warehouse layer is configured to store detailed data in metadata of the preprocessed terminal signaling data; the detailed data matches the user position information with the user identification information and the time point, or matches the user identification information with the user position information and the time point; the data mart layer is used for storing tag data in the metadata of the terminal signaling data processed by the data warehouse layer; the tag data is divided into user identification information, user position information and time points according to topics; the data application layer is used for storing a commonly used intermediate calculation process table in the metadata of the terminal signaling data processed by the data warehouse layer or the data mart layer;
a third establishing module 703, configured to establish the data asset management model including the data warehouse layer, the data mart layer, and the data application layer.
In one embodiment, the first establishing module 701 includes: the first establishing submodule is used for establishing a plurality of basic real-time data tables in the data warehouse layer according to the user identification information and the user position information of the terminal signaling data; establishing an offline fact data table according to the user identification information, the user position information and the time point of the terminal signaling data; establishing other data tables related to the basic real-time data table and the offline fact data table according to the basic real-time data table and the offline fact data table; the second establishing submodule is used for summarizing data in a plurality of basic real-time data tables, the offline fact data table and other data tables in the data warehouse layer in the data mart layer and establishing a summarizing tag data table comprising a plurality of types of tag data; and the third establishing submodule is used for processing data in a plurality of basic real-time data tables, the offline fact data table and other data tables in the data warehouse layer and/or a summary tag data table in the data mart layer in the data application layer and establishing a summary granularity data table containing a plurality of granularity data.
In one embodiment, the table-level metadata includes an affiliated level, a business scenario, a data timeliness, and a data delay; the field-level metadata includes a category, a range, and a granularity; the second establishing module 702 includes: a fourth establishing submodule, configured to establish the affiliated level of each data table according to the affiliated level of the user identification information, the affiliated level of the user location information, and the affiliated level of the time point of the terminal signaling data, the affiliated level being one of the data warehouse layer, the data mart layer, and the data application layer; a fifth establishing submodule, configured to select at least one business scenario from the pre-established business scenarios as the business scenario of each data table according to the retrieval relationship among the user identification information, the user location information, and the time point of the terminal signaling data; a sixth establishing submodule, configured to establish the data timeliness of each data table according to the category to which the time point of the terminal signaling data belongs, the data timeliness being offline table or real-time table; a seventh establishing submodule, configured to establish the data delay of each data table according to the maximum value of the time point of the terminal signaling data, the data delay being the average processing time from the original data to the corresponding data table; an eighth establishing submodule, configured to establish the person category, the place category, the time category, and the tag attribute of each data table according to the category to which the user identification information, the category to which the user location information, and the category to which the time point of the terminal signaling data belong; a ninth establishing submodule, configured to establish the person range, the place range, the time range, and the tag value of each data table according to the range to which the user identification information, the range to which the user location information, and the range to which the time point of the terminal signaling data belong; and a tenth establishing submodule, configured to establish the person granularity, the place granularity, and the time granularity of each data table according to the minimum unit of the user identification information, the minimum unit of the user location information, and the minimum unit of the time point of the terminal signaling data.
It should be noted that, in the embodiment of the present application, if the operation method of the data asset is implemented in the form of a software functional module and is sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the related art, may be embodied in the form of a software product that is stored in a storage medium and includes a plurality of instructions for enabling an electronic device (which may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, etc.) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
The above description of the apparatus embodiments, similar to the above description of the method embodiments, has similar beneficial effects as the method embodiments. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
Correspondingly, an embodiment of the present application provides an electronic device. Fig. 8 is a schematic diagram of a hardware entity of the electronic device according to the embodiment of the present application. As shown in Fig. 8, the hardware entity of the electronic device 800 includes a memory 801 and a processor 802, wherein the memory 801 stores a computer program capable of running on the processor 802, and the processor 802, when executing the program, implements the steps of the data asset operation method or the method for establishing the data asset management model of the above embodiments.
The Memory 801 is configured to store instructions and applications executable by the processor 802, and may also buffer data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or already processed by the processor 802 and modules in the electronic device 800, and may be implemented by a FLASH Memory (FLASH) or a Random Access Memory (RAM).
Accordingly, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps in the operation method of the data asset or the establishment method of the data asset management model provided in the above embodiments.
Here, it should be noted that the above description of the storage medium and device embodiments is similar to the above description of the method embodiments and has similar beneficial effects. For technical details not disclosed in the storage medium and device embodiments of the present application, reference is made to the description of the method embodiments of the present application for understanding.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another like element in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or a part contributing to the related art may be embodied in the form of a software product stored in a storage medium, and including a plurality of instructions for enabling a computer device (which may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, etc.) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to arrive at new method embodiments. The features disclosed in the several product embodiments presented in this application can be combined arbitrarily, without conflict, to arrive at new product embodiments. The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, without conflict, to arrive at new method embodiments or apparatus embodiments.
The above description is only a specific embodiment of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A method of operating a data asset, the method comprising:
acquiring a plurality of business requirements;
for each business requirement, screening out a target data group corresponding to the business requirement from a data asset management model; the target data group comprises first table level metadata and first field level metadata in the data asset management model;
determining the matching degree between each business requirement and the corresponding target data group;
determining, as an asset optimization requirement, a business requirement among the plurality of business requirements whose occurrence frequency and corresponding matching degree satisfy a specific condition;
optimizing the data asset management model based on the asset optimization requirement.
2. The method of claim 1, wherein the business requirements comprise target table level metadata and target field level metadata; the step of screening out a target data group corresponding to each business requirement from the data asset management model aiming at each business requirement comprises the following steps:
screening out a first candidate data table from the data asset management model according to the target table level metadata of each business requirement;
screening out a second candidate data table from the first candidate data table according to the target field level metadata of each business requirement, and screening out candidate metadata from the second candidate data table;
permuting and combining the second candidate data tables to obtain a plurality of candidate data groups;
and determining a target data group from the plurality of candidate data groups according to the data latency and the data level of the second candidate data tables of each candidate data group in the plurality of candidate data groups.
3. The method of claim 2, wherein the target table level metadata comprises a target level to which the business requirement belongs, a target business scenario, and a target data timeliness; the target field level metadata comprises a target category, a target range, and a target granularity;
the screening out a first candidate data table from the data asset management model according to the target table level metadata of each business requirement comprises: screening out a first candidate data table from the data asset management model according to the target level, the target business scenario, and the target data timeliness of each business requirement;
the screening out a second candidate data table from the first candidate data table according to the target field level metadata of each business requirement, and screening out candidate metadata from the second candidate data table, comprises: screening out a second candidate data table from the first candidate data table according to the target category, the target range, and the target granularity of each business requirement, and screening out candidate metadata from the second candidate data table; the candidate metadata is metadata in the second candidate data table that is related to the target category, the target range, and the target granularity.
4. The method of claim 2, wherein the determining a target data group from the plurality of candidate data groups according to the data latency and the data level of the second candidate data tables of each candidate data group in the plurality of candidate data groups comprises:
determining a data level corresponding to each second candidate data table in each candidate data group;
determining a first recommendation index according to the number of second candidate data tables in each data level and the level coefficient of the corresponding data level;
determining a second recommendation index according to the data latency of each second candidate data table in each candidate data group;
and determining a target data group from the plurality of candidate data groups according to each first recommendation index and the corresponding second recommendation index.
5. The method of claim 4, wherein the determining a target data group from the plurality of candidate data groups according to each first recommendation index and the corresponding second recommendation index comprises:
determining the candidate data group with the minimum sum of the first recommendation index and the second recommendation index as the target data group;
or,
sorting the plurality of candidate data groups according to each first recommendation index and the corresponding second recommendation index; determining a plurality of first data groups from the plurality of candidate data groups according to the sorting result; and, in response to a received selection instruction, determining the target data group from the plurality of first data groups.
6. The method according to any one of claims 1 to 5, wherein the determining, as an asset optimization requirement, a business requirement whose occurrence frequency and corresponding matching degree satisfy a specific condition comprises:
determining the business requirement with the highest occurrence frequency and the lowest matching degree among the plurality of business requirements as the asset optimization requirement;
or,
determining the priority of each business requirement according to the occurrence frequency and the corresponding matching degree of that business requirement among the plurality of business requirements; and determining the business requirement whose priority satisfies the specific condition as the asset optimization requirement.
7. The method of claim 6, wherein the business requirements comprise target table level metadata and target field level metadata;
optimizing the data asset management model based on the asset optimization requirements includes:
establishing a first data table based on the asset optimization requirements; adding the first data table to the data asset management model;
or,
updating first table level metadata and first field level metadata of a target data table in the data asset management model based on the target table level metadata and the target field level metadata of the asset optimization requirements.
8. The method according to any one of claims 1 to 5, further comprising:
respectively generating data tables stored in a data warehouse layer, a data mart layer and a data application layer according to the three elements of the terminal signaling data; the three elements comprise user identification information, user position information and time points;
establishing table-level metadata and field-level metadata corresponding to each data table according to the three elements of the terminal signaling data, wherein the data warehouse layer is used for storing detailed data in the metadata of the preprocessed terminal signaling data; the detailed data matches the user position information with the user identification information and the time point, or matches the user identification information with the user position information and the time point; the data mart layer is used for storing tag data in the metadata of the terminal signaling data processed by the data warehouse layer; the tag data is divided into user identification information, user position information and time points according to themes; the data application layer is used for storing a commonly used intermediate calculation process table in the metadata of the terminal signaling data processed by the data warehouse layer or the data mart layer;
establishing the data asset management model comprising the data warehouse layer, the data mart layer, and the data application layer.
9. The method according to claim 8, wherein the establishing data tables stored in the data warehouse layer, the data mart layer and the data application layer according to the three elements of the terminal signaling data respectively comprises:
in the data warehouse layer, establishing a plurality of basic real-time data tables according to the user identification information and the user position information of the terminal signaling data; establishing an offline fact data table according to the user identification information, the user position information and the time point of the terminal signaling data; establishing other data tables related to the basic real-time data table and the offline fact data table according to the basic real-time data table and the offline fact data table;
in the data mart layer, summarizing data in a plurality of basic real-time data tables, the offline fact data tables and other data tables in the data warehouse layer, and establishing a summarized label data table comprising a plurality of label data;
and in the data application layer, processing data in a plurality of basic real-time data tables, the offline fact data tables and other data tables in the data warehouse layer and/or summary label data tables in the data mart layer, and establishing a summary granularity data table containing a plurality of granularity data.
10. The method of claim 8, wherein the table-level metadata comprises an affiliated level, a business scenario, a data timeliness, and a data latency; the field-level metadata comprises a category, a range, and a granularity; the establishing of the table-level metadata and the field-level metadata corresponding to each data table according to the three elements of the terminal signaling data comprises:
establishing the affiliated level of each data table according to the affiliated level of the user identification information, the affiliated level of the user location information, and the affiliated level of the time point of the terminal signaling data; the affiliated level is one of the data warehouse layer, the data mart layer, and the data application layer;
selecting at least one business scenario from pre-established business scenarios as the business scenario of each data table according to the retrieval relationship among the user identification information, the user location information, and the time point of the terminal signaling data;
establishing the data timeliness of each data table according to the category to which the time point of the terminal signaling data belongs; the data timeliness indicates whether the table is an offline table or a real-time table;
establishing the data latency of each data table according to the maximum value of the time point of the terminal signaling data; the data latency is the average processing time from the raw data to the corresponding data table;
establishing the person category, the place category, the time category, and the tag attribute of each data table according to the category to which the user identification information, the user location information, and the time point of the terminal signaling data respectively belong;
establishing the person range, the place range, the time range, and the tag value of each data table according to the range to which the user identification information, the user location information, and the time point of the terminal signaling data respectively belong;
and establishing the person granularity, the place granularity, and the time granularity of each data table according to the minimum unit of the user identification information, the minimum unit of the user location information, and the minimum unit of the time point of the terminal signaling data.
11. An apparatus for operating a data asset, the apparatus comprising:
an acquisition module, configured to acquire a plurality of business requirements;
a screening module, configured to screen out, for each business requirement, a target data group corresponding to the business requirement from a data asset management model; the target data group comprises first table level metadata and first field level metadata in the data asset management model;
a first determining module, configured to determine the matching degree between each business requirement and the corresponding target data group;
a second determining module, configured to determine, as an asset optimization requirement, a business requirement among the plurality of business requirements whose occurrence frequency and corresponding matching degree satisfy a specific condition;
and an optimization module, configured to optimize the data asset management model based on the asset optimization requirement.
12. An electronic device comprising a memory and a processor, the memory storing a computer program operable on the processor, wherein the processor when executing the program performs the steps in the method of operating a data asset of any of claims 1 to 10.
13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of a method for operating a data asset according to any one of claims 1 to 10.
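For readers who want a concrete picture of the selection logic recited in claims 4 to 6, the following Python sketch may help. It is a minimal illustration under stated assumptions, not the applicant's implementation: the claims do not give formulas, so the level coefficients, the use of a coefficient-weighted table count as the first recommendation index, the use of summed latency as the second recommendation index, and the priority formula `frequency * (1 - matching degree)` are all hypothetical choices made here for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (table name, data level, data latency); latency unit is assumed.
Table = Tuple[str, str, float]

# Hypothetical level coefficients; the claims only state that each data level has one.
LEVEL_COEFFICIENT: Dict[str, float] = {
    "warehouse": 3.0,
    "mart": 2.0,
    "application": 1.0,
}


def first_recommendation_index(group: List[Table]) -> float:
    """Assumed form: sum over levels of (number of tables in level) * (level coefficient)."""
    counts = Counter(level for _, level, _ in group)
    return sum(n * LEVEL_COEFFICIENT[level] for level, n in counts.items())


def second_recommendation_index(group: List[Table]) -> float:
    """Assumed form: total data latency of the tables in the group."""
    return sum(latency for _, _, latency in group)


def pick_target_group(groups: List[List[Table]]) -> List[Table]:
    """First alternative of claim 5: the group with the minimum sum of both indexes."""
    return min(
        groups,
        key=lambda g: first_recommendation_index(g) + second_recommendation_index(g),
    )


def pick_optimization_requirement(stats: Dict[str, Tuple[int, float]]) -> str:
    """Second alternative of claim 6 with an assumed priority formula.

    stats maps a requirement id to (occurrence frequency, matching degree in [0, 1]).
    Frequent requirements that are poorly matched receive the highest priority.
    """
    return max(stats, key=lambda r: stats[r][0] * (1.0 - stats[r][1]))
```

With these assumptions, a candidate group made of a single low-latency mart table would be preferred over a group of two slower warehouse tables, and a requirement seen 8 times with a matching degree of 0.2 would outrank one seen 10 times with a matching degree of 0.9.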
CN202110823573.4A 2021-07-21 2021-07-21 Data asset operation method and device, electronic equipment and storage medium Pending CN115687438A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110823573.4A CN115687438A (en) 2021-07-21 2021-07-21 Data asset operation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110823573.4A CN115687438A (en) 2021-07-21 2021-07-21 Data asset operation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115687438A true CN115687438A (en) 2023-02-03

Family

ID=85044317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110823573.4A Pending CN115687438A (en) 2021-07-21 2021-07-21 Data asset operation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115687438A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118469703A (en) * 2024-07-11 2024-08-09 宁波银行股份有限公司 Service processing method and service processing platform

Similar Documents

Publication Publication Date Title
Saluveer et al. Methodological framework for producing national tourism statistics from mobile positioning data
Lansley et al. The geography of Twitter topics in London
US10304086B2 (en) Techniques for estimating demographic information
Zhong et al. Detecting the dynamics of urban structure through spatial network analysis
US20120174006A1 (en) System, method, apparatus and computer program for generating and modeling a scene
CN106651424A (en) Electric power user figure establishment and analysis method based on big data technology
Falcone et al. What is this place? Inferring place categories through user patterns identification in geo-tagged tweets
US8255392B2 (en) Real time data collection system and method
CN108495254B (en) Traffic cell population characteristic estimation method based on signaling data
CN106326923B (en) A kind of position data clustering method of registering taking position into account and repeating with density peaks point
CN112270579B (en) Intelligent advertising system based on big data
CN112738729A (en) Method and system for distinguishing visiting hometown visitor by mobile phone signaling data
Manley et al. New forms of data for understanding urban activity in developing countries
CN110059149A (en) Electronic map spatial key Querying Distributed directory system and method
CN115408618B (en) Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
CN115687438A (en) Data asset operation method and device, electronic equipment and storage medium
CN114219379B (en) Resource matching evaluation method and system suitable for community service circle
Li et al. Delineation of the Shanghai megacity region of China from a commuting perspective: Study based on cell phone network data in the Yangtze River Delta
Zhang et al. Detecting tourist attractions using geo-tagged photo clustering
CN111859055A (en) Intelligent data retrieval matching system based on big data
Celikten et al. Extracting patterns of urban activity from geotagged social data
CN116028467A (en) Intelligent service big data modeling method, system, storage medium and computer equipment
CN115269970A (en) Intelligent search method and system for government affair service mobile terminal
Liu et al. An unsupervised collaborative approach to identifying home and work locations
Wei et al. SP-Loc: A crowdsourcing fingerprint based shop-level indoor localization algorithm integrating shop popularity without the indoor map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination