CN103902653B - Method and apparatus for building a data warehouse table lineage graph
- Publication number: CN103902653B
- Application number: CN201410072773.0A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
Abstract
The invention discloses a method and apparatus for building a data warehouse table lineage graph, and belongs to the field of computers. The method includes: parsing each data warehouse operation statement that accesses a data warehouse to obtain the table name of the data warehouse destination table accessed by each statement; storing, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the destination table it accesses; obtaining, according to the mapping table, the table name of the data warehouse source table corresponding to each destination table in the mapping table; and building the data warehouse table lineage graph from the table name of each destination table and the table name of its corresponding source table. The apparatus includes a parsing module, a first storage module, a first acquisition module, and a building module. With the invention, a server can build the data warehouse table lineage graph automatically.
Description
Technical field
The present invention relates to the field of computers, and in particular to a method and apparatus for building a data warehouse table lineage graph.
Background
A data warehouse stores various kinds of business data, and different business data is stored in different business tables. A data warehouse therefore holds many business tables, and how to build a data warehouse table lineage graph from those tables is a problem in urgent need of a solution.
At present, data warehouse administrators parse data warehouse operation statements and build the data warehouse table lineage graph by hand. This manual process is error-prone, and because the volume of business data in a data warehouse is very large, it also imposes a heavy workload on the administrators.
Summary of the invention
To solve the problems in the prior art, the present invention provides a method and apparatus for building a data warehouse table lineage graph. The technical solutions are as follows:
In one aspect, a method for building a data warehouse table lineage graph is provided, the method including:
parsing each data warehouse operation statement that accesses a data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
storing, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse destination table it accesses;
obtaining, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table; and
building the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table.
Further, parsing each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement includes:
parsing each data warehouse operation statement that accesses the data warehouse to obtain the access mode of each statement;
obtaining the data warehouse operation statements whose access mode is write mode; and
parsing the data warehouse operation statements whose access mode is write mode to obtain the table names of all data warehouse destination tables accessed in write mode.
Further, after parsing each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement, the method also includes:
obtaining the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
obtaining, according to the import path, the data warehouse operation statement whose task type is the analysis type and which has that import path; and
binding the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has that import path.
Further, obtaining, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table includes:
for each record in the mapping table, obtaining the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
obtaining the data warehouse operation statement according to the obtained statement identifier; and
parsing the obtained data warehouse operation statement to obtain the table name of the data warehouse source table corresponding to each data warehouse destination table.
Further, building the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table includes:
building, in the data warehouse table lineage graph, a node for the table name of the data warehouse destination table and a node for the table name of the data warehouse source table corresponding to that destination table; and
making the node for the table name of the data warehouse destination table a child node of the node for the table name of its corresponding data warehouse source table.
Further, after building the node for the table name of the data warehouse destination table, the method also includes:
storing the data warehouse operation statement that accesses the data warehouse destination table in the node for the table name of that destination table; and
sending the data warehouse table lineage graph to a terminal, which displays it to a user.
In another aspect, the present invention provides an apparatus for building a data warehouse table lineage graph, the apparatus including:
a parsing module, configured to parse each data warehouse operation statement that accesses a data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
a first storage module, configured to store, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse destination table it accesses;
a first acquisition module, configured to obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table; and
a building module, configured to build the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table.
Further, the parsing module includes:
a first parsing unit, configured to parse each data warehouse operation statement that accesses the data warehouse to obtain the access mode of each statement;
an acquisition unit, configured to obtain the data warehouse operation statements whose access mode is write mode; and
a second parsing unit, configured to parse the data warehouse operation statements whose access mode is write mode to obtain the table names of all data warehouse destination tables accessed in write mode.
Further, the apparatus also includes:
a second acquisition module, configured to obtain the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
a third acquisition module, configured to obtain, according to the import path, the data warehouse operation statement whose task type is the analysis type and which has that import path; and
a binding module, configured to bind the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has that import path.
Further, the first acquisition module includes:
a first acquisition unit, configured to obtain, for each record in the mapping table, the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
a second acquisition unit, configured to obtain the data warehouse operation statement according to the obtained statement identifier; and
a third parsing unit, configured to parse the obtained data warehouse operation statement to obtain the table name of the data warehouse source table corresponding to each data warehouse destination table.
Further, the building module includes:
a building unit, configured to build, in the data warehouse table lineage graph, a node for the table name of the data warehouse destination table and a node for the table name of the data warehouse source table corresponding to that destination table; and
a designating unit, configured to make the node for the table name of the data warehouse destination table a child node of the node for the table name of its corresponding data warehouse source table.
Further, the apparatus also includes:
a second storage module, configured to store the data warehouse operation statement that accesses the data warehouse destination table in the node for the table name of that destination table; and
a sending module, configured to send the data warehouse table lineage graph to a terminal, which displays it to a user.
In the embodiments of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each statement and the table name of the data warehouse source table corresponding to each destination table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for building a data warehouse table lineage graph according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a method for building a data warehouse table lineage graph according to Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for building a data warehouse table lineage graph according to Embodiment 3 of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
This embodiment of the present invention provides a method for building a data warehouse table lineage graph. Referring to Fig. 1, the method includes:
Step 101: Parse each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
Step 102: Store, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse destination table it accesses;
Step 103: According to the mapping table, obtain the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table;
Step 104: Build the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table.
In this embodiment of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each statement and the table name of the data warehouse source table corresponding to each destination table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage.
Embodiment 2
This embodiment of the present invention provides a method for building a data warehouse table lineage graph. Referring to Fig. 2, the method includes:
Step 201: The server obtains, from the data warehouse, the data warehouse operation statements with which each business point accesses the data warehouse;
Specifically, according to a business type, the server obtains from the data warehouse the data warehouse operation statements with which each business point belonging to that business type accesses the data warehouse, and assigns a statement identifier to each data warehouse operation statement.
The server stores a correspondence between business types and the data warehouse operation statements that access the data warehouse. According to a business type, the statements of each business point corresponding to that business type can be obtained from this correspondence.
It should be noted that data in the data warehouse is always operated on through data warehouse operation statements that access the data warehouse. Such a statement may be, for example:
insert overwrite table hive_table_b
select *
from hive_table_a
……
Step 202: The server parses each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
Specifically, the server parses each data warehouse operation statement that accesses the data warehouse to obtain its access mode. The access mode is either read mode or write mode, and write mode covers both writing data to the data warehouse and writing data to a local file. The server obtains each data warehouse operation statement whose access mode is write mode, parses it using a first regular-expression matching rule, and finds the table name of the data warehouse destination table accessed by each such statement.
The first regular-expression matching rule is as follows:
(1) a character string that follows "insert into" and precedes "(", and that contains neither whitespace nor "(";
(2) a character string that follows "replace into" and precedes "(", and that contains neither whitespace nor "(";
(3) a character string that follows "insert overwrite table" and contains no whitespace;
(4) a character string that follows "overwrite into table" and contains no whitespace.
It should be noted that the above rules are alternatives: a match on any one of them is sufficient. In the embodiments of the present invention, matching is case-insensitive for all characters.
The code of the first regular expression is:
(?i)insert\\s+into\\s+([^\\s\\(]+)\\s*\\(|(?i)replace\\s+into\\s+([^\\s\\(]+)\\s*\\(|(?i)insert\\s+overwrite\\s+table\\s+(\\S+)\\s+|(?i)overwrite\\s+into\\s+table\\s+(\\S+)\\s+
It should be noted that the data warehouse stores various kinds of business data. The server loads this massive volume of business data into Hive-layer source tables of the data warehouse and then, according to the different businesses, performs different extraction, cleaning, and transformation on the Hive-layer source tables to obtain various Hive-layer business tables. Based on these Hive-layer business tables, and according to the needs of finer-grained businesses, the server splits them further, often many times, to obtain more Hive-layer business tables that make analysis and statistics for different businesses easier. The server may also, according to the different businesses, analyze and aggregate the business data in the Hive-layer business tables and import the results into MySQL-layer business tables for web page display. In some special cases, the business data in the MySQL-layer business tables may be further extracted and computed to make web page display more convenient.
With the regular-expression matching rules above, rules (1) and (2) find MySQL-layer table names, and rules (3) and (4) find Hive-layer table names.
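The four alternatives of the first regular expression can be tried out with a short script. The following is a minimal sketch, assuming Python's `re` module; the function name `find_destination_tables` is hypothetical and does not come from the patent. The published pattern is written with doubled backslashes, as in a Java-style string literal, so the sketch uses a raw string with single backslashes instead.

```python
import re
from typing import List

# Python form of the four alternatives (rules (1) to (4)).
# re.IGNORECASE replaces the inline (?i) flags of the published pattern.
DEST_TABLE_RE = re.compile(
    r"insert\s+into\s+([^\s(]+)\s*\("          # rule (1): MySQL-layer table
    r"|replace\s+into\s+([^\s(]+)\s*\("        # rule (2): MySQL-layer table
    r"|insert\s+overwrite\s+table\s+(\S+)\s+"  # rule (3): Hive-layer table
    r"|overwrite\s+into\s+table\s+(\S+)\s+",   # rule (4): Hive-layer table
    re.IGNORECASE,
)


def find_destination_tables(statement: str) -> List[str]:
    """Return the destination table names found in one write-mode statement."""
    tables: List[str] = []
    for match in DEST_TABLE_RE.finditer(statement):
        # Exactly one of the four capture groups is set for each match.
        name = next(g for g in match.groups() if g is not None)
        if name not in tables:
            tables.append(name)
    return tables


hive_stmt = "insert overwrite table hive_table_b\nselect *\nfrom hive_table_a"
mysql_stmt = "Insert into mysql_table_a(key1,col1,col2)\nvalues(:key1,:col1,:col2)"
print(find_destination_tables(hive_stmt))
print(find_destination_tables(mysql_stmt))
```

Note that rules (3) and (4), as published, require at least one whitespace character after the table name, so a statement that ends immediately after the table name would not match.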
Step 203: The server parses each data warehouse operation statement that accesses the data warehouse to obtain the task type of each statement;
The task types are analysis, import, and extraction. Hive-layer business splitting and Hive-layer business analysis belong to the analysis type; importing business data from Hive-layer business tables into MySQL-layer business tables belongs to the import type; extracting and computing on the business data of MySQL-layer business tables belongs to the extraction type.
Step 204: The server obtains the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
When the task type is the import type, the server may store a correspondence between data warehouse operation statements and import paths, so that the import-type data warehouse operation statement and its corresponding import path can be obtained according to the task type.
For example, a data warehouse operation statement whose task type is the import type is:
Insert into mysql_table_a(key1,col1,col2)
values(:key1,:col1,:col2)
ON DUPLICATE KEY UPDATE
col1=values(col1),
col2=values(col2);
The import path corresponding to the above data warehouse operation statement is "/path1/path2/".
Further, the server reads all text content under the import path directory "/path1/path2/" and executes the following operation statement:
Insert into mysql_table_a(key1,col1,col2)
values(:key1,:col1,:col2)
ON DUPLICATE KEY UPDATE
col1=values(col1),
col2=values(col2)
Step 205: According to the import path, the server obtains the data warehouse operation statement whose task type is the analysis type and which has that import path;
Specifically, the server obtains the data warehouse operation statements whose task type is the analysis type and, from among them, obtains the statements that have the same import path as the import-type data warehouse operation statement obtained in step 204.
The data warehouse destination table of an import-type data warehouse operation statement may correspond to at least two data warehouse operation statements. One is the current import-type statement; the others are analysis-type statements that have the same import path as the import-type statement.
For example, a data warehouse operation statement whose task type is the analysis type and which has the same import path as the import-type statement is:
insert overwrite local directory '/path1/path2/path3/'
select count(1)
from hive_table_a
…
Step 206: The server binds the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has that import path;
By binding the import-type statement to the analysis-type statement that has the import path from step 204, the server establishes an association between the MySQL-layer table name and the Hive-layer table name.
Step 207: The server stores, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse destination table it accesses;
Specifically, the server stores in the mapping table the accessed data warehouse destination table together with the statement identifier of the import-type data warehouse operation statement corresponding to that destination table and the statement identifier of the analysis-type data warehouse operation statement that has the same import path as the import-type statement.
Because this correspondence is stored in the mapping table, the server can use the statement identifier of a data warehouse operation statement to obtain the table name of the data warehouse destination table that the statement accesses.
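One possible in-memory shape for the mapping table of step 207 is shown below. This is a minimal sketch, not the patent's storage format: the function `build_mapping_table`, the dictionary layout (destination table name mapped to a list of statement identifiers), and the two input dictionaries are all assumptions made for illustration.

```python
from typing import Dict, List


def build_mapping_table(
    dest_tables_by_statement: Dict[str, List[str]],
    bindings: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return {destination table name: [statement IDs]}.

    dest_tables_by_statement: statement ID -> destination tables it writes.
    bindings: import statement ID -> bound analysis statement IDs (step 206).
    """
    mapping: Dict[str, List[str]] = {}
    for statement_id, tables in dest_tables_by_statement.items():
        for table in tables:
            ids = mapping.setdefault(table, [])
            # Record the writing statement and any analysis statements bound to it.
            for sid in [statement_id] + bindings.get(statement_id, []):
                if sid not in ids:
                    ids.append(sid)
    return mapping


mapping_table = build_mapping_table(
    {"s1": ["hive_table_b"], "s2": ["mysql_table_a"]},
    {"s2": ["s3"]},
)
print(mapping_table)
```

With this layout, a MySQL-layer destination table carries both its import statement and the bound analysis statement, which is what later allows Hive-layer source tables to be found for it.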
Step 208: According to the mapping table, the server obtains the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table;
Step 208 may specifically include the following steps (1) to (3):
(1) For each record in the mapping table, the server obtains the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
Specifically, the server goes through the records in the mapping table one by one and, for each record, obtains the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in it.
(2) The server obtains the data warehouse operation statement according to the obtained statement identifier;
The server stores a correspondence between statement identifiers and data warehouse operation statements, so the data warehouse operation statement corresponding to a statement identifier can be obtained from this correspondence.
(3) The server parses the obtained data warehouse operation statement to obtain the table name of the data warehouse source table corresponding to each data warehouse destination table.
Specifically, the server parses the obtained data warehouse operation statement using a second regular-expression matching rule and finds the table name of the data warehouse source table, corresponding to the data warehouse destination table, that the statement contains.
The code of the second regular expression is as follows:
"(?i)\\s+"+table+"(\\s+|$|;)"
In the embodiments of the present invention, the server determines whether the obtained data warehouse operation statement contains the table name of a data warehouse destination table. If it does, the table name found by parsing the statement is taken as the table name of the data warehouse source table corresponding to the data warehouse destination table.
When the task type of the data warehouse operation statement is the analysis type, the server only needs to match Hive-layer table names; when the task type is the import type or the extraction type, the server only needs to match MySQL-layer table names.
Further, the server may store the table name of each data warehouse destination table and the table name of its corresponding data warehouse source table in a second mapping table, and a complete lineage graph can be built from the content of the second mapping table.
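The second regular expression can be exercised as follows. This is a minimal sketch, assuming Python's `re` module; `find_source_tables` is a hypothetical helper, the candidate list stands in for the destination table names already recorded in the mapping table, and two details are additions not stated in the patent: `re.escape` is applied to the table name, and the statement's own destination table is skipped.

```python
import re
from typing import Iterable, List


def find_source_tables(
    statement: str, destination_table: str, candidate_tables: Iterable[str]
) -> List[str]:
    """Return candidate table names that the statement reads from."""
    sources: List[str] = []
    for table in candidate_tables:
        # Assumption: the table being written is not its own source.
        if table.lower() == destination_table.lower():
            continue
        # Python form of "(?i)\\s+" + table + "(\\s+|$|;)".
        pattern = re.compile(r"\s+" + re.escape(table) + r"(\s+|$|;)", re.IGNORECASE)
        if pattern.search(statement) and table not in sources:
            sources.append(table)
    return sources


stmt = "insert overwrite table hive_table_b\nselect *\nfrom hive_table_a"
known_tables = ["hive_table_a", "hive_table_b", "hive_table_c"]
print(find_source_tables(stmt, "hive_table_b", known_tables))
```

Because the pattern requires whitespace before the name and whitespace, a semicolon, or end of input after it, a name that merely appears as a prefix of a longer identifier (for example `hive_table_a2`) is not matched.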
Step 209: The server builds the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table;
Step 209 may specifically include the following steps (1) and (2):
(1) In the data warehouse table lineage graph, the server builds a node for the table name of the data warehouse destination table and a node for the table name of the data warehouse source table corresponding to that destination table;
(2) The server makes the node for the table name of the data warehouse destination table a child node of the node for the table name of its corresponding data warehouse source table.
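The parent and child relationship of step 209 can be sketched with a plain adjacency structure. This is a minimal sketch: `build_lineage_graph` and the dictionary representation (table name mapped to the list of its child table names) are assumptions for illustration, with the input playing the role of the second mapping table (destination table name mapped to its source table names).

```python
from typing import Dict, List


def build_lineage_graph(sources_by_destination: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return {table name: [child table names]}.

    Each destination table becomes a child of every one of its source tables.
    """
    children: Dict[str, List[str]] = {}
    for destination, sources in sources_by_destination.items():
        children.setdefault(destination, [])          # node for the destination table
        for source in sources:
            kids = children.setdefault(source, [])    # node for the source table
            if destination not in kids:
                kids.append(destination)              # destination is a child of source
    return children


graph = build_lineage_graph({
    "hive_table_b": ["hive_table_a"],
    "mysql_table_a": ["hive_table_b"],
})
print(graph)
```

Walking the structure from `hive_table_a` downward yields the chain hive_table_a, hive_table_b, mysql_table_a, which mirrors the Hive-layer to MySQL-layer flow described in the embodiment.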
Step 210: The server stores the data warehouse operation statement that accesses the data warehouse destination table in the node for the table name of that destination table;
Because the statement that accesses a data warehouse destination table is stored in that table's node in the data warehouse table lineage graph, the server can obtain the statement directly from the node.
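A node that carries its own operation statements, as in step 210, might look like the following. This is a minimal sketch: the `TableNode` class, its fields, and `attach_statements` are hypothetical names introduced for illustration, and the patent does not prescribe a node layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableNode:
    table_name: str
    statements: List[str] = field(default_factory=list)  # statements that write this table
    children: List[str] = field(default_factory=list)    # names of tables derived from it


def attach_statements(
    nodes: Dict[str, TableNode], statements_by_table: Dict[str, List[str]]
) -> Dict[str, TableNode]:
    """Store each destination table's operation statements in its node."""
    for table, statements in statements_by_table.items():
        node = nodes.setdefault(table, TableNode(table))
        node.statements.extend(statements)
    return nodes


nodes: Dict[str, TableNode] = {"hive_table_a": TableNode("hive_table_a", children=["hive_table_b"])}
attach_statements(nodes, {
    "hive_table_b": ["insert overwrite table hive_table_b select * from hive_table_a"],
})
print(nodes["hive_table_b"].statements)
```

Keeping the statement text on the node means that a viewer of the graph can see how a table was produced without a second lookup by statement identifier.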
Step 211: The server sends the data warehouse table lineage graph to the terminal;
Step 212: The terminal receives the data warehouse table lineage graph sent by the server and displays it to the user.
In this embodiment of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each statement and the table name of the data warehouse source table corresponding to each destination table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage.
Embodiment 3
This embodiment of the present invention provides an apparatus for building a data warehouse table lineage graph. Referring to Fig. 3, the apparatus includes:
a parsing module 301, configured to parse each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
a first storage module 302, configured to store, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse destination table it accesses;
a first acquisition module 303, configured to obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table;
a building module 304, configured to build the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table.
Further, parsing module 301, including:
First resolution unit, each data warehouse operations sentence of data warehouse is accessed for parsing, and obtains accessing data
The corresponding access mode of each data warehouse operations sentence in warehouse;
Acquiring unit, for obtaining the data warehouse operations sentence that access mode is WriteMode;
Second resolution unit, for parsing the data warehouse operations sentence that access mode is WriteMode, obtains access mode
For the table name of all data warehouse purpose tables of the access of WriteMode.
Further, the device also includes:
a second acquisition module, configured to obtain the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
a third acquisition module, configured to obtain, according to the import path, the data warehouse operation statement whose task type is the analysis type and which has the import path;
a binding module, configured to bind the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has the import path.
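As a hedged illustration of the binding step, the sketch below pairs each import-type statement with every analysis-type statement that shares its import path. The `Statement` record, its field names, and `bind_by_import_path` are assumptions made for the example; the embodiment does not specify how statements or bindings are represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    statement_id: str                  # hypothetical statement identifier
    task_type: str                     # "import" or "analysis"
    import_path: Optional[str] = None  # path the statement imports or reads


def bind_by_import_path(statements):
    """Return (import_id, analysis_id) pairs for statements sharing a path."""
    # Index analysis-type statements by the import path they carry.
    analysis_by_path = {}
    for stmt in statements:
        if stmt.task_type == "analysis" and stmt.import_path is not None:
            analysis_by_path.setdefault(stmt.import_path, []).append(stmt)

    # Bind every import-type statement to the analysis statements on its path.
    bindings = []
    for stmt in statements:
        if stmt.task_type == "import" and stmt.import_path is not None:
            for analysis in analysis_by_path.get(stmt.import_path, []):
                bindings.append((stmt.statement_id, analysis.statement_id))
    return bindings
```

Indexing by path first keeps the lookup for each import statement constant-time, which matters when the statement log is large.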
Further, the first acquisition module 303 includes:
a first acquisition unit, configured to obtain, for each record in the mapping table, the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
a second acquisition unit, configured to obtain the data warehouse operation statement according to the obtained statement identifier;
a third parsing unit, configured to parse the obtained data warehouse operation statement and obtain the table name of the data warehouse source table corresponding to each data warehouse destination table.
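The lookup performed by these units can be sketched as follows. The example assumes that the mapping table is a list of `(statement_id, destination_table)` records, that statements are retrievable from a dictionary keyed by identifier, and that source tables are the names following `FROM` or `JOIN`; all of these, and the function names, are illustrative assumptions rather than details given in the embodiment.

```python
import re

# Assumption: source tables appear directly after FROM or JOIN.
# Subqueries such as "FROM (SELECT ...)" are skipped by this pattern.
_SOURCE = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def source_tables(statement: str) -> list[str]:
    """Return the distinct table names the statement reads from, in order."""
    names = []
    for match in _SOURCE.finditer(statement):
        if match.group(1) not in names:
            names.append(match.group(1))
    return names


def resolve_sources(mapping_table, statements_by_id):
    """Map each destination table to the source tables that feed it."""
    sources_by_destination = {}
    for statement_id, destination in mapping_table:
        statement = statements_by_id[statement_id]  # fetch by identifier
        sources = sources_by_destination.setdefault(destination, [])
        for name in source_tables(statement):
            if name not in sources:
                sources.append(name)
    return sources_by_destination
```

Because the mapping table stores only identifiers, the full statement text is parsed for source tables on demand rather than being duplicated in every record.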
Further, the building module 304 includes:
a construction unit, configured to build, in the data warehouse table lineage graph, the node corresponding to the table name of the data warehouse destination table and the node corresponding to the table name of the data warehouse source table that corresponds to the data warehouse destination table;
a designating unit, configured to make the node corresponding to the table name of the data warehouse destination table a child node of the node corresponding to the table name of its data warehouse source table.
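A minimal sketch of this graph construction is given below, assuming the input is a dictionary from each destination table name to the list of its source table names. `TableNode` and `build_lineage_graph` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, which is safe even with cycles
class TableNode:
    name: str
    children: list = field(default_factory=list)  # tables derived from this one


def build_lineage_graph(sources_by_destination):
    """Return a dict of table name -> TableNode with parent/child links set."""
    nodes = {}

    def node_for(name):
        # Create the node on first use so each table has exactly one node.
        if name not in nodes:
            nodes[name] = TableNode(name)
        return nodes[name]

    for destination, sources in sources_by_destination.items():
        child = node_for(destination)
        for source in sources:
            parent = node_for(source)
            # The destination table's node becomes a child of each source node.
            if child not in parent.children:
                parent.children.append(child)
    return nodes
```

Since a destination table may have several source tables, a node can have more than one parent, so the structure is a directed graph rather than a strict tree.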
Further, the device also includes:
a second storage module, configured to store the data warehouse operation statement that accesses the data warehouse destination table in the node corresponding to the table name of that destination table;
a sending module, configured to send the data warehouse table lineage graph to a terminal, which displays it to the user.
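One possible way to attach statements to nodes and prepare the graph for sending is sketched below. It assumes JSON as the transfer format and a flat node dictionary as the payload; neither is stated in the embodiment, and `graph_payload` is a hypothetical name.

```python
import json


def graph_payload(sources_by_destination, statements_by_destination):
    """Serialize the lineage graph, with statements stored in each node."""
    nodes = {}
    for destination, sources in sources_by_destination.items():
        entry = nodes.setdefault(destination, {"children": [], "statements": []})
        # Store the statements that write to this destination table in its node.
        entry["statements"] = list(statements_by_destination.get(destination, []))
        for source in sources:
            parent = nodes.setdefault(source, {"children": [], "statements": []})
            if destination not in parent["children"]:
                parent["children"].append(destination)
    # sort_keys makes the output deterministic for the receiving terminal.
    return json.dumps(nodes, sort_keys=True)
```

Keeping the statement text inside the node lets the terminal show how a table was produced without a second request to the server.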
In the embodiments of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse, obtains the table name of the data warehouse destination table accessed by each statement and the table name of the data warehouse source table corresponding to each destination table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage graph.
It should be noted that, when the device provided by the above embodiment builds the data warehouse table lineage graph, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device provided by the above embodiment and the method embodiments for building the data warehouse table lineage graph belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
1. A method for building a data warehouse table lineage graph (genetic connection figure), characterized in that the method comprises:
parsing each data warehouse operation statement that accesses a data warehouse, and obtaining the table name of the data warehouse destination table accessed by each data warehouse operation statement;
obtaining a data warehouse operation statement whose task type is the import type, and its corresponding import path;
obtaining, according to the import path, a data warehouse operation statement whose task type is the analysis type and which has the import path;
binding the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has the import path;
storing, in a mapping table, the correspondence between the statement identifier of the data warehouse operation statement whose task type is the import type, the statement identifier of the data warehouse operation statement whose task type is the analysis type and which has the import path, and the table name of the accessed data warehouse destination table;
for each record in the mapping table, obtaining the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
obtaining the data warehouse operation statement according to the obtained statement identifier;
parsing the obtained data warehouse operation statement to obtain the table name of the data warehouse source table corresponding to each data warehouse destination table;
building the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table.
2. The method according to claim 1, characterized in that the parsing of each data warehouse operation statement that accesses the data warehouse, and the obtaining of the table name of the data warehouse destination table accessed by each data warehouse operation statement, comprises:
parsing each data warehouse operation statement that accesses the data warehouse, and obtaining the access mode corresponding to each such statement;
obtaining the data warehouse operation statements whose access mode is write mode;
parsing the data warehouse operation statements whose access mode is write mode, and obtaining the table names of all data warehouse destination tables accessed by those statements.
3. The method according to claim 1, characterized in that the building of the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table comprises:
building, in the data warehouse table lineage graph, the node corresponding to the table name of the data warehouse destination table and the node corresponding to the table name of the data warehouse source table that corresponds to the data warehouse destination table;
making the node corresponding to the table name of the data warehouse destination table a child node of the node corresponding to the table name of the data warehouse source table that corresponds to the data warehouse destination table.
4. The method according to claim 3, characterized in that, after the building of the node corresponding to the table name of the data warehouse destination table, the method further comprises:
storing the data warehouse operation statement that accesses the data warehouse destination table in the node corresponding to the table name of the data warehouse destination table;
sending the data warehouse table lineage graph to a terminal, which displays it to the user.
5. A device for building a data warehouse table lineage graph (genetic connection figure), characterized in that the device comprises:
a parsing module, configured to parse each data warehouse operation statement that accesses a data warehouse and obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
a second acquisition module, configured to obtain a data warehouse operation statement whose task type is the import type, and its corresponding import path;
a third acquisition module, configured to obtain, according to the import path, a data warehouse operation statement whose task type is the analysis type and which has the import path;
a binding module, configured to bind the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has the import path;
a first storage module, configured to store, in a mapping table, the correspondence between the statement identifier of the data warehouse operation statement whose task type is the import type, the statement identifier of the data warehouse operation statement whose task type is the analysis type and which has the import path, and the table name of the accessed data warehouse destination table;
a first acquisition module, configured to obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table;
a building module, configured to build the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table;
wherein the first acquisition module comprises:
a first acquisition unit, configured to obtain, for each record in the mapping table, the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
a second acquisition unit, configured to obtain the data warehouse operation statement according to the obtained statement identifier;
a third parsing unit, configured to parse the obtained data warehouse operation statement and obtain the table name of the data warehouse source table corresponding to each data warehouse destination table.
6. The device according to claim 5, characterized in that the parsing module comprises:
a first parsing unit, configured to parse each data warehouse operation statement that accesses the data warehouse and obtain the access mode corresponding to each such statement;
an acquisition unit, configured to obtain the data warehouse operation statements whose access mode is write mode;
a second parsing unit, configured to parse the data warehouse operation statements whose access mode is write mode and obtain the table names of all data warehouse destination tables accessed by those statements.
7. The device according to claim 5, characterized in that the building module comprises:
a construction unit, configured to build, in the data warehouse table lineage graph, the node corresponding to the table name of the data warehouse destination table and the node corresponding to the table name of the data warehouse source table that corresponds to the data warehouse destination table;
a designating unit, configured to make the node corresponding to the table name of the data warehouse destination table a child node of the node corresponding to the table name of the data warehouse source table that corresponds to the data warehouse destination table.
8. The device according to claim 7, characterized in that the device further comprises:
a second storage module, configured to store the data warehouse operation statement that accesses the data warehouse destination table in the node corresponding to the table name of the data warehouse destination table;
a sending module, configured to send the data warehouse table lineage graph to a terminal, which displays it to the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410072773.0A CN103902653B (en) | 2014-02-28 | 2014-02-28 | A kind of method and apparatus for building data warehouse table genetic connection figure |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103902653A CN103902653A (en) | 2014-07-02 |
CN103902653B true CN103902653B (en) | 2017-08-01 |
Family
ID=50993976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410072773.0A Active CN103902653B (en) | 2014-02-28 | 2014-02-28 | A kind of method and apparatus for building data warehouse table genetic connection figure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103902653B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101609473A (en) * | 2009-07-30 | 2009-12-23 | 金蝶软件(中国)有限公司 | A kind of method of Structured Query Language (SQL) of reconstruct report query and device |
CN101859303A (en) * | 2009-04-07 | 2010-10-13 | 中国移动通信集团湖北有限公司 | Metadata management method and management system |
CN102239458A (en) * | 2008-12-02 | 2011-11-09 | 起元技术有限责任公司 | Visualizing relationships between data elements |
US8468120B2 (en) * | 2010-08-24 | 2013-06-18 | International Business Machines Corporation | Systems and methods for tracking and reporting provenance of data used in a massively distributed analytics cloud |
CN103186541A (en) * | 2011-12-27 | 2013-07-03 | 阿里巴巴集团控股有限公司 | Generation method and device for mapping relationship |
Non-Patent Citations (2)
Title |
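---|
"Management and Application of Data Warehouse Metadata" (数据仓库元数据的管理与运用); 杨玢玢; China Master's Theses Full-text Database, Information Science and Technology Series; 2011-12-15; full text * |
"Research on Data Path Tracing Technology for Verifying Suspicious Points" (面向疑点核实的数据路径追踪技术研究); 衡铁刚; China Master's Theses Full-text Database, Information Science and Technology Series; 2012-05-15; full text * |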
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP02 | Change in the address of a patent holder | ||
Address after: 519000 High-tech Zone, Zhuhai City, Guangdong Province, Unit 1, Fourth Floor C, Building A, Headquarters Base No. 1, Qianwan Third Road, Tangjiawan Town
Patentee after: ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd.
Address before: 519080 Zone B, 1st Floor, Convention Center, No. 1, Software Park Road, Tangjiawan Town, Zhuhai, Guangdong
Patentee before: ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd.