CN113934786A - Implementation method for constructing unified ETL - Google Patents
- Publication number
- CN113934786A (application CN202111147981.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- etl
- page
- processing
- script
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3684—Test management for test design, e.g. generating new test cases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/20—Software design
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an implementation method for constructing a unified ETL and relates to the field of building decision analysis systems. The method comprises: analyzing the internal logical structure of the data from a data source; sorting out the business rules according to the business objective and the processing rules of the data processing, and analyzing and summarizing data processing algorithms according to the business rules and the data source structure; designing a data mapping, which references the basic metadata of the source table and target table involved in initializing the basic mapping and selects the corresponding data processing algorithm; developing data rules, in which the business rules are converted into executable code expressions according to the data mapping content and the processing rules; generating the ETL script to be executed with a script generator according to the data mapping content and the data processing algorithm; and testing the script and adjusting the processing rules to complete the ETL construction.
Description
Technical Field
The invention relates to the field of building decision analysis systems, and in particular to an implementation method for constructing a unified ETL.
Background
With the rapid development of the big data era, the quality requirements for building decision analysis systems keep rising, and an efficient, high-quality ETL system bears directly on whether the construction of a decision analysis system succeeds. During implementation, data requirement indicators are flexible and change easily, and business logic rules are complicated, fragmented and highly specific, while data developers lack unified technical specifications and staff turnover is frequent. As a result, ETL implementation lacks quality control, so the decision analysis system is inefficient and its accuracy cannot be guaranteed.
Disclosure of Invention
To address the problems in the prior art, the invention provides an implementation method for constructing a unified ETL. It allows ETL data processing to be designed and implemented in a unified way when an OLAP system is built and deployed, saving development labor costs, improving development quality, unifying development specifications, and making data processing more standardized, efficient and consistent.
The specific scheme provided by the invention is as follows:
An implementation method for constructing a unified ETL, comprising:
analyzing the internal logical structure of the data from a data source,
sorting out the business rules according to the business objective and the processing rules of the data processing, and analyzing and summarizing data processing algorithms according to the business rules and the data source structure,
designing a data mapping, wherein the basic metadata of the data processing source table and target table involved in initializing the basic mapping is referenced and a corresponding data processing algorithm is selected,
developing data rules, wherein the business rules are converted into executable code expressions according to the data mapping content and the processing rules,
generating an ETL script to be executed using a script generator according to the data mapping content and the data processing algorithm,
and testing the script and adjusting the processing rules to complete the ETL construction.
Further, in the implementation method for constructing a unified ETL, before the internal logical structure of the data from the data source is analyzed, the method comprises:
crawling the data source, wherein the crawl collects only the list page data of a web page, or the list page and page-turning data, or the list page, page-turning and detail page data.
Further, in the implementation method for constructing a unified ETL, analyzing the internal logical structure of the data from the data source comprises:
judging whether the logical structures of the list page data and the page-turning data are consistent,
judging whether duplicates exist after the data are merged,
and sorting out and analyzing the association between list pages and detail pages, or between page-turning pages and detail pages.
Further, in the implementation method for constructing a unified ETL, sorting out the business rules comprises:
determining the unique business key of the data according to the internal logical structure of the source data and the business uniqueness requirements,
completing the integration mechanism for a unified view of the data,
and ensuring the logical association among all business data.
Further, in the implementation method for constructing a unified ETL, the processing rules are adjusted according to the result of testing the script, an updated ETL script is generated according to the adjusted processing rules, and the ETL is iteratively optimized.
An implementation system for constructing a unified ETL comprises an analysis and arrangement module and a generation module, wherein
the analysis and arrangement module analyzes the internal logical structure of the data from a data source,
sorts out the business rules according to the business objective and the processing rules of the data processing, and analyzes and summarizes data processing algorithms according to the business rules and the data source structure,
designs a data mapping, wherein the basic metadata of the data processing source table and target table involved in initializing the basic mapping is referenced and a corresponding data processing algorithm is selected,
and develops data rules, wherein the business rules are converted into executable code expressions according to the data mapping content and the processing rules,
the generation module generates an ETL script to be executed using a script generator according to the data mapping content and the data processing algorithm,
and tests the script and adjusts the processing rules to complete the ETL construction.
Further, the implementation system for constructing a unified ETL further comprises a crawling module, which crawls the data source before the internal logical structure of the data from the data source is analyzed, wherein the crawl collects only the list page data of a web page, or the list page and page-turning data, or the list page, page-turning and detail page data.
Further, in the implementation system for constructing a unified ETL, the analysis and arrangement module analyzing the internal logical structure of the data from the data source comprises:
judging whether the logical structures of the list page data and the page-turning data are consistent,
judging whether duplicates exist after the data are merged,
and sorting out and analyzing the association between list pages and detail pages, or between page-turning pages and detail pages.
Further, in the implementation system for constructing a unified ETL, the analysis and arrangement module sorting out the business rules comprises:
determining the unique business key of the data according to the internal logical structure of the source data and the business uniqueness requirements,
completing the integration mechanism for a unified view of the data,
and ensuring the logical association among all business data.
Further, in the implementation system for constructing a unified ETL, the generation module adjusts the processing rules according to the result of testing the script, generates an updated ETL script according to the adjusted processing rules, and iteratively optimizes the ETL.
The invention has the advantages that:
The implementation method for constructing a unified ETL provided by the invention ensures, throughout implementation, that business logic rules are sorted out uniformly, data mappings are designed uniformly, processing rules are developed uniformly, and ETL scripts are generated uniformly. By unifying the steps from business requirements through implementation and rollout, quality inspection and development specifications, the decision analysis system is built to a high standard.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The present invention is further described below in conjunction with the following figures and specific examples so that those skilled in the art may better understand the present invention and practice it, but the examples are not intended to limit the present invention.
The invention provides an implementation method for constructing a unified ETL, comprising:
analyzing the internal logical structure of the data from a data source,
sorting out the business rules according to the business objective and the processing rules of the data processing, and analyzing and summarizing data processing algorithms according to the business rules and the data source structure,
designing a data mapping, wherein the basic metadata of the data processing source table and target table involved in initializing the basic mapping is referenced and a corresponding data processing algorithm is selected,
developing data rules, wherein the business rules are converted into executable code expressions according to the data mapping content and the processing rules,
generating an ETL script to be executed using a script generator according to the data mapping content and the data processing algorithm,
and testing the script and adjusting the processing rules to complete the ETL construction.
The method allows ETL data processing to be designed and implemented in a unified way when an OLAP system is built and deployed, which saves development labor costs, improves development quality, unifies development specifications, and makes data processing more standardized, efficient and consistent.
In a specific application, when implementing the unified ETL in some embodiments of the present invention, the specific process is as follows:
Analyze the internal logical structure of the data from the data source. Before this analysis, the source data is crawled. Taking web-crawler data from the Tianyancha ("Sky Eye") enterprise data lookup website as an example, there are three main cases: first, only the list page data of the web page is collected; second, the list page and page-turning data are collected; third, the list page, page-turning and detail page data are collected.
Analyzing the internal logical structure of the data from the data source comprises:
judging whether the logical structures of the list page data and the page-turning data are consistent; if they are, the two parts of the data can be combined with a set UNION operation,
judging whether duplicates exist after the data are merged, and deleting duplicated data as appropriate,
sorting out and analyzing the association between list pages and detail pages, or between page-turning pages and detail pages, so that the relationships among the data are made clear.
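The three checks in this analysis step can be illustrated with a minimal sketch. It assumes crawled rows are held as plain dictionaries; the function names (`same_structure`, `union_rows`, `attach_details`) and field names are hypothetical and do not come from the patent.

```python
from typing import Dict, List

Row = Dict[str, str]


def same_structure(a: List[Row], b: List[Row]) -> bool:
    """True when every row in both batches exposes the same set of field names."""
    field_sets = {frozenset(row) for row in a + b}
    return len(field_sets) <= 1


def union_rows(a: List[Row], b: List[Row]) -> List[Row]:
    """Set-style UNION: concatenate, then drop exact duplicate rows (first one wins)."""
    seen = set()
    merged = []
    for row in a + b:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            merged.append(row)
    return merged


def attach_details(rows: List[Row], details: List[Row], key: str) -> List[Row]:
    """Associate detail-page fields with list rows through a shared key (left join)."""
    by_key = {detail[key]: detail for detail in details}
    return [{**row, **by_key.get(row[key], {})} for row in rows]
```

Rows without a matching detail record are kept unchanged, so the association step never drops list or page-turning data.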
Sort out the business rules. Taking Tianyancha ("Sky Eye") enterprise data as an example:
business uniqueness processing is applied to the data, that is, the unique business key of the data is determined according to the internal logical structure of the source data and the business uniqueness requirements, and the data is deduplicated on that key,
the integration mechanism for a unified view of the data is completed, that is, each business module integrates its data into a unified view; for example, the list page data is integrated with the detail page data,
the logical association among all business data is ensured; for example, the business module data of the Tianyancha enterprise data must have primary and foreign keys that follow a unified coding rule.
Analyze and summarize the processing algorithms according to the business rules and the data source structure. Taking Tianyancha ("Sky Eye") enterprise data as an example, the data processing algorithms can be summarized as follows:
P0, list page only: deduplicate by the business logic primary key -> standardize the company name -> supplement the unified social credit code / KEY_ID information -> insert the current-date data into the target table;
P1, list page and page-turning data, where the page-turning data contains the company name: merge the page-turning data with the list page data -> deduplicate by the business logic primary key -> standardize the company name -> supplement the unified social credit code / KEY_ID information -> insert the current-date data into the target table;
P3, list page, page-turning and detail page data: associate the list page and page-turning data with the detail page data -> merge -> deduplicate by the business logic primary key -> standardize the company name -> supplement the unified social credit code / KEY_ID information -> insert the current-date data into the target table.
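One way to picture the three summarized algorithms is as step lists that share a common tail. The sketch below is illustrative only: the batch layout (`"list"`, `"paged"`, `"detail"`), the `company` business key and all function names are assumptions, the credit-code supplement step is omitted for brevity, and merging before associating in P3 is a simplification of the order given in the text.

```python
from datetime import date
from typing import Callable, Dict, List

Row = Dict[str, str]
Batch = Dict[str, List[Row]]  # hypothetical layout: "list", "paged", "detail"
Step = Callable[[Batch], Batch]


def merge_paged(batch: Batch) -> Batch:
    """Append page-turning rows to the list-page rows."""
    return {**batch, "list": batch["list"] + batch.get("paged", [])}


def associate_detail(batch: Batch) -> Batch:
    """Attach detail-page fields to rows that share the same company name."""
    by_name = {d["company"]: d for d in batch.get("detail", [])}
    rows = [{**r, **by_name.get(r["company"], {})} for r in batch["list"]]
    return {**batch, "list": rows}


def dedupe(batch: Batch) -> Batch:
    """Keep the first row seen for each business key (here: company)."""
    seen, rows = set(), []
    for r in batch["list"]:
        if r["company"] not in seen:
            seen.add(r["company"])
            rows.append(r)
    return {**batch, "list": rows}


def normalize_name(batch: Batch) -> Batch:
    """Collapse runs of whitespace in the company name."""
    rows = [{**r, "company": " ".join(r["company"].split())} for r in batch["list"]]
    return {**batch, "list": rows}


def stamp_load_date(batch: Batch) -> Batch:
    """Tag each row with the current date before it is loaded."""
    today = date.today().isoformat()
    return {**batch, "list": [{**r, "load_date": today} for r in batch["list"]]}


COMMON_TAIL: List[Step] = [dedupe, normalize_name, stamp_load_date]
ALGORITHMS: Dict[str, List[Step]] = {
    "P0": COMMON_TAIL,
    "P1": [merge_paged, *COMMON_TAIL],
    "P3": [merge_paged, associate_detail, *COMMON_TAIL],
}


def run(algorithm: str, batch: Batch) -> List[Row]:
    """Apply the steps of the selected algorithm in order and return the rows to load."""
    for step in ALGORITHMS[algorithm]:
        batch = step(batch)
    return batch["list"]
```

Keeping the shared steps in one list means a change to, say, name standardization applies to all three algorithms at once, which is in the spirit of the unified processing the method aims for.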
Design the data mapping: reference the basic metadata of the data processing source table and target table involved in initializing the basic mapping, and select the corresponding data processing algorithm.
Taking Tianyancha ("Sky Eye") enterprise data as an example, three data processing algorithms were summarized, so the data mapping design must also store which data processing algorithm is selected,
as well as the field mappings of the data processing; most fields map one-to-one and need no special processing.
Customized information is added according to the business rules; for the Tianyancha enterprise data, for example, because the data is deduplicated on the unique business key, the unique key is marked in the mapping.
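A data mapping of this kind could be represented as a small record that holds the source and target tables, the selected algorithm, the field mappings and the unique-key marks. The following is a sketch under assumptions: the class and attribute names are hypothetical, and the patent does not prescribe any particular storage format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class FieldMapping:
    source: str                  # column in the source table
    target: str                  # column in the target table
    is_unique_key: bool = False  # marks the business key used for deduplication
    expression: str = ""         # empty means "copy the value as-is"


@dataclass
class DataMapping:
    source_table: str
    target_table: str
    algorithm: str               # e.g. "P0", "P1" or "P3"
    fields: List[FieldMapping] = field(default_factory=list)

    def unique_key(self) -> List[str]:
        """Target columns marked as the unique business key."""
        return [f.target for f in self.fields if f.is_unique_key]

    def validate(self) -> None:
        """Reject mappings with no unique key or with duplicated target columns."""
        if not self.unique_key():
            raise ValueError("mapping must mark at least one unique key field")
        targets = [f.target for f in self.fields]
        if len(targets) != len(set(targets)):
            raise ValueError("each target column may be mapped only once")
```

Validating the mapping before any script is generated catches a missing key mark early, when it is still cheap to fix.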
Develop the data rules: the business rules are converted into executable code expressions according to the data mapping content and the processing rules. For example, if the logic rule requires several fields to be encoded together to generate a new ID, the data rule is developed as a specific SQL statement that serves as input to the script generator, which finally generates an executable ETL script.
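As an illustration of turning such a rule into an executable expression, the sketch below renders a SQL fragment that derives one ID from several columns. It assumes a dialect that provides `MD5`, `CONCAT_WS` and `COALESCE` (MySQL and PostgreSQL both do); the choice of a hash, the `|` separator and the helper name `id_expression` are assumptions, not details from the patent.

```python
import re
from typing import Sequence

# Only plain identifiers are accepted, so the rendered SQL cannot carry injected text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def id_expression(columns: Sequence[str], alias: str = "key_id") -> str:
    """Render a SQL expression that combines several columns into one ID."""
    if not columns:
        raise ValueError("at least one column is required")
    for name in (*columns, alias):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")
    # COALESCE keeps every column position present even when a value is NULL.
    parts = ", ".join(f"COALESCE({column}, '')" for column in columns)
    return f"MD5(CONCAT_WS('|', {parts})) AS {alias}"
```

The rendered string is meant to be stored with the data rule and handed to the script generator, rather than executed on its own.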
Generate the ETL script to be executed with the script generator, taking the data mapping content and the selected algorithm as input. The data mapping information, the data processing algorithm flow, the data deduplication rules and the like are stored as the content of a knowledge base; the script generator takes the knowledge base as input and outputs an executable ETL script. The system can be optimized iteratively: by continually summarizing algorithms, designing data mappings, and developing and refining processing rules, program scripts that fit the actual business can be generated more conveniently.
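A script generator of this kind can be sketched as a function from a knowledge-base entry to SQL text. The entry layout (a plain dictionary), the function name `generate_etl_script`, and the use of `ROW_NUMBER()` for deduplication are all assumptions made for illustration; the generated statement needs a database with window-function support.

```python
from typing import Dict, List, Tuple


def generate_etl_script(entry: Dict) -> str:
    """Render an INSERT ... SELECT that deduplicates on the unique key while loading.

    `entry` is a hypothetical knowledge-base record with the keys
    source_table, target_table, columns (source/target pairs) and
    unique_key (source column names).
    """
    columns: List[Tuple[str, str]] = entry["columns"]
    select_list = ", ".join(f"{src} AS {dst}" for src, dst in columns)
    target_cols = ", ".join(dst for _, dst in columns)
    key_cols = ", ".join(entry["unique_key"])
    # Rows sharing a key are numbered; only the first is loaded.
    # Which duplicate survives is arbitrary in this sketch.
    return (
        f"INSERT INTO {entry['target_table']} ({target_cols}, load_date)\n"
        f"SELECT {target_cols}, CURRENT_DATE\n"
        f"FROM (\n"
        f"  SELECT {select_list},\n"
        f"         ROW_NUMBER() OVER (PARTITION BY {key_cols} ORDER BY {key_cols}) AS rn\n"
        f"  FROM {entry['source_table']}\n"
        f") AS deduped\n"
        f"WHERE rn = 1;"
    )
```

Because the script is derived entirely from the stored entry, regenerating it after a rule or mapping change is a matter of editing the knowledge base and calling the generator again.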
Test the script and adjust the processing rules: the processing rules are revised continually during script testing until a standardized, accurate and unified program is obtained, completing the ETL construction.
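The test-and-adjust cycle can be pictured with a small sketch in which quality checks report problems and candidate rule adjustments are applied until the checks pass. In practice developers revise the rules by hand; the automated loop here is only a stand-in, and the names `find_issues` and `refine` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, str]
Rule = Callable[[List[Row]], List[Row]]


def find_issues(rows: List[Row], unique_key: Sequence[str]) -> List[str]:
    """Report rows whose key is empty or repeats an earlier row's key."""
    issues = []
    seen = set()
    for index, row in enumerate(rows):
        key = tuple(row.get(name) for name in unique_key)
        if any(value in (None, "") for value in key):
            issues.append(f"row {index}: empty key field")
        elif key in seen:
            issues.append(f"row {index}: duplicate key {key}")
        else:
            seen.add(key)
    return issues


def refine(
    rows: List[Row], unique_key: Sequence[str], candidate_rules: Sequence[Rule]
) -> Tuple[List[Row], List[str], List[str]]:
    """Apply candidate rules one at a time until the checks pass or rules run out.

    Returns the resulting rows, the names of the rules applied, and any
    issues that remain.
    """
    applied = []
    for rule in candidate_rules:
        if not find_issues(rows, unique_key):
            break
        rows = rule(rows)
        applied.append(rule.__name__)
    return rows, applied, find_issues(rows, unique_key)
```

The list of applied rules records which adjustments were needed, which is the information that would be folded back into the processing rules before the script is regenerated.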
The invention also provides an implementation system for constructing a unified ETL, comprising an analysis and arrangement module and a generation module, wherein
the analysis and arrangement module analyzes the internal logical structure of the data from a data source,
sorts out the business rules according to the business objective and the processing rules of the data processing, and analyzes and summarizes data processing algorithms according to the business rules and the data source structure,
designs a data mapping, wherein the basic metadata of the data processing source table and target table involved in initializing the basic mapping is referenced and a corresponding data processing algorithm is selected,
and develops data rules, wherein the business rules are converted into executable code expressions according to the data mapping content and the processing rules,
the generation module generates an ETL script to be executed using a script generator according to the data mapping content and the data processing algorithm,
and tests the script and adjusts the processing rules to complete the ETL construction.
The information interaction, execution process and other contents between the modules in the system are based on the same concept as the method embodiment of the present invention, and specific contents can be referred to the description in the method embodiment of the present invention, and are not described herein again.
Similarly, the system of the invention can uniformly design and realize ETL data processing in the construction and implementation of the OLAP system, thereby saving the manpower development cost, improving the development quality, unifying the development standard and ensuring that the data processing process is more standard, efficient and uniform.
It should be noted that not all steps and modules in the processes and system structures in the preferred embodiments are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure, that is, some modules may be implemented by the same physical entity, or some modules may be implemented by a plurality of physical entities, or some components in a plurality of independent devices may be implemented together.
The above embodiments are merely preferred embodiments given to illustrate the invention fully, and the scope of protection of the invention is not limited to them. Equivalent substitutions or changes made by those skilled in the art on the basis of the invention all fall within the scope of protection of the invention, which is defined by the claims.
Claims (10)
1. An implementation method for constructing a unified ETL, characterized by comprising: analyzing the internal logical structure of the data from a data source,
sorting out the business rules according to the business objective and the processing rules of the data processing, and analyzing and summarizing data processing algorithms according to the business rules and the data source structure,
designing a data mapping, wherein the basic metadata of the data processing source table and target table involved in initializing the basic mapping is referenced and a corresponding data processing algorithm is selected,
developing data rules, wherein the business rules are converted into executable code expressions according to the data mapping content and the processing rules,
generating an ETL script to be executed using a script generator according to the data mapping content and the data processing algorithm,
and testing the script and adjusting the processing rules to complete the ETL construction.
2. The implementation method for constructing a unified ETL as claimed in claim 1, wherein before the internal logical structure of the data from the data source is analyzed, the method comprises:
crawling the data source, wherein the crawl collects only the list page data of a web page, or the list page and page-turning data, or the list page, page-turning and detail page data.
3. The implementation method for constructing a unified ETL as claimed in claim 2, wherein analyzing the internal logical structure of the data from the data source comprises:
judging whether the logical structures of the list page data and the page-turning data are consistent,
judging whether duplicates exist after the data are merged,
and sorting out and analyzing the association between list pages and detail pages, or between page-turning pages and detail pages.
4. The implementation method for constructing a unified ETL as claimed in claim 1, wherein sorting out the business rules comprises:
determining the unique business key of the data according to the internal logical structure of the source data and the business uniqueness requirements,
completing the integration mechanism for a unified view of the data,
and ensuring the logical association among all business data.
5. The implementation method for constructing a unified ETL as claimed in claim 1, wherein the processing rules are adjusted according to the result of testing the script, an updated ETL script is generated according to the adjusted processing rules, and the ETL is iteratively optimized.
6. An implementation system for constructing a unified ETL, characterized by comprising an analysis and arrangement module and a generation module, wherein
the analysis and arrangement module analyzes the internal logical structure of the data from a data source,
sorts out the business rules according to the business objective and the processing rules of the data processing, and analyzes and summarizes data processing algorithms according to the business rules and the data source structure,
designs a data mapping, wherein the basic metadata of the data processing source table and target table involved in initializing the basic mapping is referenced and a corresponding data processing algorithm is selected,
and develops data rules, wherein the business rules are converted into executable code expressions according to the data mapping content and the processing rules,
the generation module generates an ETL script to be executed using a script generator according to the data mapping content and the data processing algorithm,
and tests the script and adjusts the processing rules to complete the ETL construction.
7. The implementation system for constructing a unified ETL as claimed in claim 6, further comprising a crawling module that crawls the data source before the internal logical structure of the data from the data source is analyzed, wherein the crawl collects only the list page data of a web page, or the list page and page-turning data, or the list page, page-turning and detail page data.
8. The implementation system for constructing a unified ETL as claimed in claim 7, wherein the analysis and arrangement module analyzing the internal logical structure of the data from the data source comprises:
judging whether the logical structures of the list page data and the page-turning data are consistent,
judging whether duplicates exist after the data are merged,
and sorting out and analyzing the association between list pages and detail pages, or between page-turning pages and detail pages.
9. The implementation system for constructing a unified ETL as claimed in claim 6, wherein the analysis and arrangement module sorting out the business rules comprises:
determining the unique business key of the data according to the internal logical structure of the source data and the business uniqueness requirements,
completing the integration mechanism for a unified view of the data,
and ensuring the logical association among all business data.
10. The implementation system for constructing a unified ETL as claimed in claim 6, wherein the generation module adjusts the processing rules according to the result of testing the script, generates an updated ETL script according to the adjusted processing rules, and iteratively optimizes the ETL.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111147981.9A CN113934786B (en) | 2021-09-29 | 2021-09-29 | Implementation method for constructing unified ETL |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111147981.9A CN113934786B (en) | 2021-09-29 | 2021-09-29 | Implementation method for constructing unified ETL |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113934786A | 2022-01-14 |
CN113934786B | 2023-09-08 |
Family
ID=79277360
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202111147981.9A | Active | CN113934786B (en) | 2021-09-29 | 2021-09-29 | Implementation method for constructing unified ETL |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113934786B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115080653A (en) * | 2022-08-23 | 2022-09-20 | 北京华御数观科技有限公司 | General model for data processing |
CN115858622A (en) * | 2022-12-12 | 2023-03-28 | 浙江大学 | Automatic generation method of business data checking script |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120030172A1 (en) * | 2010-07-27 | 2012-02-02 | Oracle International Corporation | Mysql database heterogeneous log based replication |
CN102915303A (en) * | 2011-08-01 | 2013-02-06 | 阿里巴巴集团控股有限公司 | Method and device for ETL (extract-transform-load) tests |
CN104915341A (en) * | 2014-03-10 | 2015-09-16 | 中国科学院沈阳自动化研究所 | Visual multi-database ETL integration method and system |
US9158827B1 (en) * | 2012-02-10 | 2015-10-13 | Analytix Data Services, L.L.C. | Enterprise grade metadata and data mapping management application |
CN105359141A (en) * | 2013-05-17 | 2016-02-24 | 甲骨文国际公司 | Supporting combination of flow based ETL and entity relationship based ETL |
CN107038177A (en) * | 2016-02-03 | 2017-08-11 | 维布络有限公司 | The method and apparatus for automatically generating extraction-conversion-loading code |
CN108959564A (en) * | 2018-07-04 | 2018-12-07 | 玖富金科控股集团有限责任公司 | Data warehouse metadata management method, readable storage medium storing program for executing and computer equipment |
CN109669983A (en) * | 2018-12-27 | 2019-04-23 | 杭州火树科技有限公司 | Visualize multi-data source ETL tool |
CN110019551A (en) * | 2017-12-19 | 2019-07-16 | 阿里巴巴集团控股有限公司 | A kind of Building Method of Data Warehouse and device |
CN111159266A (en) * | 2019-12-05 | 2020-05-15 | 江苏艾佳家居用品有限公司 | ETL task batch generation method based on metadata |
CN111324647A (en) * | 2020-01-21 | 2020-06-23 | 北京东方金信科技有限公司 | Method and device for generating ETL code |
CN111930819A (en) * | 2020-08-14 | 2020-11-13 | 工银科技有限公司 | ETL script generation method and device |
CN112817971A (en) * | 2021-01-21 | 2021-05-18 | 于克干 | Data processing method and system based on two-dimensional mapping table |
CN113051263A (en) * | 2019-12-26 | 2021-06-29 | 上海科技发展有限公司 | Metadata-based big data platform construction method, system, equipment and medium |
Non-Patent Citations (2)
Title |
---|
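Chang Cheng et al.: "ETL-based financial data integration process model", Computer Engineering and Design (《计算机工程与设计》), no. 09 * |
Gong Sha et al.: "Design and implementation of a Python-based configurable automated crawler system", 《电脑迷》, no. 10 * |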
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115080653A (en) * | 2022-08-23 | 2022-09-20 | 北京华御数观科技有限公司 | General model for data processing |
CN115858622A (en) * | 2022-12-12 | 2023-03-28 | 浙江大学 | Automatic generation method of business data checking script |
CN115858622B (en) * | 2022-12-12 | 2023-08-04 | 浙江大学 | Automatic generation method of business data checking script |
Also Published As
Publication number | Publication date |
---|---|
CN113934786B (en) | 2023-09-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||