CN114185865A - Large-scale base station data storage and analysis method and system based on distributed storage - Google Patents
Large-scale base station data storage and analysis method and system based on distributed storage
- Publication number
- CN114185865A
- Authority
- CN
- China
- Prior art keywords
- base station
- station data
- distributed storage
- storage system
- calling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/178—Techniques for file synchronisation in file systems
- G06F16/1794—Details of file format conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a large-scale base station data storage and analysis method based on distributed storage. The method comprises three steps. Base station data storage: streaming base station data is stored in a distributed storage system, organized by base station and time. Base station data conversion: the small base station data files in the distributed storage system are merged, the C++ dynamic library of RTKLIB is called through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and the RINEX files are uploaded to the HDFS distributed storage system and the MinIO shared storage system for storage. Base station data analysis: messages are received from the message queue, the message content is parsed, a shell analysis script is called, the RINEX file is downloaded, GNSS data preprocessing software is called to analyze the RINEX file, and the analysis result is output and stored in a database. The invention addresses the poor scalability of base station data storage and the low efficiency of data conversion.
Description
Technical Field
The invention relates to the field of base station data processing, in particular to a large-scale base station data storage and analysis method and system based on distributed storage.
Background
In today's information era, data volumes in every industry are growing explosively. Big data is a new data processing model that offers stronger decision-making power, insight discovery and process optimization capability in order to handle massive, fast-growing and diversified information assets. The strategic significance of big data lies not only in managing massive data but also in processing meaningful data professionally, mining the value of data and driving the growth of data-driven enterprises. The mainstream big data technology on the market today is a family of technologies built on the Hadoop ecosystem, covering acquisition, storage, computation, query, resource management and related functions. Several of the more important technical frameworks are described below:
HDFS (Hadoop Distributed File System) is the basis of data storage management in the Hadoop ecosystem. It is a highly fault-tolerant system that detects and copes with hardware faults and is designed to run on low-cost general-purpose hardware. YARN is Hadoop's resource management system; it schedules resources so that multiple jobs from multiple applications can run simultaneously.
Spark and Flink are two general-purpose distributed computing engines, and both support stream processing and batch processing. The difference is that Spark focuses on batch processing and treats stream processing as a special case of batch processing, whereas Flink is built on the idea of unified stream and batch processing and can therefore perform true stream processing.
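The contrast between the two processing styles can be illustrated with a minimal sketch. It uses plain Python only and does not call any Spark or Flink API; the function names `micro_batches` and `per_event` are hypothetical and chosen for illustration.

```python
from typing import Iterable, Iterator, List


def micro_batches(events: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group events into fixed-size batches (stream treated as small batches)."""
    batch: List[int] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly short, batch
        yield batch


def per_event(events: Iterable[int]) -> Iterator[int]:
    """Emit one result per incoming event (record-at-a-time processing)."""
    for event in events:
        yield event * 2


# Batch style: a result appears only once a whole batch has been collected.
batch_results = [sum(b) for b in micro_batches(range(1, 8), batch_size=3)]
# Stream style: a result appears as soon as each event arrives.
stream_results = list(per_event(range(1, 4)))
```

In the batch style, latency is bounded below by the time needed to fill a batch; in the record-at-a-time style, each event is handled on arrival.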
Big data is applied more and more widely to maps, typically in high-precision maps and high-precision positioning. Compared with an ordinary electronic map, a high-precision map has higher precision and more data dimensions. It can provide beyond-line-of-sight road conditions, lane lines, traffic signs and environmental information for vehicle perception, help an autonomous vehicle make lane-level planning decisions, achieve high-precision local positioning through map matching, and reduce the dependence of autonomous driving on expensive sensors. For high-precision positioning, the autonomous driving industry adopts a variety of methods according to the scenario and the required positioning performance. In most Internet of Vehicles application scenarios, accurate positioning usually requires the fusion of multiple technologies, including GNSS (Global Navigation Satellite System), radio (e.g., cellular network, local area network), RTK (Real-Time Kinematic), the Inertial Measurement Unit (IMU) and other sensors. Among these, the most widely used approach is the fusion of GNSS/RTK and IMU. GNSS/RTK positioning is highly precise, reaching centimeter-level accuracy in dynamic measurement with no accumulated error, but its position update frequency is low and its signals are easily blocked. The IMU has a high update frequency but suffers from accumulated error. Combining the two makes their advantages complementary: the IMU accumulates displacement and heading changes within each GNSS/RTK positioning interval, while the user receiver, during GNSS observation, receives the corrections sent by a reference station and corrects its positioning result. This improves positioning precision and achieves real-time positioning with low latency, high precision and high frequency.
High-precision maps combined with high-precision positioning address difficulties at both the perception level and the application level. At the perception level, once autonomous driving reaches L3 or above, the requirements for precision and stability become increasingly strict. In extreme scenarios such as severe weather, the perception of lidar and vision sensors may be affected. A high-precision map contains both real-time road condition information and the original 3D model, so it can compensate for sensors that perform poorly in rain, snow and fog during the perception stage, and it can correct geographic data during the interactive decision stage to improve accuracy. This greatly reduces the number of on-board sensors and lowers the cost of the whole vehicle.
At present, however, most base station data is stored using traditional database technology or plain hard disks. This approach scales poorly at the storage level and offers weak high availability and data security. The conversion of raw base station data currently relies on a basic C++ dynamic library with no encapsulation or distributed management, so execution efficiency and reliability are low. Many software packages on the market can analyze data quality as part of base station data integrity monitoring, but they are manually executed applications with a low degree of automation and are not suitable for a large-scale high-precision positioning platform.
Disclosure of Invention
Purpose of the invention: to solve the problems of poor scalability of base station data storage and low data conversion efficiency, the invention provides a large-scale base station data storage and analysis method and system based on distributed storage.
Technical scheme: a large-scale base station data storage and analysis method based on distributed storage comprises the following steps:
(1) base station data storage: streaming base station data is stored in a distributed storage system, organized by base station and time;
(2) base station data conversion: the small base station data files in the distributed storage system are merged, the C++ dynamic library of RTKLIB is called through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and the RINEX files are uploaded to the HDFS distributed storage system and the MinIO shared storage system for storage;
(3) base station data analysis: messages are received from the message queue, the message content is parsed, a shell analysis script is called, the RINEX file is downloaded, and GNSS data preprocessing software is called to analyze the RINEX file, yielding the availability of the base station data, the cycle slip ratio of the base station data and the multipath value at each frequency of the base station; the analysis results are stored in a database.
The streaming base station data is processed with the Flink stream computing engine.
The distributed storage system is the HDFS distributed storage system.
The message queue is the Kafka distributed publish-subscribe messaging system.
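Storing data "according to a base station and time" in step (1) can be pictured as a directory layout keyed by station and hour. The following is a minimal sketch under stated assumptions: the root directory `/basestation`, the hourly granularity and the function name `storage_path` are all hypothetical, since the source does not specify the actual layout.

```python
from datetime import datetime, timezone


def storage_path(station_id: str, ts: datetime, root: str = "/basestation") -> str:
    """Return an hourly directory path for one station (hypothetical layout)."""
    # Normalize to UTC so that all stations share one time axis.
    ts = ts.astimezone(timezone.utc)
    return f"{root}/{station_id}/{ts:%Y/%m/%d/%H}"


example = storage_path("STN001", datetime(2021, 12, 13, 8, 30, tzinfo=timezone.utc))
```

A layout of this kind keeps all files of one station and one time window together, which makes the later merge of small files a simple directory scan.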
A large-scale base station data storage and analysis system based on distributed storage comprises:
(1) a base station data storage module, which stores streaming base station data in a distributed storage system, organized by base station and time;
(2) a base station data conversion module, which merges the small base station data files in the distributed storage system, calls the C++ dynamic library of RTKLIB through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and uploads the RINEX files to the HDFS distributed storage system and the MinIO shared storage system for storage;
(3) a base station data analysis module, which receives messages from the message queue, parses the message content, calls a shell analysis script, downloads the RINEX file, and calls GNSS data preprocessing software to analyze the RINEX file, yielding the availability of the base station data, the cycle slip ratio of the base station data and the multipath value at each frequency of the base station, and stores the analysis results in a database.
The streaming base station data is processed with the Flink stream computing engine.
The distributed storage system is the HDFS distributed storage system.
The message queue is the Kafka distributed publish-subscribe messaging system.
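The small-file merge performed by the conversion module can be sketched as follows. This is an in-memory illustration only, assuming each small file is identified by a station identifier and a sequence number; the tuple layout and the name `merge_small_files` are hypothetical, and a real deployment would read from and write to the distributed storage system instead.

```python
from typing import Dict, List, Tuple


def merge_small_files(chunks: List[Tuple[str, int, bytes]]) -> Dict[str, bytes]:
    """Merge (station_id, sequence_number, payload) chunks per station, in order."""
    merged: Dict[str, bytearray] = {}
    # Sort by station, then by sequence number, so payloads are appended in order.
    for station_id, _, payload in sorted(chunks, key=lambda c: (c[0], c[1])):
        merged.setdefault(station_id, bytearray()).extend(payload)
    return {station: bytes(data) for station, data in merged.items()}


chunks = [("STN001", 2, b"cc"), ("STN002", 1, b"xx"), ("STN001", 1, b"aa")]
merged = merge_small_files(chunks)
```

Merging before conversion means the converter is invoked once per station and time window rather than once per small file.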
Advantages: compared with the prior art, the invention has the following notable advantages:
1. Poor scalability of base station data storage: base station data is stored mainly in the HDFS distributed storage system, which is based on the Hadoop ecosystem, supports horizontal scaling and has a replica mechanism. It provides high availability for the data while also supporting data security and encryption.
2. Conversion efficiency of base station data: the underlying C++ RTKLIB dynamic library is called through Java JNI, then further encapsulated and placed under distributed management in an upper-layer Java program. In addition, a Kafka message queue together with distributed Java conversion programs performs the calls in parallel, so that data from multiple base stations is converted at the same time and conversion efficiency is greatly improved.
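The idea of converting several base stations at once can be sketched with a local thread pool. Note the substitution: this sketch replaces the Kafka message queue and distributed Java conversion programs with Python's `concurrent.futures`, and `convert_station` is a hypothetical placeholder that does not call RTKLIB.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def convert_station(station_id: str) -> str:
    """Placeholder for one RTCM-to-RINEX conversion; returns a hypothetical output name."""
    return f"{station_id}.rnx"


def convert_all(station_ids: List[str], workers: int = 4) -> Dict[str, str]:
    """Run one conversion per station concurrently and collect the outputs."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so outputs line up with station_ids.
        outputs = list(pool.map(convert_station, station_ids))
    return dict(zip(station_ids, outputs))


results = convert_all(["STN001", "STN002", "STN003"])
```

In the described system, each worker would instead be a separate consumer process picking conversion tasks off the message queue, which allows scaling beyond a single machine.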
3. Poor automation of base station data analysis: the Linux version of the TEQC executable is wrapped in a shell script, and the analysis script is invoked through the Kafka message queue and a distributed Java program, so that the data analysis flow is fully automated.
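The step from a queue message to a TEQC invocation can be sketched as below. The JSON message format and the field name `rinex_path` are assumptions made for illustration, as the source does not describe the message content; `+qc` is TEQC's quality-check option. The sketch only builds the argument list and does not execute TEQC.

```python
import json
from typing import List


def build_qc_command(message: str, teqc_binary: str = "teqc") -> List[str]:
    """Parse a JSON message and return the argument list for a TEQC quality check."""
    payload = json.loads(message)
    rinex_path = payload["rinex_path"]  # hypothetical field name
    return [teqc_binary, "+qc", rinex_path]


command = build_qc_command('{"station": "STN001", "rinex_path": "/tmp/STN001.21o"}')
```

On a machine with TEQC installed, the resulting list could be passed to `subprocess.run(command, check=True)`; in the described system this call is made by the shell analysis script.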
Drawings
Fig. 1 is a flowchart of a large-scale base station data storage and analysis method based on distributed storage.
Detailed Description
The technical scheme of the invention is further explained by combining the attached drawings.
Example 1:
As shown in Fig. 1, a large-scale base station data storage and analysis method based on distributed storage includes the following steps:
(1) base station data storage: streaming base station data is stored in a distributed storage system, organized by base station and time;
(2) base station data conversion: the small base station data files in the distributed storage system are merged, the C++ dynamic library of RTKLIB is called through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and the RINEX files are uploaded to the HDFS distributed storage system and the MinIO shared storage system for storage;
(3) base station data analysis: messages are received from the message queue, the message content is parsed, a shell analysis script is called, the RINEX file is downloaded, and GNSS data preprocessing software is called to analyze the RINEX file, yielding the availability of the base station data, the cycle slip ratio of the base station data and the multipath value at each frequency of the base station; the analysis results are stored in a database.
The streaming base station data is processed with the Flink stream computing engine.
The distributed storage system is the HDFS distributed storage system.
The message queue is the Kafka distributed publish-subscribe messaging system.
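Two of the analysis outputs named in step (3) can be illustrated with a small calculation. The source does not define these quantities, so the definitions here are assumptions: availability is taken as received observations divided by expected observations, and the cycle slip ratio as observations per detected slip. The names `QualityMetrics` and `quality_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QualityMetrics:
    availability: float      # fraction of expected observations actually received
    cycle_slip_ratio: float  # observations per detected cycle slip (assumed definition)


def quality_metrics(expected: int, received: int, slips: int) -> QualityMetrics:
    """Compute the assumed availability and cycle slip ratio for one RINEX file."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    availability = received / expected
    # With no slips the ratio is unbounded; report infinity rather than divide by zero.
    ratio = received / slips if slips > 0 else float("inf")
    return QualityMetrics(availability, ratio)


m = quality_metrics(expected=2880, received=2736, slips=4)
```

With these example numbers, availability is 2736 / 2880 and the ratio is 2736 / 4 observations per slip; a record of this shape is what would be written to the database.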
Example 2:
A large-scale base station data storage and analysis system based on distributed storage comprises:
(1) a base station data storage module, which stores streaming base station data in a distributed storage system, organized by base station and time;
(2) a base station data conversion module, which merges the small base station data files in the distributed storage system, calls the C++ dynamic library of RTKLIB through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and uploads the RINEX files to the HDFS distributed storage system and the MinIO shared storage system for storage;
(3) a base station data analysis module, which receives messages from the message queue, parses the message content, calls a shell analysis script, downloads the RINEX file, and calls GNSS data preprocessing software to analyze the RINEX file, yielding the availability of the base station data, the cycle slip ratio of the base station data and the multipath value at each frequency of the base station, and stores the analysis results in a database.
The streaming base station data is processed with the Flink stream computing engine.
The distributed storage system is the HDFS distributed storage system.
The message queue is the Kafka distributed publish-subscribe messaging system.
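How the conversion and analysis modules are decoupled by a message queue can be sketched with an in-process queue. This is a stand-in only: Python's `queue.Queue` replaces Kafka, the message is simply a file name, and both module functions are hypothetical placeholders that perform no real conversion or analysis.

```python
import queue
from typing import Dict, List


def conversion_module(station_ids: List[str], topic: "queue.Queue[str]") -> None:
    """Pretend to convert each station's file, then announce the RINEX file."""
    for station_id in station_ids:
        topic.put(f"{station_id}.rnx")  # hypothetical file name used as the message


def analysis_module(topic: "queue.Queue[str]") -> Dict[str, str]:
    """Drain the queue and record a placeholder analysis status per file."""
    results: Dict[str, str] = {}
    while not topic.empty():
        rinex_file = topic.get()
        results[rinex_file] = "analyzed"
    return results


topic: "queue.Queue[str]" = queue.Queue()
conversion_module(["STN001", "STN002"], topic)
results = analysis_module(topic)
```

Because the two modules share only the queue, either side can be scaled or restarted independently, which is the property the system relies on Kafka to provide.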
Claims (8)
1. A large-scale base station data storage and analysis method based on distributed storage, characterized by comprising the following steps:
(1) base station data storage: streaming base station data is stored in a distributed storage system, organized by base station and time;
(2) base station data conversion: the small base station data files in the distributed storage system are merged, the C++ dynamic library of RTKLIB is called through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and the RINEX files are uploaded to the HDFS distributed storage system and the MinIO shared storage system for storage;
(3) base station data analysis: messages are received from the message queue, the message content is parsed, a shell analysis script is called, the RINEX file is downloaded, and GNSS data preprocessing software is called to analyze the RINEX file, yielding the availability of the base station data, the cycle slip ratio of the base station data and the multipath value at each frequency of the base station; the analysis results are stored in a database.
2. The large-scale base station data storage and analysis method based on distributed storage according to claim 1, wherein the streaming base station data is processed with the Flink stream computing engine.
3. The large-scale base station data storage and analysis method based on distributed storage according to claim 1, wherein the distributed storage system is the HDFS distributed storage system.
4. The large-scale base station data storage and analysis method based on distributed storage according to claim 1, wherein the message queue is the Kafka distributed publish-subscribe messaging system.
5. A large-scale base station data storage and analysis system based on distributed storage, characterized by comprising:
(1) a base station data storage module, which stores streaming base station data in a distributed storage system, organized by base station and time;
(2) a base station data conversion module, which merges the small base station data files in the distributed storage system, calls the C++ dynamic library of RTKLIB through the Java Native Interface (JNI) to convert RTCM format files into RINEX format files, and uploads the RINEX files to the HDFS distributed storage system and the MinIO shared storage system for storage;
(3) a base station data analysis module, which receives messages from the message queue, parses the message content, calls a shell analysis script, downloads the RINEX file, and calls GNSS data preprocessing software to analyze the RINEX file, yielding the availability of the base station data, the cycle slip ratio of the base station data and the multipath value at each frequency of the base station, and stores the analysis results in a database.
6. The large-scale base station data storage and analysis system based on distributed storage according to claim 5, wherein the streaming base station data is processed with the Flink stream computing engine.
7. The large-scale base station data storage and analysis system based on distributed storage according to claim 5, wherein the distributed storage system is the HDFS distributed storage system.
8. The large-scale base station data storage and analysis system based on distributed storage according to claim 5, wherein the message queue is the Kafka distributed publish-subscribe messaging system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111515352.7A CN114185865A (en) | 2021-12-13 | 2021-12-13 | Large-scale base station data storage and analysis method and system based on distributed storage |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111515352.7A CN114185865A (en) | 2021-12-13 | 2021-12-13 | Large-scale base station data storage and analysis method and system based on distributed storage |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114185865A true CN114185865A (en) | 2022-03-15 |
Family
ID=80543372
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111515352.7A Pending CN114185865A (en) | 2021-12-13 | 2021-12-13 | Large-scale base station data storage and analysis method and system based on distributed storage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114185865A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118260254A (en) * | 2024-05-30 | 2024-06-28 | 武汉大学 | GNSS water vapor chromatography input data method, device, equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105243160A (en) * | 2015-10-28 | 2016-01-13 | 西安美林数据技术股份有限公司 | Mass data-based distributed video processing system |
CN107544077A (en) * | 2017-08-31 | 2018-01-05 | 千寻位置网络(浙江)有限公司 | A kind of GNSS data quality testing analysis system and its analysis method |
CN108318905A (en) * | 2018-01-05 | 2018-07-24 | 北京北方联星科技有限公司 | The method of sub_meter position and sub- rice positioning intelligent mobile phone are realized on smart mobile phone |
US20180248771A1 (en) * | 2017-02-24 | 2018-08-30 | Ciena Corporation | Monitoring and auto-correction systems and methods for microservices |
CN109710731A (en) * | 2018-11-19 | 2019-05-03 | 北京计算机技术及应用研究所 | A kind of multidirectional processing system of data flow based on Flink |
CN111386477A (en) * | 2018-12-28 | 2020-07-07 | 深圳市大疆创新科技有限公司 | Observation data conversion method, equipment, movable platform and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20220315 |