CN110287172B - Method for formatting HBase data - Google Patents
- Publication number
- CN110287172B (application CN201910588013.8A)
- Authority
- CN
- China
- Prior art keywords
- hbase
- cluster
- zookeeper
- root node
- hadoop
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for formatting HBase data. It belongs to the field of data formatting and addresses the cumbersome, time-consuming procedure that formatting HBase data requires in the prior art. In the method, all services of the HBase cluster are stopped while the ZooKeeper and Hadoop services on which the cluster depends are kept running normally. The root node on ZooKeeper that stores the HBase metadata is then deleted together with all of its child nodes, and the root directory on Hadoop that stores the HBase data is deleted together with all of its subdirectories. After the deletions, all services of the HBase cluster are started, which yields an HBase in its initial state. The method is used to format HBase data quickly.
Description
Technical Field
The invention relates to a method for formatting HBase data. It is used to format HBase data quickly and belongs to the field of data formatting.
Background
Data formatting refers to deleting all data and metadata in the system, and restoring the system to an initial state. When the data in the system is no longer useful or the system state is abnormal, the system can be quickly restored to a clean and usable state by performing data formatting.
ZooKeeper: ZooKeeper is an open-source coordination service for distributed applications and an open-source implementation of Google's Chubby. It is an important component on which Hadoop and HBase depend, and it is currently a top-level open-source project of the Apache community. It provides consistency services for distributed applications, including configuration maintenance, naming service, distributed synchronization, and group services.
Hadoop: Hadoop comprises the distributed file system HDFS and the distributed computing framework MapReduce, and is currently a top-level project of the Apache community. Hadoop is highly fault-tolerant and is designed to be deployed on inexpensive hardware. It provides high-throughput access to application data and is suited to applications with very large data sets.
HBase: HBase is a very popular distributed, column-oriented NoSQL database and a top-level open-source project of the Apache community. Its main application scenarios are mass data storage and retrieval by fixed conditions under high concurrency. In a development or test environment, when the data in HBase is no longer useful or the state of HBase is abnormal, formatting the HBase data quickly yields an HBase in its initial state, that is, an HBase without any data. HBase depends on ZooKeeper and Hadoop at run time: its metadata is stored on ZooKeeper and its data is stored on Hadoop. HBase itself does not provide a method or tool for formatting. No published patent on formatting HBase was found, nor was a detailed description of a method similar to the one described here found on the internet. An obvious alternative that achieves the same purpose is to uninstall the original HBase cluster, which means deleting all of its data, metadata, software packages, configuration files and so on, and then to build a brand-new HBase cluster, which means reinstalling the software packages and configuration files on every node of the cluster. That approach, however, is cumbersome and time-consuming.
Disclosure of Invention
In view of the problems identified above, the invention aims to provide a method for formatting HBase data that avoids the cumbersome and time-consuming prior-art approach of uninstalling the original HBase cluster and rebuilding a brand-new one in order to format the HBase data.
To achieve this purpose, the invention adopts the following technical solution:
A method for formatting HBase data, comprising the following steps:
S1. Stop all services of the HBase cluster while keeping the ZooKeeper and Hadoop services on which the HBase cluster depends running normally.
S2. After step S1 has been executed, delete the root node on ZooKeeper that stores the HBase metadata of the HBase cluster, together with all child nodes under it, and delete the root directory on Hadoop that stores the HBase data of the HBase cluster, together with all subdirectories under it. The two deletions may be performed in any of three orders: the ZooKeeper root node first and the Hadoop root directory second; the Hadoop root directory first and the ZooKeeper root node second; or both at the same time.
S3. After the deletions are complete, start all services of the HBase cluster to obtain an HBase in its initial state.
Further, in step S2:
The root node on ZooKeeper that stores the HBase metadata, together with all child nodes under it, is deleted as follows: the root node is looked up in the zookeeper.znode.parent tag of the HBase cluster configuration file hbase-site.xml, and once it has been found, the root node and all child nodes under it are deleted on ZooKeeper.
The root directory on Hadoop that stores the HBase data, together with all subdirectories under it, is deleted as follows: the root directory is looked up in the hbase.rootdir tag of the HBase cluster configuration file hbase-site.xml, and once it has been found, the root directory and all subdirectories under it are deleted on Hadoop.
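The lookup in hbase-site.xml can be sketched as follows. This is a minimal illustration rather than the patented program: it uses Python's standard-library XML parser instead of DOM4J, it assumes the ZooKeeper root node is configured under the standard HBase property `zookeeper.znode.parent` (falling back to `/hbase` when the property is absent), and the sample file contents, host name, and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: when zookeeper.znode.parent is not set, HBase uses /hbase.
DEFAULT_ZNODE_PARENT = "/hbase"

# Hypothetical hbase-site.xml contents used only for illustration.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode:8020/hbase</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase</value>
  </property>
</configuration>
"""


def read_property(xml_text, name, default=None):
    """Return the <value> paired with the given <name>, or default if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


def locate_hbase_roots(xml_text):
    """Return (ZooKeeper root node, Hadoop root directory) for the cluster."""
    znode = read_property(xml_text, "zookeeper.znode.parent", DEFAULT_ZNODE_PARENT)
    rootdir = read_property(xml_text, "hbase.rootdir")
    if rootdir is None:
        # Without an explicit root directory there is nothing safe to delete.
        raise ValueError("hbase.rootdir is not set in hbase-site.xml")
    return znode, rootdir


if __name__ == "__main__":
    print(locate_hbase_roots(SAMPLE_HBASE_SITE))
```

Raising an error when `hbase.rootdir` is missing, instead of guessing a path, keeps a later recursive delete from running against an unintended location.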
Further, a processor receives a request to format the HBase data and stops all services of the HBase cluster while keeping the ZooKeeper and Hadoop services on which the HBase cluster depends running normally.
Then, in response to a query-and-delete instruction, the processor invokes a query-and-delete program held in memory and deletes both the root node on ZooKeeper that stores the HBase metadata, with all of its child nodes, and the root directory on Hadoop that stores the HBase data, with all of its subdirectories. The processor performs the two deletions in one of three orders: the ZooKeeper root node first and the Hadoop root directory second; the Hadoop root directory first and the ZooKeeper root node second; or both at the same time.
After the deletions are complete, the processor starts all services of the HBase cluster, which yields an HBase in its initial state.
Compared with the prior art, the invention has the beneficial effects that:
1. The method deletes all of the metadata that the HBase cluster stores on ZooKeeper and all of the data that it stores on Hadoop, and nothing else. This simplifies the implementation steps, reduces operational complexity, formats HBase data quickly, and gives the computer an optimal way of handling its internal objects.
Drawings
FIG. 1 is a flow chart of the variant of the invention in which all metadata stored on ZooKeeper is deleted first and the data stored on Hadoop is deleted afterwards.
Detailed Description
The invention will be further described with reference to the drawings and detailed description.
A method for formatting HBase data, comprising the following steps:
S1. Stop all services of the HBase cluster while keeping the ZooKeeper and Hadoop services on which the HBase cluster depends running normally.
S2. After step S1 has been executed, delete the root node on ZooKeeper that stores the HBase metadata of the HBase cluster, together with all child nodes under it, and delete the root directory on Hadoop that stores the HBase data of the HBase cluster, together with all subdirectories under it. The two deletions may be performed in any of three orders: the ZooKeeper root node first and the Hadoop root directory second; the Hadoop root directory first and the ZooKeeper root node second; or both at the same time.
The root node on ZooKeeper that stores the HBase metadata, together with all child nodes under it, is deleted as follows: the root node is looked up in the zookeeper.znode.parent tag of the HBase cluster configuration file hbase-site.xml, and once it has been found, the root node and all child nodes under it are deleted on ZooKeeper.
The root directory on Hadoop that stores the HBase data, together with all subdirectories under it, is deleted as follows: the root directory is looked up in the hbase.rootdir tag of the HBase cluster configuration file hbase-site.xml, and once it has been found, the root directory and all subdirectories under it are deleted on Hadoop.
The lookup and deletion can be performed manually or by a program. In the manual mode, an operator inspects the HBase cluster configuration file hbase-site.xml by eye, finds the relevant entries, and issues the deletion commands. In the automatic mode, a program that has received a lookup instruction searches hbase-site.xml and deletes according to what it finds.
To look up the root node on ZooKeeper that stores the HBase metadata, an XML parsing program is written (for example, one that calls a common XML parsing library such as DOM4J) that finds, in hbase-site.xml, the value of the <value></value> tag paired with the <name>zookeeper.znode.parent</name> tag; running the program performs the lookup.
To look up the root directory on Hadoop that stores the HBase data, an XML parsing program is written in the same way that finds, in hbase-site.xml, the value of the <value></value> tag paired with the <name>hbase.rootdir</name> tag; running the program performs the lookup.
To delete the node on ZooKeeper, the zkCli.sh script, the ZooKeeper Java API for deleting nodes, or an API in another language can be used. To delete the directory on Hadoop, the commands hdfs dfs -rm -r <directory> or hadoop fs -rm -r <directory>, both of which ship with Hadoop, can be used, as can the Hadoop Java API for deleting directories or an API in another language.
S3. After the deletions are complete, start all services of the HBase cluster to obtain an HBase in its initial state.
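Steps S1 to S3 can be expressed as a short command sequence. The sketch below only builds the commands and does not run them. It relies on several assumptions that are not in the patent text: the stock HBase scripts `stop-hbase.sh` and `start-hbase.sh` are used to stop and start the cluster, the ZooKeeper release is 3.5 or later so that `zkCli.sh` offers `deleteall` (older releases used `rmr`), and the installation path, server address, and function names are hypothetical.

```python
import shlex
from urllib.parse import urlparse


def zk_delete_command(zk_server, znode):
    """Build a zkCli.sh call that removes `znode` and everything under it."""
    if not znode.startswith("/") or znode.strip("/") == "":
        raise ValueError(f"refusing to delete znode {znode!r}")
    # Assumption: ZooKeeper 3.5+, where the recursive delete is `deleteall`;
    # older releases used `rmr` instead.
    return ["zkCli.sh", "-server", zk_server, "deleteall", znode]


def hdfs_delete_command(rootdir):
    """Build an `hdfs dfs -rm -r` call that removes the HBase root directory."""
    if urlparse(rootdir).path.strip("/") == "":
        raise ValueError(f"refusing to delete filesystem root {rootdir!r}")
    return ["hdfs", "dfs", "-rm", "-r", rootdir]


def format_commands(hbase_home, zk_server, znode, rootdir):
    """Return the S1, S2 and S3 commands in the order they would be run."""
    return [
        [f"{hbase_home}/bin/stop-hbase.sh"],   # S1: stop all HBase services
        zk_delete_command(zk_server, znode),   # S2: metadata on ZooKeeper
        hdfs_delete_command(rootdir),          # S2: data on Hadoop
        [f"{hbase_home}/bin/start-hbase.sh"],  # S3: start all HBase services
    ]


if __name__ == "__main__":
    # Hypothetical values; print the commands instead of executing them.
    for cmd in format_commands("/opt/hbase", "zk1:2181", "/hbase",
                               "hdfs://namenode:8020/hbase"):
        print(shlex.join(cmd))
```

The two guards reject a bare `/` target, because a recursive delete of the ZooKeeper or HDFS root would also destroy data that belongs to services other than HBase.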
The data flow that implements the formatting is as follows:
A processor receives a request to format the HBase data and stops all services of the HBase cluster while keeping the ZooKeeper and Hadoop services on which the HBase cluster depends running normally.
In response to a query-and-delete instruction, the processor invokes a query-and-delete program held in memory and deletes both the root node on ZooKeeper that stores the HBase metadata, with all of its child nodes, and the root directory on Hadoop that stores the HBase data, with all of its subdirectories. The processor performs the two deletions in one of three orders: the ZooKeeper root node first and the Hadoop root directory second; the Hadoop root directory first and the ZooKeeper root node second; or both at the same time.
After the deletions are complete, the processor starts all services of the HBase cluster, which yields an HBase in its initial state.
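The three permitted deletion orders can be illustrated with a small driver. This is a sketch under stated assumptions, not the claimed program: the four steps are passed in as plain callables, the function name `format_hbase` and the order labels are hypothetical, and the simultaneous variant is modelled with two worker threads.

```python
from concurrent.futures import ThreadPoolExecutor

ORDERS = ("zk_first", "hdfs_first", "concurrent")


def format_hbase(stop, delete_znode, delete_rootdir, start, order="zk_first"):
    """Run stop, the two deletions in the chosen order, then start."""
    if order not in ORDERS:
        # Validate before touching the cluster so a typo cannot stop HBase.
        raise ValueError(f"unknown order {order!r}")

    stop()  # stop all HBase services; ZooKeeper and Hadoop stay up

    if order == "zk_first":
        delete_znode()
        delete_rootdir()
    elif order == "hdfs_first":
        delete_rootdir()
        delete_znode()
    else:
        # Both deletions at the same time, one worker thread each.
        with ThreadPoolExecutor(max_workers=2) as pool:
            futures = [pool.submit(delete_znode), pool.submit(delete_rootdir)]
            for future in futures:
                future.result()  # re-raise any error from a worker

    # Reached only if both deletions succeeded; a failure propagates
    # to the caller and leaves HBase stopped.
    start()


if __name__ == "__main__":
    log = []
    format_hbase(
        stop=lambda: log.append("stop"),
        delete_znode=lambda: log.append("zk"),
        delete_rootdir=lambda: log.append("hdfs"),
        start=lambda: log.append("start"),
        order="hdfs_first",
    )
    print(log)
```

Leaving HBase stopped when a deletion fails is a design choice of this sketch: restarting on top of a half-deleted state would leave metadata and data out of step with each other.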
The above is merely a representative example of the many specific applications of the invention and should not be construed as limiting its scope of protection in any way. All technical solutions formed by transformation or equivalent substitution fall within the scope of protection of the invention.
Claims (2)
1. A method of formatting HBase data, comprising the steps of:
S1. stopping all services of an HBase cluster while keeping the ZooKeeper and Hadoop on which the HBase cluster depends in a normal running state;
S2. after step S1 has been executed, first deleting the root node on ZooKeeper that stores the HBase metadata of the HBase cluster and all child nodes under that root node, and then deleting the root directory on Hadoop that stores the HBase data of the HBase cluster and all subdirectories under that root directory; or first deleting the root directory on Hadoop that stores the HBase data of the HBase cluster and all subdirectories under that root directory, and then deleting the root node on ZooKeeper that stores the HBase metadata of the HBase cluster and all child nodes under that root node; or deleting the root node on ZooKeeper that stores the HBase metadata of the HBase cluster and all child nodes under that root node and, at the same time, deleting the root directory on Hadoop that stores the HBase data of the HBase cluster and all subdirectories under that root directory;
wherein the root node on ZooKeeper that stores the HBase metadata and all child nodes under that root node are deleted as follows: the root node is looked up in the zookeeper.znode.parent tag of the configuration file hbase-site.xml of the HBase cluster, and once found, the root node and all child nodes under it are deleted on ZooKeeper; and the root directory on Hadoop that stores the HBase data and all subdirectories under that root directory are deleted as follows: the root directory is looked up in the hbase.rootdir tag of the configuration file hbase-site.xml of the HBase cluster, and once found, the root directory and all subdirectories under it are deleted on Hadoop;
S3. after the deletions, starting all services of the HBase cluster to obtain an HBase in its initial state.
2. The method for formatting HBase data according to claim 1, wherein a processor receives a request to format the HBase data and stops all services of the HBase cluster while keeping the ZooKeeper and Hadoop on which the HBase cluster depends in a normal running state; then, in response to a query-and-delete instruction, the processor invokes a query-and-delete program in a memory and first deletes the root node on ZooKeeper that stores the HBase metadata of the HBase cluster and all child nodes under that root node, and then deletes the root directory on Hadoop that stores the HBase data of the HBase cluster and all subdirectories under that root directory; or the processor, in response to the query-and-delete instruction, invokes the query-and-delete program in the memory and first deletes the root directory on Hadoop that stores the HBase data of the HBase cluster and all subdirectories under that root directory, and then deletes the root node on ZooKeeper that stores the HBase metadata of the HBase cluster and all child nodes under that root node; or the processor, in response to the query-and-delete instruction, invokes the query-and-delete program in the memory and deletes the root node on ZooKeeper that stores the HBase metadata of the HBase cluster and all child nodes under that root node and, at the same time, deletes the root directory on Hadoop that stores the HBase data of the HBase cluster and all subdirectories under that root directory; after the deletions, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910588013.8A CN110287172B (en) | 2019-07-01 | 2019-07-01 | Method for formatting HBase data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910588013.8A CN110287172B (en) | 2019-07-01 | 2019-07-01 | Method for formatting HBase data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110287172A (en) | 2019-09-27 |
CN110287172B (en) | 2023-05-02 |
Family
ID=68021634
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910588013.8A (Active) CN110287172B (en) | 2019-07-01 | 2019-07-01 | Method for formatting HBase data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287172B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113591143A (en) * | 2021-07-07 | 2021-11-02 | 四川新网银行股份有限公司 | Control method for limiting client IP reading and writing HBase table |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105468735A (en) * | 2015-11-23 | 2016-04-06 | 武汉虹旭信息技术有限责任公司 | Stream preprocessing system and method based on mass information of mobile internet |
CN109271365A (en) * | 2018-09-19 | 2019-01-25 | 浪潮软件股份有限公司 | Method for accelerating reading and writing of HBase database based on Spark memory technology |
CN109299068A (en) * | 2018-08-31 | 2019-02-01 | 安徽四创电子股份有限公司 | From relevant database to the data flow migration method of HBase database |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9172608B2 (en) * | 2012-02-07 | 2015-10-27 | Cloudera, Inc. | Centralized configuration and monitoring of a distributed computing cluster |
US9842126B2 (en) * | 2012-04-20 | 2017-12-12 | Cloudera, Inc. | Automatic repair of corrupt HBases |
Non-Patent Citations (2)
Title |
---|
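Design and Implementation of a Hadoop-Based Cloud Platform; Qin Dongxia (秦东霞); Intelligent Computer and Applications (《智能计算机与应用》); 2016-08-28; full text * |
Application of a Hadoop-Based Big Data Platform at the Tianjin Earthquake Agency; Ding Jing (丁晶); Electronic Technology & Software Engineering (《电子技术与软件工程》); 2017-09-27; full text * |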
Also Published As
Publication number | Publication date |
---|---|
CN110287172A (en) | 2019-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11836151B2 (en) | Synchronizing symbolic links | |
CN104951474B (en) | Method and device for acquiring MySQL binlog incremental log | |
JP5961689B2 (en) | Incremental data extraction | |
JP2022095645A (en) | System and method for capture of change data from distributed data sources, for use with heterogeneous targets | |
US9715507B2 (en) | Techniques for reconciling metadata and data in a cloud storage system without service interruption | |
US8938430B2 (en) | Intelligent data archiving | |
US8442951B1 (en) | Processing archive content based on hierarchical classification levels | |
CN106933703B (en) | Database data backup method and device and electronic equipment | |
KR101127304B1 (en) | Hsm two-way orphan reconciliation for extremely large file systems | |
CN113986873B (en) | Method for processing, storing and sharing data modeling of mass Internet of things | |
CN103595797B (en) | Caching method for distributed storage system | |
US8874519B1 (en) | Method and apparatus for restoring a table in a database | |
US20140156603A1 (en) | Method and an apparatus for splitting and recovering data in a power system | |
US10747643B2 (en) | System for debugging a client synchronization service | |
JP2020057416A (en) | Method and device for processing data blocks in distributed database | |
US10606805B2 (en) | Object-level image query and retrieval | |
US9646016B2 (en) | Methods circuits apparatuses systems and associated computer executable code for data deduplication | |
CN110287172B (en) | Method for formatting HBase data | |
US20100293143A1 (en) | Initialization of database for synchronization | |
US11210212B2 (en) | Conflict resolution and garbage collection in distributed databases | |
CN114036226A (en) | Data synchronization method, device, equipment and storage medium | |
WO2011051098A1 (en) | Synchronizing database and non-database resources | |
US20150347402A1 (en) | System and method for enabling a client system to generate file system operations on a file system data set using a virtual namespace | |
CN111026764B (en) | Data storage method and device, electronic product and storage medium | |
US11055266B2 (en) | Efficient key data store entry traversal and result generation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||