CN102799617A - Construction and query optimization methods for multiple layers of Bloom Filters - Google Patents

Construction and query optimization methods for multiple layers of Bloom Filters

Info

Publication number
CN102799617A
Authority
CN
China
Prior art keywords
bloom filter
layer
data
bit position
individual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012102028163A
Other languages
Chinese (zh)
Other versions
CN102799617B (en)
Inventor
曹强
黄建忠
谢长生
荣益麟
慎涵
黄国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201210202816.3A
Publication of CN102799617A
Application granted
Publication of CN102799617B
Legal status: Active
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses construction and query optimization methods for multilayer Bloom Filters. During construction, the bits of related Bloom Filters in a conventional multilayer Bloom Filter are rearranged: bits at the same position in the Q Bloom Filters of the first layer, and bits at the same position in the Q lower-layer Bloom Filters that correspond to each upper-layer Bloom Filter, are placed in the same contiguous address space. During a query, the bits that a hash value maps to in the Q Bloom Filters of the same layer therefore lie in one contiguous address space, so the multilayer Bloom Filter can be queried by reading a small number of contiguous spaces. Without increasing storage space, the optimized multilayer Bloom Filter makes bit lookups simpler, greatly reduces the number of disk accesses, and effectively shortens query time.

Description

Construction and query optimization methods for multilayer Bloom Filters
Technical field
The present invention relates to the field of computer storage, and specifically to construction and query optimization methods for multilayer Bloom Filters.
Background
The Bloom Filter is a binary-vector data structure proposed in 1970 by Burton Howard Bloom that can quickly determine whether an element is present in a set. Compared with methods such as hash tables and trees, a Bloom Filter preserves the spatial locality of the data set being queried when it is stored. As the data set grows, it can be split into several smaller data sets of equal capacity, each with its own Bloom Filter. Because a queried item must be checked against each Bloom Filter in turn until it is found or the search ends, the query time over many Bloom Filters increases considerably. To speed up queries over massive data sets, multilayer Bloom Filters were introduced. When an upper-layer Bloom Filter determines that an element is absent, the lower-layer Bloom Filters that correspond to it need not be queried, which reduces the number of Bloom Filter queries.
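For orientation, a single Bloom Filter of the kind described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the class name, the use of salted SHA-256 as the hash family, and the parameter names are all assumptions.

```python
import hashlib


class BloomFilter:
    """Plain single Bloom Filter over a bit array (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Assumed hash family: SHA-256 salted with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        # Set every bit the item hashes to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(item))
```

A `False` answer is definitive, which is what lets an upper layer prune the lower-layer filters beneath it.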
Fig. 1 shows the organization of a three-layer Bloom Filter; the total number of bits contained in the Bloom Filters of each layer is equal. Each Bloom Filter of layer i (1 ≤ i < 3) corresponds to 2 Bloom Filters of layer i+1.
When a hash value is queried, each Bloom Filter of the first layer is checked first to see whether the bit corresponding to the hash value is 1. If it is 1, the query hits, and the lower-layer Bloom Filters corresponding to that Bloom Filter continue to be queried. In Fig. 1, the corresponding bits of both Bloom Filters of the first layer are 1, so all the second-layer Bloom Filters that correspond to these 2 first-layer Bloom Filters must be queried. For a Bloom Filter that does not hit, the hash value is not in its data set, and its lower-layer Bloom Filters need not be queried.
In the second layer, if the corresponding bit of a queried Bloom Filter is 1, the query hits, and the lower-layer Bloom Filters corresponding to that Bloom Filter continue to be queried. In Fig. 1, the 2nd Bloom Filter of the second layer hits, so the third-layer Bloom Filters corresponding to it are queried next. For a Bloom Filter that does not hit, its lower-layer Bloom Filters need not be queried; in Fig. 1, the third-layer Bloom Filters corresponding to the 1st, 3rd and 4th Bloom Filters of the second layer are skipped.
In the bottom layer, if the corresponding bit of a queried Bloom Filter is 1, the query hits, which means the hash value may exist in that Bloom Filter's data set, so the data set is fetched and searched. In Fig. 1, the 3rd Bloom Filter of the third layer hits, so its data set is fetched and searched for the hash value. For a bottom-layer Bloom Filter that does not hit, its data set need not be searched. In Fig. 1, no data set other than that of the 3rd Bloom Filter of the third layer needs to be searched.
A multilayer Bloom Filter thus directs a queried hash value to specific data sets, which significantly reduces the number of data-set lookups and the query overhead.
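The conventional top-down descent described above can be sketched as below. This is only an illustration under stated assumptions: each filter is modelled as a Python integer used as a bit vector, indices are 0-based, and `query_conventional`, `layers`, `positions` and `fanout` are hypothetical names.

```python
def query_conventional(layers, positions, fanout):
    """Return 0-based indices of bottom-layer filters that hit.

    layers[d][f] is an int used as the bit vector of filter f in layer d;
    filter f of one layer is the parent of filters f*fanout .. f*fanout+fanout-1
    in the next layer.
    """
    def hits(bitvec):
        # A filter hits only if every hashed position is set.
        return all((bitvec >> p) & 1 for p in positions)

    candidates = range(len(layers[0]))
    for depth, layer in enumerate(layers):
        matched = [f for f in candidates if hits(layer[f])]
        if depth == len(layers) - 1:
            return matched
        # Only the children of filters that hit are examined in the next layer.
        candidates = [f * fanout + c for f in matched for c in range(fanout)]
    return []
```

Every filter that is examined is probed separately here, which is the access pattern the invention sets out to avoid.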
However, for massive data sets the number of queries against the multilayer Bloom Filter can still be very large, and querying the Bloom Filters becomes a bottleneck. Moreover, once the Bloom Filters exceed memory capacity, queries generate a large number of disk accesses (I/O, Input/Output). This directly pushes element query time beyond an acceptable range.
Summary of the invention
The object of the present invention is to provide construction and query optimization methods for a multilayer Bloom Filter that speed up element queries.
A construction optimization method for a multilayer Bloom Filter, in which the numbers of Bloom Filters in the successive layers form a geometric sequence whose first term is the number of Bloom Filters in the first layer and whose first term and common ratio are both Q, satisfying $Q^N \times M \ge S$, where N ≥ 2 is the number of Bloom Filter layers, Q is an integer multiple of the disk sector length, S is the size of the full data set, and M is the number of data items in the data set of each Bloom Filter in layer N.
The bits at the same position in the Q Bloom Filters of the first layer are placed in the same contiguous address range of disk space. The bits at the same position in the Q Bloom Filters of layer j+1 that correspond to the m-th Bloom Filter of layer j are placed in the same contiguous address range of disk space. The number of bits in the m-th Bloom Filter of layer j equals the total number of bits in its Q corresponding Bloom Filters of layer j+1, for j = 1, ..., N-1.
The method is specifically:
(A1) Set i = 1.
(A2) Determine whether the current multilayer Bloom Filter already covers all the data in the full data set. If so, finish; otherwise go to step (A3).
(A3) Receive new data.
(A4) Determine whether the data set of the i-th Bloom Filter of layer N is full. If it is full, go to step (A5); otherwise go to step (A6).
(A5) Set i = i+1.
(A6) Hash the new data and perform the set operation on the layer-N Bloom Filter according to the hash values; when the set operation finishes, return to step (A2).
The set operation is carried out as follows:
Among all the contiguous spaces corresponding to the i-th Bloom Filter of layer N, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the i-th Bloom Filter.
The i-th Bloom Filter of layer N corresponds to the $\lceil i/Q \rceil$-th Bloom Filter of layer N-1. Among all the contiguous spaces corresponding to this $\lceil i/Q \rceil$-th Bloom Filter, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the $\lceil i/Q \rceil$-th Bloom Filter, where $\lceil \cdot \rceil$ denotes rounding up.
The $\lceil i/Q \rceil$-th Bloom Filter of layer N-1 corresponds to the $\lceil i/Q^2 \rceil$-th Bloom Filter of layer N-2. Among all the contiguous spaces corresponding to this $\lceil i/Q^2 \rceil$-th Bloom Filter, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the $\lceil i/Q^2 \rceil$-th Bloom Filter.
Repeat in this way until the corresponding bit in the corresponding contiguous space of the first layer has been set to 1.
A query method based on the described multilayer Bloom Filter construction optimization method is specifically:
(B1) Initialize j = 1.
(B2) Hash the data to be queried using the same group of hash functions as in the Bloom Filter construction process.
(B3) From all the contiguous address spaces corresponding to the Q Bloom Filters of the first layer, select those corresponding to the hash values obtained in step (B2), and bitwise-AND these contiguous address spaces together. If every bit of the AND result is 0, the data being queried does not exist and the query ends; otherwise go to step (B5).
(B4) For each query group of Bloom Filters in layer j, select from all the contiguous address spaces corresponding to that group those corresponding to the hash values obtained in step (B2), and bitwise-AND these contiguous address spaces together. If the bits of every group's AND result are all 0, the data being queried does not exist and the query ends; otherwise go to step (B5).
(B5) Determine whether j equals the number of layers N. If so, go to step (B7); otherwise go to step (B6).
(B6) For each bit with value 1 in each group's AND result, take the Q Bloom Filters of layer j+1 that correspond to the Bloom Filter the bit belongs to, forming one query group of Bloom Filters; set j = j+1 and go to step (B4).
(B7) Search for the data in the data sets corresponding to the Bloom Filters to which the bits with value 1 in each group's AND result belong.
The technical effects of the present invention are as follows:
During construction, the invention rearranges the bits of the related Bloom Filters in each layer of an existing multilayer Bloom Filter: the bits at the same position in the Q Bloom Filters of the first layer, and in the Q lower-layer Bloom Filters corresponding to each upper-layer Bloom Filter, are placed in the same contiguous address space. During a query, the bits that a hash value maps to in the Q Bloom Filters of the same layer lie in the same contiguous address space, so the multilayer Bloom Filter can be queried by reading a small number of contiguous spaces. Without increasing storage space, the optimized multilayer Bloom Filter makes the lookup of corresponding bits simpler and accesses the relevant Bloom Filter bits in a concentrated manner, which significantly reduces the number of disk accesses and effectively shortens the query time of the multilayer Bloom Filter.
Description of drawings
Fig. 1 shows the organization of an existing multilayer Bloom Filter.
Fig. 2 is a schematic diagram of Bloom Filter bit organization: Fig. 2(a) shows the bit organization of an existing Bloom Filter, and Fig. 2(b) shows the bit organization of the Bloom Filter optimized by the present invention.
Fig. 3 is a flowchart of the construction method of the multilayer Bloom Filter of the present invention.
Fig. 4 is a flowchart of the query method of the multilayer Bloom Filter of the present invention.
Fig. 5 is a schematic diagram of a query example for the optimized multilayer Bloom Filter of the present invention.
Embodiment
The optimization method for the multilayer Bloom Filter of the present invention mainly comprises the creation process and the query process of the multilayer Bloom Filter.
Fig. 2(a) shows the existing bit organization. In an existing multilayer Bloom Filter, for example one in which each upper-layer Bloom Filter corresponds to W lower-layer Bloom Filters (W is set manually), all the bits of each Bloom Filter are contiguous in the physical address space.
Fig. 2(b) shows the bit organization of the present invention. In the multilayer Bloom Filter structure of the invention, the bits at the same position in the Q Bloom Filters of the first layer are placed in the same contiguous address range of disk space. For layer j (j = 1, ..., N-1), among the Q Bloom Filters of layer j+1 that correspond to the m-th Bloom Filter of layer j, the bits at the same position in all Q Bloom Filters are placed in the same contiguous address range of disk space, and the number of bits in the m-th Bloom Filter of layer j equals the total number of bits in its Q corresponding Bloom Filters of layer j+1. Each contiguous address space is Q bits in size. The m-th bit (1 ≤ m ≤ Q) of the k-th contiguous address space belongs to the m-th Bloom Filter of the group, and its value is the value of the k-th bit of that Bloom Filter. For Q associated Bloom Filters, one hash value corresponds to one contiguous space.
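The difference between the two layouts can be sketched with small in-memory lists. This is an illustration only: bits are held as Python lists, indices are 0-based (the text counts from 1), and the names `interleave` and `bit_offset` are hypothetical.

```python
def interleave(filters):
    """Rearrange Q per-filter bit lists into per-position blocks.

    filters[m][k] is bit k of filter m (conventional layout, Fig. 2(a)).
    The result satisfies blocks[k][m] == filters[m][k], so block k holds
    bit k of every one of the Q filters side by side (Fig. 2(b)).
    """
    num_bits = len(filters[0])
    return [[f[k] for f in filters] for k in range(num_bits)]


def bit_offset(k, m, q):
    """Offset in bits of bit k of filter m inside the interleaved region."""
    return k * q + m
```

With this arrangement, one hash position `k` selects a single Q-bit block that answers the membership probe for all Q filters of the group at once.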
In the present invention, the set operation is carried out as follows:
In the bottom layer, among all the contiguous spaces corresponding to the i-th Bloom Filter of layer N, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the i-th Bloom Filter.
The i-th Bloom Filter of layer N corresponds to the $\lceil i/Q \rceil$-th Bloom Filter of layer N-1. Among all the contiguous spaces corresponding to the $\lceil i/Q \rceil$-th Bloom Filter of layer N-1, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the $\lceil i/Q \rceil$-th Bloom Filter, where $\lceil \cdot \rceil$ denotes rounding up.
The $\lceil i/Q \rceil$-th Bloom Filter of layer N-1 corresponds to the $\lceil i/Q^2 \rceil$-th Bloom Filter of layer N-2. Among all the contiguous spaces corresponding to the $\lceil i/Q^2 \rceil$-th Bloom Filter of layer N-2, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the $\lceil i/Q^2 \rceil$-th Bloom Filter.
Repeat in this way until the corresponding bit in the corresponding contiguous space of the first layer has been set to 1.
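The chain of filters touched by one set operation can be sketched as follows, assuming the rounding-up rule described above and 1-based filter indices. The function names are hypothetical, and `slot_in_block` returns a 0-based slot, so slot 1 is the 2nd bit of a block.

```python
def ancestor_chain(i, q, n):
    """1-based index of the filter touched in each layer, from layer n up to layer 1."""
    chain = []
    f = i
    for layer in range(n, 0, -1):
        chain.append((layer, f))
        f = -(-f // q)  # integer form of ceil(f / q): the parent filter index
    return chain


def slot_in_block(f, q):
    """0-based position of filter f's bit inside its Q-bit contiguous block."""
    return (f - 1) % q
```

For example, with Q = 2^15 and N = 2, bottom-layer filter 2 × 2^15 + 2 has the 3rd first-layer filter as its parent and occupies the 2nd bit of each block in its group.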
The present invention is described in further detail below with reference to the drawings.
As shown in Fig. 3, the construction method of the multilayer Bloom Filter of the present invention comprises the following steps:
(1) According to the size S of the full data set, determine the number of Bloom Filter layers N, the first term Q, and the number of data items M in the data set of each bottom-layer Bloom Filter. The numbers of Bloom Filters in the layers of the multilayer Bloom Filter form a geometric sequence whose first term is the number Q of Bloom Filters in the first layer and whose common ratio is also Q, such that $Q^N \times M \ge S$; Q is an integer multiple of the disk sector capacity, and the total number of bits contained in the Bloom Filters of each layer is equal. Set i = 1.
(2) Determine whether construction of the multilayer Bloom Filter is complete, i.e. whether the current multilayer Bloom Filter already covers all the data in the full data set. If so, go to step (7); otherwise go to step (3).
(3) Receive new data.
(4) Determine whether the data set of the i-th bottom-layer Bloom Filter is full. If it is full (the number of data items in the data set equals M), go to step (5); otherwise go to step (6).
(5) Set i = i+1.
(6) Hash the new data, set to 1 the bits corresponding to the hash values in the i-th Bloom Filter of the bottom layer, perform the corresponding set operations on each of the N-1 layers above it, and return to step (2).
(7) Construction of the multilayer Bloom Filter is complete.
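The construction loop above can be sketched in memory as follows. Several things here are assumptions made for brevity and are not taken from the patent: blocks live in a dictionary keyed by `(layer, group, position)` instead of on disk, every filter has the same `num_bits` (the patent instead sizes upper-layer filters so that each layer's total is equal), and the hash family is salted SHA-256.

```python
import hashlib


def hash_positions(item: bytes, num_hashes: int, num_bits: int):
    """Hypothetical hash family: SHA-256 salted with the hash index."""
    return [
        int.from_bytes(hashlib.sha256(bytes([seed]) + item).digest()[:8], "big") % num_bits
        for seed in range(num_hashes)
    ]


def build(items, q, n, m, num_hashes, num_bits):
    """Build interleaved blocks; blocks[(layer, group, k)] is an int with q slots."""
    blocks = {}
    i, filled = 1, 0  # current bottom-layer filter (1-based) and its item count
    for item in items:
        if filled == m:  # data set of filter i is full: move to the next filter
            i, filled = i + 1, 0
        if i > q ** n:
            raise ValueError("data set exceeds Q^N * M items")
        for k in hash_positions(item, num_hashes, num_bits):
            f = i
            for layer in range(n, 0, -1):
                group = -(-f // q)   # parent index; always 1 on layer 1
                slot = (f - 1) % q   # bit of filter f inside the Q-bit block
                key = (layer, group, k)
                blocks[key] = blocks.get(key, 0) | (1 << slot)
                f = group            # continue with the parent filter
        filled += 1
    return blocks
```

Each inserted item touches one block per hash position per layer, mirroring the bottom-up set operation of step (6).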
As shown in Fig. 4, the data query method of the multilayer Bloom Filter of the present invention comprises the following steps:
(1) Initialize j = 1.
(2) Hash the data to be queried using the same group of hash functions as in the Bloom Filter construction process.
(3) From all the contiguous address spaces corresponding to the Q Bloom Filters of the first layer, select those corresponding to the hash values obtained in step (2), bitwise-AND these contiguous address spaces together, and go to step (5).
(4) For each query group of Bloom Filters in layer j, select from all the contiguous address spaces corresponding to that group those corresponding to the hash values obtained in step (2), and bitwise-AND these contiguous address spaces together.
(5) Determine whether the bits of the AND result are all 0. If so, the data being queried does not exist; go to step (9). Otherwise go to step (6).
(6) Determine whether j equals the number of layers N. If so, go to step (8); otherwise go to step (7).
(7) For each bit with value 1 in each group's AND result, take the Q Bloom Filters of layer j+1 that correspond to the Bloom Filter the bit belongs to, forming one query group of Bloom Filters; set j = j+1 and go to step (4).
(8) Search for the data in the data sets corresponding to the Bloom Filters to which the bits with value 1 in each group's AND result belong.
(9) The query ends.
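The query procedure can be sketched as follows. This is a simplified in-memory illustration, not the patent's disk-based implementation: blocks are assumed to be stored in a dictionary keyed by `(layer, group, position)`, each value being an integer whose bit `slot` belongs to the filter at that slot of the group, and the function stops only when no group produces a hit. All names are hypothetical.

```python
def query(blocks, positions, q, n):
    """Return 1-based indices of bottom-layer filters whose data sets must be searched."""
    groups = [1]  # layer 1 consists of a single group of q filters
    hits = []
    for layer in range(1, n + 1):
        hits = []
        for group in groups:
            anded = (1 << q) - 1
            for k in positions:
                # One contiguous block per hash position; a missing block is all zeros.
                anded &= blocks.get((layer, group, k), 0)
            for slot in range(q):
                if (anded >> slot) & 1:
                    hits.append((group - 1) * q + slot + 1)
        if not hits:
            return []  # the queried data does not exist
        groups = hits  # each hit filter is the parent of one group in the next layer
    return hits
```

The number of blocks read per group equals the number of distinct hash positions, independent of Q, which is where the saving in disk accesses comes from.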
Example:
Consider a massive data deduplication system with a storage capacity of 512 TB. Suppose deduplication is performed at block level with a block size of 4 KB and one fingerprint per block, so there are $2^{37}$ fingerprints. Each fingerprint is 20 bytes; with other metadata, a fingerprint entry needs 32 bytes, giving a fingerprint database of 4 TB in total, which does not fit in memory. When a new data block arrives, the system must determine whether it duplicates data already stored, i.e. whether the fingerprint of the block is identical to an existing fingerprint.
To speed up fingerprint lookup, the present invention introduces a multilayer Bloom Filter. Assuming a Bloom Filter error rate of one in ten thousand and 10 hash functions, each layer of Bloom Filters is up to 320 GB in size, and two layers are 640 GB. This also does not fit in memory and must be placed on disk, so querying it causes disk accesses.
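The sizing figures above can be cross-checked with the standard Bloom Filter false-positive approximation $(1 - e^{-k/b})^k$, where k is the number of hash functions and b the number of bits per stored item. The value of 20 bits per fingerprint used below is not stated in the text; it is inferred here from 320 GB divided by $2^{37}$ fingerprints, so treat it as an assumption.

```python
import math


def false_positive_rate(bits_per_item: float, num_hashes: int) -> float:
    """Standard approximation of the Bloom Filter false-positive probability."""
    return (1.0 - math.exp(-num_hashes / bits_per_item)) ** num_hashes


fingerprints = 2**37
bits_per_item = 20  # assumption inferred from 320 GB / 2^37 fingerprints
layer_bytes = fingerprints * bits_per_item // 8  # size of one layer in bytes
rate = false_positive_rate(bits_per_item, 10)    # slightly under 1e-4
```

Under these assumptions one layer comes to 320 × 2^30 bytes and the estimated error rate is a little below one in ten thousand, in line with the figures quoted.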
According to the formula $Q^N \times M \ge S$, a two-layer Bloom Filter is built. The first layer has $2^{15}$ Bloom Filters; because the common ratio is $2^{15}$, the second layer has $2^{30}$ Bloom Filters. In the second layer, which is the bottom layer, each Bloom Filter corresponds to $2^7$ fingerprints. That is, $Q = 2^{15}$, $N = 2$, $M = 2^7$ and $S = 2^{37}$, which satisfies the formula.
With the Bloom Filter construction of the present invention, each contiguous address space is $2^{15}$ bits, i.e. 4 KB.
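The arithmetic of this example can be verified with a few lines. The variable names are illustrative, and binary units (1 KB = 2^10 bytes, 1 TB = 2^40 bytes) are assumed because they reproduce the powers of two quoted in the text.

```python
KIB, TIB = 2**10, 2**40

capacity_bytes = 512 * TIB   # storage capacity of the deduplication system
block_bytes = 4 * KIB        # deduplication block size
entry_bytes = 32             # fingerprint plus metadata

S = capacity_bytes // block_bytes       # number of fingerprints
fingerprint_db_bytes = S * entry_bytes  # size of the fingerprint database

Q, N, M = 2**15, 2, 2**7
filters_per_layer = [Q**j for j in range(1, N + 1)]  # geometric sequence Q, Q^2
covers_data_set = Q**N * M >= S                      # the constraint Q^N * M >= S
block_bits = Q                                       # one bit per filter in a group
```

Here the constraint holds with equality, and a block of Q bits is exactly 4 KB.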
As shown in Fig. 5, suppose a new fingerprint yields 3 distinct hash values, 1, 2 and 10, from the 10 hash functions.
The three hash values correspond to the 1st, 2nd and 10th contiguous address spaces of the first-layer Bloom Filters. These 3 contiguous address spaces of 4 KB each are read and ANDed together.
The 1st bits of the three contiguous spaces are 1, 1 and 0, so the AND result is 0. The 2nd bits are 0, 0 and 0, so the AND result is 0. The 3rd bits are 1, 1 and 1, so the AND result is 1. The AND result for all other bits is 0.
The 3rd bit of the AND result belongs to the 3rd Bloom Filter of the first layer, and its value of 1 means that this Bloom Filter hit. Because the Bloom Filter has 2 layers, the next layer below this Bloom Filter must be queried, namely its $2^{15}$ Bloom Filters in the second layer.
According to the hash values, the 1st, 2nd and 10th contiguous address spaces of the corresponding second-layer Bloom Filters are selected; these 3 contiguous address spaces of 4 KB each are read and ANDed together. The 1st bits of the three contiguous spaces are 1, 1 and 0, so the AND result is 0. The 2nd bits are 1, 1 and 1, so the AND result is 1. The AND result for all other bits is 0.
This is the last Bloom Filter layer, so the data set of the hit Bloom Filter is read, namely the data set of the $(2 \times 2^{15} + 2)$-th Bloom Filter of the second layer.
If all the bits of this multilayer Bloom Filter are stored on disk, this query accesses the disk 6 times in total.
With the conventional approach, each of the $2^{15}$ Bloom Filters of the first layer is probed at its 3 corresponding bits. This amounts to $3 \times 2^{15}$ bit lookups and at least $2^{15}$ disk accesses of 512 bytes each. The second layer likewise requires $3 \times 2^{15}$ bit lookups and at least $2^{15}$ disk accesses of 512 bytes each, for a total of at least $2^{16}$ disk accesses.
The number of disk accesses with the existing Bloom Filter is therefore about $2^{13}$ times that of the optimized one.
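The comparison above reduces to the following arithmetic, shown as a small sketch with illustrative variable names. It assumes, as the text does, one disk access per contiguous block in the optimized layout and at least one 512-byte access per probed filter in the conventional layout.

```python
import math

distinct_positions = 3  # hash values 1, 2 and 10
layers = 2
Q = 2**15

optimized_accesses = distinct_positions * layers  # one block per position per layer
conventional_accesses = Q * layers                # at least one read per probed filter
ratio = conventional_accesses / optimized_accesses
log2_ratio = math.log2(ratio)                     # between 13 and 14
```

The ratio is roughly 10,900, which lies between 2^13 and 2^14 and matches the order of magnitude quoted.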
Those skilled in the art will readily understand that the above is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (2)

1. A construction optimization method for a multilayer Bloom Filter, in which the numbers of Bloom Filters in the successive layers form a geometric sequence whose first term is the number of Bloom Filters in the first layer and whose first term and common ratio are both Q, satisfying $Q^N \times M \ge S$, where N ≥ 2 is the number of Bloom Filter layers, Q is an integer multiple of the disk sector length, S is the size of the full data set, and M is the number of data items in the data set of each Bloom Filter in layer N;
the bits at the same position in the Q Bloom Filters of the first layer are placed in the same contiguous address range of disk space; the bits at the same position in the Q Bloom Filters of layer j+1 that correspond to the m-th Bloom Filter of layer j are placed in the same contiguous address range of disk space; the number of bits in the m-th Bloom Filter of layer j equals the total number of bits in its Q corresponding Bloom Filters of layer j+1; j = 1, ..., N-1;
the method is specifically:
(A1) set i = 1;
(A2) determine whether the current multilayer Bloom Filter already covers all the data in the full data set; if so, finish; otherwise go to step (A3);
(A3) receive new data;
(A4) determine whether the data set of the i-th Bloom Filter of layer N is full; if it is full, go to step (A5); otherwise go to step (A6);
(A5) set i = i+1;
(A6) hash the new data and perform the set operation on the layer-N Bloom Filter according to the hash values; when the set operation finishes, return to step (A2);
the set operation is carried out as follows:
among all the contiguous spaces corresponding to the i-th Bloom Filter of layer N, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the i-th Bloom Filter;
the i-th Bloom Filter of layer N corresponds to the $\lceil i/Q \rceil$-th Bloom Filter of layer N-1; among all the contiguous spaces corresponding to this $\lceil i/Q \rceil$-th Bloom Filter, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the $\lceil i/Q \rceil$-th Bloom Filter, where $\lceil \cdot \rceil$ denotes rounding up;
the $\lceil i/Q \rceil$-th Bloom Filter of layer N-1 corresponds to the $\lceil i/Q^2 \rceil$-th Bloom Filter of layer N-2; among all the contiguous spaces corresponding to this $\lceil i/Q^2 \rceil$-th Bloom Filter, select the contiguous space corresponding to the hash value, and set to 1 the bit in that space that belongs to the $\lceil i/Q^2 \rceil$-th Bloom Filter;
repeat in this way until the corresponding bit in the corresponding contiguous space of the first layer has been set to 1.
2. A query method based on the multilayer Bloom Filter construction optimization method of claim 1, specifically:
(B1) initialize j = 1;
(B2) hash the data to be queried using the same group of hash functions as in the Bloom Filter construction process;
(B3) from all the contiguous address spaces corresponding to the Q Bloom Filters of the first layer, select those corresponding to the hash values obtained in step (B2), and bitwise-AND these contiguous address spaces together; if every bit of the AND result is 0, the data being queried does not exist and the query ends; otherwise go to step (B5);
(B4) for each query group of Bloom Filters in layer j, select from all the contiguous address spaces corresponding to that group those corresponding to the hash values obtained in step (B2), and bitwise-AND these contiguous address spaces together; if the bits of every group's AND result are all 0, the data being queried does not exist and the query ends; otherwise go to step (B5);
(B5) determine whether j equals the number of layers N; if so, go to step (B7); otherwise go to step (B6);
(B6) for each bit with value 1 in each group's AND result, take the Q Bloom Filters of layer j+1 that correspond to the Bloom Filter the bit belongs to, forming one query group of Bloom Filters; set j = j+1 and go to step (B4);
(B7) search for the data in the data sets corresponding to the Bloom Filters to which the bits with value 1 in each group's AND result belong.
CN201210202816.3A 2012-06-19 2012-06-19 Construction and query optimization methods for multiple layers of Bloom Filters Active CN102799617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210202816.3A CN102799617B (en) 2012-06-19 2012-06-19 Construction and query optimization methods for multiple layers of Bloom Filters

Publications (2)

Publication Number Publication Date
CN102799617A true CN102799617A (en) 2012-11-28
CN102799617B CN102799617B (en) 2014-09-24

Family

ID=47198727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210202816.3A Active CN102799617B (en) 2012-06-19 2012-06-19 Construction and query optimization methods for multiple layers of Bloom Filters

Country Status (1)

Country Link
CN (1) CN102799617B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968467A (en) * 2012-11-10 2013-03-13 华中科技大学 Optimization method and query method for multiple layers of Bloom Filters
CN103345472A (en) * 2013-06-04 2013-10-09 北京航空航天大学 Redundancy removal file system based on limited binary tree bloom filter and construction method of redundancy removal file system
CN103902408A (en) * 2012-12-27 2014-07-02 阿普赛尔有限公司 Detecting deviation between replicas using bloom filters
CN104424256A (en) * 2013-08-28 2015-03-18 华为技术有限公司 Method and device for generating Bloom filter
CN106874458A (en) * 2017-02-14 2017-06-20 中国科学技术大学 A kind of Bloom filter building method of the multi-layered database based on layering distribution
CN111930923A (en) * 2020-07-02 2020-11-13 上海微亿智造科技有限公司 Bloom filter system and filtering method
CN113886656A (en) * 2021-10-25 2022-01-04 联想(北京)有限公司 Information query method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901248A (en) * 2010-04-07 2010-12-01 北京星网锐捷网络技术有限公司 Method and device for creating and updating Bloom filter and searching elements
CN102110171A (en) * 2011-03-22 2011-06-29 湖南大学 Method for inquiring and updating Bloom filter based on tree structure
US20110270852A1 (en) * 2010-04-28 2011-11-03 Fujitsu Limited Computer product, search apparatus, management apparatus, search method, and management method
CN102243657A (en) * 2011-07-06 2011-11-16 太原理工大学 Expandable Bloom Filter method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
袁志坚等: "典型 Bloom 过滤器的研究及其数据流应用", 《计算机工程》, vol. 35, no. 7, 30 April 2009 (2009-04-30), pages 5 - 7 *
谢鲲等: "布鲁姆过滤器查询算法", 《软件学报》, vol. 20, no. 1, 31 January 2009 (2009-01-31), pages 96 - 108 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968467A (en) * 2012-11-10 2013-03-13 华中科技大学 Optimization method and query method for multiple layers of Bloom Filters
CN103902408A (en) * 2012-12-27 2014-07-02 阿普赛尔有限公司 Detecting deviation between replicas using bloom filters
CN103902408B (en) * 2012-12-27 2018-01-26 西部数据技术公司 For detecting the method and system of the difference between reproducting content block
CN103345472A (en) * 2013-06-04 2013-10-09 北京航空航天大学 Redundancy removal file system based on limited binary tree bloom filter and construction method of redundancy removal file system
CN103345472B (en) * 2013-06-04 2016-08-10 北京航空航天大学 De-redundant file system based on limited binary tree Bloom filter and construction method thereof
CN104424256B (en) * 2013-08-28 2017-12-12 华为技术有限公司 Bloom filter generation method and device
CN104424256A (en) * 2013-08-28 2015-03-18 华为技术有限公司 Method and device for generating Bloom filter
US10664445B2 (en) 2013-08-28 2020-05-26 Huawei Technologies Co., Ltd. Bloom filter generation method and apparatus
CN106874458A (en) * 2017-02-14 2017-06-20 中国科学技术大学 A kind of Bloom filter building method of the multi-layered database based on layering distribution
CN106874458B (en) * 2017-02-14 2019-10-22 中国科学技术大学 A kind of Bloom filter building method of the multi-layered database based on layering distribution
CN111930923A (en) * 2020-07-02 2020-11-13 上海微亿智造科技有限公司 Bloom filter system and filtering method
CN111930923B (en) * 2020-07-02 2021-07-30 上海微亿智造科技有限公司 Bloom filter system and filtering method
CN113886656A (en) * 2021-10-25 2022-01-04 联想(北京)有限公司 Information query method and device and electronic equipment

Also Published As

Publication number Publication date
CN102799617B (en) 2014-09-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant