Skip to content

A simple C++ library providing a few variations of Bloom filters

License

Notifications You must be signed in to change notification settings

tmick0/bloomfilter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

bloomfilter

A simple C++ library providing a few variations of Bloom filters.

Description

Three types of Bloom filters are supported:

  • Ordinary BFs
  • Counting BFs
  • Paired BFs (as seen in Mick et al.)

The following operations are supported on all types:

  • Insert
  • Query
  • Serialize
  • Deserialize

There are also a few operations supported by particular types:

  • Counting and paired BFs also support a Delete operation
  • Ordinary and paired BFs also support a Union operation
  • Counting BFs support conversion into ordinary BFs
  • Ordinary BFs support conversion into paired BFs
  • Ordinary BFs can be compressed using the halving method (as seen in Wang et al.)

A few additional helper mechanisms may eventually be implemented:

  • A helper to compute the optimal BF parameters given the number of content objects to be indexed
  • Helpers to compute the probabilities of false positives for queries on existing populated BFs

Doxygen documentation can be compiled with make docs.

Usage

Everything lives in the namespace bloom.

These classes are templated, so you must specify the type of the object you are indexing when you instantiate a Bloom filter. For example:

OrdinaryBloomFilter<std::string> bf(4, 32);

This creates an ordinary BF with 4 hashes and a 32-bit array, capable of indexing strings.

You must specialize std::hash for HashParams<T> for each type T you wish to construct a BF for. The HashParams type is a structure consisting of the template parameter T and a uint8_t; the uint8_t serves as a salt, to allow multiple hashes to be generated for a single object (as per the semantics of a BF).

A helper class implementing a 32-bit FNV-1 hash is given in FnvHash.hpp. An example of how to specialize std::hash using it can be found in tests/ordinary_insert_query.cpp.

To insert an object o into the BF, call bf.Insert(o), and to check for existence of an object, call bf.Query(o). If using a CountingBloomFilter or PairedBloomFilter, you can remove items using bf.Delete(o).

To serialize a BF into a std::ostream os, call bf.Serialize(os). To deserialize a BF from a std::istream is, use the static function Deserialize(is) within the appropriate BF class.

For information about the other operations, refer to the Doxygen documentation or read the comments in the code.

About

A simple C++ library providing a few variations of Bloom filters

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published