andreyp / libbloom

Yet another Bloom filter implementation

=== LIBBLOOM

by Andrey Mukhin <a.mukhin77@gmail.com>

== What is libbloom?
libbloom is a small set of C++ classes that implements a Bloom filter.
A Bloom filter is a space-efficient probabilistic data structure used to test
whether an element is a member of a set (see http://en.wikipedia.org/wiki/Bloom_filter).
A lookup can report a false positive, but never a false negative.
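
To make the idea concrete, here is a minimal toy filter. It is a sketch for illustration only and is not libbloom's API: the class name ToyBloom, its methods, and the use of std::hash with double hashing are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy Bloom filter for illustration only; this is NOT libbloom's API.
class ToyBloom {
public:
    ToyBloom(std::size_t bits, std::size_t hashes)
        : bits_(bits, false), hashes_(hashes) {
        assert(bits > 0 && hashes > 0);
    }

    // Set the k bit positions derived from the key.
    void add(const std::string& key) {
        for (std::size_t i = 0; i < hashes_; ++i) bits_[index(key, i)] = true;
    }

    // false: the key was definitely never added.
    // true:  the key was probably added (false positives are possible).
    bool maybeContains(const std::string& key) const {
        for (std::size_t i = 0; i < hashes_; ++i)
            if (!bits_[index(key, i)]) return false;
        return true;
    }

private:
    // Double hashing: position_i = (h1 + i * h2) mod m.
    std::size_t index(const std::string& key, std::size_t i) const {
        const std::size_t h1 = std::hash<std::string>{}(key);
        const std::size_t h2 = std::hash<std::string>{}(key + "#") | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t hashes_;
};
```

An empty ToyBloom answers false for every key. After add("apple"), maybeContains("apple") is always true, while an unrelated key may occasionally also answer true.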

== How it works
First, define the main parameters:
  - type of elements (must be a C++ POD type)
  - return type of the hash function
  - acceptable false-positive rate (in percent)
  - max number of elements
  - hash function you want to use
  - initialization vector for the hash function
  - optionally, the number of hash functions

Pass all of these parameters to the Bloom class constructor.
For example:
  Bloom<unsigned long, unsigned> b(1000000, 1., 0, &bloom::FNVHash);
  - type of elements -- unsigned long
  - return type of the hash function -- unsigned
  - false-positive rate -- 1. (percent)
  - max number of elements -- 1000000
  - hash function you want to use -- bloom::FNVHash
  - initialization vector for the hash function -- 0
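
The signature of bloom::FNVHash is not shown here, so the following is only a sketch of what a seeded hash over the raw bytes of a POD element might look like. The function name fnv1aSeeded and the way the initialization vector is mixed in (XOR with the offset basis) are assumptions. The constants are the standard 32-bit FNV-1a offset basis and prime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical seeded FNV-1a (32-bit); bloom::FNVHash may differ in
// signature and in how it uses the initialization vector.
inline std::uint32_t fnv1aSeeded(const void* data, std::size_t len,
                                 std::uint32_t seed) {
    assert(data != nullptr || len == 0);
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint32_t h = 2166136261u ^ seed;  // FNV offset basis mixed with the seed
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];        // fold in the next byte
        h *= 16777619u;   // multiply by the FNV prime (wraps mod 2^32)
    }
    return h;
}
```

For an element such as unsigned long e, the call would be fnv1aSeeded(&e, sizeof e, 0). With a seed of 0 this reduces to plain FNV-1a. Varying the seed is one common way to obtain several different hash functions from a single routine.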

Once the Bloom object is constructed, we can query values such as:
  - min number of bits that the Bloom filter needs -- b.getBitsNumber()
  - min number of bytes -- b.getBytesNumber()
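
How libbloom computes these values is not stated here. As a rough guide, the textbook sizing for n elements and false-positive probability p is m = -n * ln(p) / (ln 2)^2 bits and k = (m / n) * ln 2 hash functions. The sketch below applies that formula. The names BloomSize and estimateBloomSize are hypothetical, and libbloom's own rounding may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

struct BloomSize {
    std::size_t bits;    // m: number of bits in the filter
    std::size_t bytes;   // m rounded up to whole bytes
    unsigned hashes;     // k: number of hash functions
};

// Textbook sizing; not taken from libbloom's source.
// falsePositivePercent is given in percent, as in the constructor example.
inline BloomSize estimateBloomSize(std::size_t maxElements,
                                   double falsePositivePercent) {
    assert(maxElements > 0);
    assert(falsePositivePercent > 0.0 && falsePositivePercent < 100.0);
    const double p = falsePositivePercent / 100.0;
    const double ln2 = std::log(2.0);
    const double n = static_cast<double>(maxElements);
    const double m = std::ceil(-n * std::log(p) / (ln2 * ln2));
    const double k = std::round(m / n * ln2);
    BloomSize s;
    s.bits = static_cast<std::size_t>(m);
    s.bytes = (s.bits + 7) / 8;  // round up to a whole number of bytes
    s.hashes = k < 1.0 ? 1u : static_cast<unsigned>(k);
    return s;
}
```

For the example parameters (1000000 elements, 1 percent), this formula gives roughly 9.59 million bits, about 1.2 MB, and 7 hash functions.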

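Now we can create the Bloom filter storage, for example:
  unsigned char bloom_storage[b.getBytesNumber()];
(An array sized at run time like this is a compiler extension in C++, not
standard C++; a heap-allocated buffer of the same size can be used instead.)

Before working with the filter, we must tell it which storage to use:
  b.setBitStorage(bloom_storage, b.getBytesNumber());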

Finally, we can add elements to the filter:
  unsigned long e = 128500;
  bool f_new_element = b.fillBitSet(e);
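
The variable name f_new_element suggests that fillBitSet reports whether the element had not been seen before. The internals are not shown here, so the following is only a sketch of how such a test-and-set over caller-owned storage could work. The names HashFn, demoHash and testAndSet are hypothetical, and deriving the i-th hash function by passing i as the seed is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical hash signature: (bytes, length, seed) -> 32-bit value.
using HashFn = std::uint32_t (*)(const void*, std::size_t, std::uint32_t);

// Stand-in hash for the demo: 32-bit FNV-1a with the seed XORed into the basis.
inline std::uint32_t demoHash(const void* data, std::size_t len,
                              std::uint32_t seed) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint32_t h = 2166136261u ^ seed;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Sets the bits for `elem` in storage owned by the caller.
// Returns true if at least one bit was still clear, i.e. the element
// had definitely not been added before; false means "probably seen".
template <typename T>
bool testAndSet(unsigned char* storage, std::size_t storageBytes,
                const T& elem, HashFn hash, unsigned hashCount) {
    assert(storage != nullptr && storageBytes > 0);
    const std::size_t bitCount = storageBytes * 8;
    bool isNew = false;
    for (unsigned i = 0; i < hashCount; ++i) {
        // Use the loop index as the seed to get a different hash each round.
        const std::size_t bit = hash(&elem, sizeof elem, i) % bitCount;
        unsigned char& byte = storage[bit / 8];
        const unsigned char mask = static_cast<unsigned char>(1u << (bit % 8));
        if ((byte & mask) == 0) isNew = true;
        byte = static_cast<unsigned char>(byte | mask);
    }
    return isNew;
}
```

With a zero-initialized buffer, the first testAndSet call for a given element returns true and a repeated call with the same element returns false. Because the function only touches the buffer it is handed, the same code works with any storage and any hash function of the assumed signature.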

When we are done with this storage, we can point the filter at a new one, e.g.:
  b.setBitStorage(....);

Thus libbloom provides only the Bloom filter algorithm; it does not include hash functions or storage.
We are therefore free to combine whichever hash functions and storage we need.

Languages: C++ 97.8%, Shell 2.2%