radglob / pyblume

A fast and scalable Bloom filter implementation for Python

pyblume

pyblume is a fast, scalable Bloom filter implementation for Python.

Installation

The easiest way to use the extension is to install it from PyPI using pip:

$ pip install pyblume

You can also install the extension directly from a checkout of the source:

$ python setup.py install

Use

This example creates a file-backed Bloom filter with pyblume, adds one item, then reopens the file read-only and checks a list of candidates for membership.

import pyblume

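# Maximum size of the filter's backing file (500 MiB if the unit is bytes).
BLOOM_FILTER_MAX_FILESIZE = 1024 * 1024 * 500
# Target error (false-positive) rate.
BLOOM_ERROR_RATE = 0.000001
# Path of the file that stores the filter.
fileloc = "/tmp/test.blume"

# Create the filter, add one item, and close it.
bf = pyblume.Filter(BLOOM_FILTER_MAX_FILESIZE, BLOOM_ERROR_RATE, fileloc)
bf.add("Terbium")
bf.close()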

check = ['Baltimore', 'Terbium', 'Labs']

# Reopen the saved filter read-only and test each candidate for membership.
bf = pyblume.open(fileloc, for_write=False)
for x in check:
    if x in bf:
        print("Found")
bf.close()
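
The two numeric arguments in the example above trade space against accuracy. As a rough guide only, the sketch below applies the textbook sizing formula for a classic (non-scalable) Bloom filter, m = -n * ln(p) / (ln 2)^2, to those values. It assumes the file size is measured in bytes and that the whole file holds filter bits. pyblume's scalable filter may size itself differently, and the helper names bits_per_item and approx_capacity are illustrative, not part of pyblume.

```python
import math


def bits_per_item(error_rate):
    """Bits needed per stored item by an optimally tuned classic Bloom filter."""
    return -math.log(error_rate) / (math.log(2) ** 2)


def approx_capacity(size_bytes, error_rate):
    """Rough number of items that fit in size_bytes at the given error rate.

    Assumes every byte is used for filter bits (no header or metadata).
    """
    return int(size_bytes * 8 / bits_per_item(error_rate))


max_filesize = 1024 * 1024 * 500  # same value as in the example above
error_rate = 0.000001

print("bits per item: %.1f" % bits_per_item(error_rate))
print("approx. capacity: %d items" % approx_capacity(max_filesize, error_rate))
```

Under these assumptions, an error rate of 0.000001 costs roughly 29 bits per item, so a 500 MiB budget corresponds to something on the order of 146 million items. Loosening the error rate lowers the per-item cost and raises that estimate.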

Contributions

Special thanks to Austin Appleby for placing MurmurHash3 into the public domain.

Pull requests are most welcome.

About

License: BSD 3-Clause "New" or "Revised" License


Languages

C 86.8%, Python 12.8%, Makefile 0.4%