Bloom filter implementation for BIN-702
> clone the repository
> run main.py to start the benchmarks

Requirements:
- Python 3.x
- mmh3 library: pip install mmh3

> cd src/
> python main.py

| Set | MB | peak MB | insert ms | clear ms | find ms |
|---|---|---|---|---|---|
| 1000 | 0.000296 | 0.041276 | 0.9889 | 0 | 0 |
| 10000 | 0.000296 | 0.655676 | 1.9998 | 0 | 0 |
| 100000 | 0.000324 | 6.291772 | 19.9799 | 1.9973 | 8.9914 |
| 2000000 | 0.000324 | 100.663612 | 457.532 | 50.9484 | 203.3379 |
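
As a rough illustration of how figures like those in the Set table could be collected, the sketch below times insert, find and clear on a built-in Python set with time.perf_counter and reads current and peak memory from tracemalloc. The helper name bench_set, the integer keys and the use of tracemalloc are assumptions made for this example; they are not taken from main.py.

```python
import time
import tracemalloc


def bench_set(n):
    """Time insert/find/clear on a set of n integers (hypothetical helper)."""
    tracemalloc.start()
    s = set()

    t0 = time.perf_counter()
    for i in range(n):
        s.add(i)
    insert_ms = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    found = sum(1 for i in range(n) if i in s)
    find_ms = (time.perf_counter() - t0) * 1000

    # Read traced memory (in bytes) while the set is still populated.
    current, peak = tracemalloc.get_traced_memory()

    t0 = time.perf_counter()
    s.clear()
    clear_ms = (time.perf_counter() - t0) * 1000
    tracemalloc.stop()

    return {
        "items": n,
        "found": found,
        "mb": current / 1e6,
        "peak_mb": peak / 1e6,
        "insert_ms": insert_ms,
        "find_ms": find_ms,
        "clear_ms": clear_ms,
    }


if __name__ == "__main__":
    print(bench_set(1000))
```

Absolute timings depend on the machine and Python version, so numbers from this sketch are not expected to match the table.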

| Bloom | MB | peak MB | insert ms | clear ms | find ms | error prob | hash count | bit count | error count |
|---|---|---|---|---|---|---|---|---|---|
| 1000 | 0.010876 | 0.011449 | 15.9934 | 0 | 15.9833 | 0.001 | 7 | 9586 | 1 |
| 10000 | 0.0993 | 0.099879 | 179.8307 | 1.9978 | 155.8389 | 0.0021 | 7 | 95851 | 21 |
| 100000 | 1.04424 | 1.044825 | 1634.4944 | 44.9578 | 1571.1436 | 0.00154 | 7 | 958506 | 154 |
| 2000000 | 19.83744 | 19.838031 | 35077.5659 | 1017.1684 | 35077.5659 | 0.0016655 | 7 | 19170117 | 3331 |
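
The bit counts in the table above agree with the standard Bloom filter sizing formula m = ceil(-n ln p / (ln 2)^2) for a target error probability of p = 0.01, and the hash count of 7 agrees with k = round((m / n) ln 2). The sketch below reproduces those values. The target p = 0.01 and the helper name bloom_parameters are inferred for illustration and are not read from the repository.

```python
import math


def bloom_parameters(n, p):
    """Return (bit_count, hash_count) for n items and target error probability p."""
    # m = ceil(-n * ln(p) / (ln 2)^2)
    bit_count = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # k = round((m / n) * ln 2)
    hash_count = round(bit_count / n * math.log(2))
    return bit_count, hash_count


if __name__ == "__main__":
    for n in (1000, 10000, 100000, 2000000):
        print(n, bloom_parameters(n, 0.01))
```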

| Bloom | MB | peak MB | insert ms | clear ms | find ms | error prob | hash count | bit count | error count |
|---|---|---|---|---|---|---|---|---|---|
| 1000 | 0.000296 | 0.041276 | 11.9879 | 0 | 12.9987 | 0.003 | 5 | 9586 | 3 |
| 10000 | 0.0993 | 0.099882 | 170.8237 | 1.9972 | 126.8831 | 0.0037 | 5 | 95851 | 37 |
| 100000 | 1.04424 | 1.044825 | 1246.7373 | 48.9547 | 1311.182 | 0.00221 | 5 | 958506 | 221 |
| 2000000 | 19.83744 | 19.838034 | 25255.1331 | 1037.1826 | 23715.8606 | 0.002236 | 5 | 19170117 | 4472 |

| Bloom | MB | peak MB | insert ms | clear ms | find ms | error prob | hash count | bit count | error count |
|---|---|---|---|---|---|---|---|---|---|
| 1000 | 0.00398 | 0.004556 | 12.9867 | 0.9983 | 15.9839 | 0.108 | 5 | 2949 | 108 |
| 10000 | 0.030796 | 0.031378 | 122.8838 | 0.9991 | 115.8721 | 0.1099 | 5 | 29492 | 1099 |
| 100000 | 0.321784 | 0.322372 | 1205.3608 | 11.9547 | 1442.8068 | 0.10759 | 5 | 294924 | 10759 |
| 2000000 | 6.10908 | 6.109674 | 24957.1912 | 330.6803 | 24417.6577 | 0.1064285 | 5 | 5898497 | 212857 |

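
In the Bloom tables, error prob equals error count divided by the number of items (for example 108 / 1000 = 0.108). The sketch below shows one plausible way such a count could be obtained: insert n keys, then query n keys that were never inserted and count the false positives. It is a minimal illustration, not the repository's implementation. The class and function names are hypothetical, the integer test keys are an assumption, and the hashlib fallback is added here only so the example runs when mmh3 is not installed.

```python
import hashlib

try:
    import mmh3  # MurmurHash3 bindings listed in the requirements
except ImportError:  # fallback so the sketch still runs without mmh3
    mmh3 = None


class BloomFilter:
    """Minimal Bloom filter sketch backed by a bytearray bit field."""

    def __init__(self, bit_count, hash_count):
        self.bit_count = bit_count
        self.hash_count = hash_count
        self.bits = bytearray((bit_count + 7) // 8)

    def _positions(self, item):
        # One bit position per hash function; the seed selects the function.
        data = str(item).encode("utf-8")
        for seed in range(self.hash_count):
            if mmh3 is not None:
                h = mmh3.hash(data, seed)  # signed 32-bit; % below is non-negative
            else:
                digest = hashlib.sha256(seed.to_bytes(4, "little") + data).digest()
                h = int.from_bytes(digest[:8], "little")
            yield h % self.bit_count

    def insert(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def find(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

    def clear(self):
        self.bits = bytearray(len(self.bits))


def count_errors(bloom, n):
    """Insert 0..n-1, then count false positives among n keys never inserted."""
    for i in range(n):
        bloom.insert(i)
    return sum(1 for i in range(n, 2 * n) if bloom.find(i))


if __name__ == "__main__":
    bf = BloomFilter(bit_count=9586, hash_count=7)
    errors = count_errors(bf, 1000)
    print("error count:", errors, "error prob:", errors / 1000)
```

Inserted keys are always reported as present, so errors can only be false positives; the exact count depends on the hash function and the keys used.
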
Possible Applications in Bioinformatics
- Sequence characterization
- Genome assembly
- Sequencing error correction
- RNA-Seq
Tristan Deschamps