google / guava

Google core libraries for Java

Make BloomFilter.bitSize() public

MartinHaeusler opened this issue · comments

1. What are you trying to do?

I am using Guava's bloom filters as part of a persistent file format, i.e. the raw byte array of the bloom filter lives somewhere in the file. It would be more efficient to know the length of the byte array produced by the bloom filter in advance (i.e. without actually serializing it).

There is already a method called bitSize in BloomFilter, but unfortunately it is not public. It also covers only the bit array itself: it does not account for the two bytes written for the strategy and the number of hash functions, nor for the integer that stores the length of the bits.data array.
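For illustration, the serialized length can be derived by hand from the layout described above. This is only a minimal sketch, assuming the serial form is one byte for the strategy, one byte for the number of hash functions, a 4-byte int holding the number of longs, and then the longs themselves. `BloomFilterSizeSketch` and `serializedSizeInBytes` are hypothetical names and not part of Guava, and the caller would still need access to the bit size, which is exactly what is missing today.

```java
/** Hypothetical helper, not part of Guava. */
final class BloomFilterSizeSketch {

    // Assumed header: 1 byte strategy + 1 byte hash-function count + 4-byte int (number of longs).
    private static final int HEADER_BYTES = 1 + 1 + Integer.BYTES;

    private BloomFilterSizeSketch() {}

    /**
     * Estimates the serialized length for a filter whose bit array holds {@code bitSize} bits,
     * assuming the bits are stored as whole 64-bit longs.
     */
    static long serializedSizeInBytes(long bitSize) {
        if (bitSize < 0) {
            throw new IllegalArgumentException("bitSize must be non-negative: " + bitSize);
        }
        // Round up to whole longs (64 bits each).
        long numLongs = (bitSize + Long.SIZE - 1) / Long.SIZE;
        return HEADER_BYTES + numLongs * Long.BYTES;
    }
}
```

Under these assumptions, a filter with 128 bits would serialize to 6 header bytes plus 16 data bytes.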

2. What's the best code you can write to accomplish that without the new feature?

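import com.google.common.hash.BloomFilter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Determines the serialized size by actually serializing the filter.
public int getByteSizeOf(BloomFilter<?> bloomFilter) throws IOException {
    return serialize(bloomFilter).length;
}

// Writes the filter into an in-memory buffer and returns its bytes.
public byte[] serialize(BloomFilter<?> bloomFilter) throws IOException {
    try (var baos = new ByteArrayOutputStream()) {
        bloomFilter.writeTo(baos);
        return baos.toByteArray();
    }
}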

The method getByteSizeOf is very inefficient because it serializes the entire bloom filter into a temporary byte array just to read its length. It would be nice if we could get the size without the serialization.
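Until such a method exists, one partial mitigation is to keep using writeTo but discard the bytes while counting them, so that at least the temporary byte array is not allocated. The sketch below is a hypothetical `ByteCountingSink` written against plain `java.io`; it is not part of Guava and it does not avoid the serialization work itself.

```java
import java.io.OutputStream;
import java.util.Objects;

/** Hypothetical sink that counts the bytes written to it and throws them away. */
final class ByteCountingSink extends OutputStream {

    private long count;

    @Override
    public void write(int b) {
        count++; // a single byte was written
    }

    @Override
    public void write(byte[] b, int off, int len) {
        Objects.checkFromIndexSize(off, len, b.length); // same bounds contract as OutputStream
        count += len;
    }

    /** Number of bytes written so far. */
    long count() {
        return count;
    }
}

// Intended usage with a Guava filter (requires Guava on the classpath):
//   var sink = new ByteCountingSink();
//   bloomFilter.writeTo(sink);
//   long sizeInBytes = sink.count();
```

Guava's `com.google.common.io` package offers `CountingOutputStream` and `ByteStreams.nullOutputStream()`, which could likely be combined to the same effect.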

3. What would that same code look like if we added your feature?

BloomFilter<?> bloom = ...;
var size = bloom.getSizeInBytes();

(Optional) What would the method signatures for your feature look like?

public class BloomFilter<T> {

    public int getSizeInBytes();

}

Concrete Use Cases

Serialization of the bloom filter as a building block for more complex formats.

Packages

com.google.common.hash

Checklist