skarupke / flat_hash_map

A very fast hashtable

Corner case issue when hash collision with bad hashing

Philippe91 opened this issue · comments

Each time there is a hash collision, the set doubles its size (the map has the same problem).
See the simple sample code below.
Note: ska::unordered_set does not have this problem.

#include <iostream>
#include "flat_hash_map.hpp"

int main()
{
	// Pathological hash: every key lands in the same bucket.
	struct hash_t
	{
		size_t operator()(int) const { return 1; }
	};
	ska::flat_hash_set<int, hash_t> set;

	for (int i = 0; i < 30; ++i)
	{
		std::cout << set.bucket_count() << std::endl;
		set.insert(i);
	}
	return 0;
}

Printout:
4
4
8
8
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728

After taking a new look at this issue, I realize that my previous test code hits a corner case that does not happen in real life, provided the hash function is not terrible.
That said, we can considerably reduce the growth rate in such corner cases by changing the method 'compute_max_lookups'.
Just multiply 'desired' by 2, like this:

int8_t desired = detailv3::log2(num_buckets) * 2;