prestodb / airlift

Airlift framework for building REST services

Off-by-one bug in calculation of zeros in SparseHll::toDense

mbasmanova opened this issue

I noticed a bug in SparseHll::toDense while porting the HLL algorithm to C++.

https://github.com/airlift/airlift/blob/master/stats/src/main/java/io/airlift/stats/cardinality/SparseHll.java#L175

Here, decodeBucketValue already returns the number of zeros + 1, so the +1 in listener.visit(bucket, zeros + 1); adds an extra one.
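To make the suspected double increment concrete, here is a minimal sketch. It does not reproduce the airlift code: the class name, the 6-bit entry layout, and the helpers encodeEntry, denseValueWithExtraOne, and denseValueAsIs are hypothetical, and it assumes decodeBucketValue behaves as described above (returning the number of zeros + 1).

```java
// Illustrative sketch only; the entry layout below is an assumption,
// not the actual SparseHll encoding.
class SparseToDenseSketch {
    // Hypothetical layout: high bits = bucket index, low 6 bits = stored value.
    static final int VALUE_BITS = 6;
    static final int VALUE_MASK = (1 << VALUE_BITS) - 1;

    // The stored value is assumed to already be "number of zeros + 1".
    static int encodeEntry(int bucket, int zeros) {
        return (bucket << VALUE_BITS) | (zeros + 1);
    }

    static int decodeBucketIndex(int entry) {
        return entry >>> VALUE_BITS;
    }

    // Returns the stored value, i.e. zeros + 1 under the assumption above.
    static int decodeBucketValue(int entry) {
        return entry & VALUE_MASK;
    }

    // Pattern described in the issue: adds 1 to a value that already includes it.
    static int denseValueWithExtraOne(int entry) {
        int zeros = decodeBucketValue(entry);
        return zeros + 1;
    }

    // Alternative: pass the decoded value through unchanged.
    static int denseValueAsIs(int entry) {
        return decodeBucketValue(entry);
    }

    public static void main(String[] args) {
        int entry = encodeEntry(42, 3);                      // bucket 42, 3 zeros
        System.out.println(decodeBucketIndex(entry));        // 42
        System.out.println(denseValueWithExtraOne(entry));   // 5 (zeros + 2)
        System.out.println(denseValueAsIs(entry));           // 4 (zeros + 1)
    }
}
```

If decodeBucketValue instead returned the raw number of zeros, the + 1 at the call site would be correct, so the contract of that method is the thing to verify.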

See airlift#926

CC: @arhimondr @rongrong @tdcmeehan @highker

Do we understand the implications of this for correctness in production?

Per the unit tests in #55, it seems the correct values are being recorded.