RoaringBitmap / CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

Home Page: http://roaringbitmap.org/

roaring64: or_inplace gives incorrect values

blazer2x opened this issue

To build this you'll need the xxhash header; I added it to generate well-distributed 64-bit values.
The code creates 16 64-bit bitmaps and fills each one with hashes of a loop counter multiplied by the map index plus one, then merges all 16 into the first bitmap using the in-place OR.
The same value generation is then used to check that every value exists in the merged bitmap. We expect a 100% hit rate, but that is not what happens when a bitmask is applied.
If the mask is removed, the hit rate is 100%. Changing the mask to, say, 0xFFFFFFFFF or 0xFFFFFFFFFF gives varying hit rates, which is incorrect.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define XXH_INLINE_ALL
#include "roaring.h"
#include "xxhash.h"

#define NUM_MAPS 16
#define VALUES_PER_MAP 214748
/* Keep only the low 32 bits of each hash; change the mask to vary the hit rate. */
#define MASK 0xFFFFFFFFULL

/* Hash the decimal text of n (with a trailing newline) and apply MASK. */
static uint64_t masked_hash(long n)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%ld\n", n);
    return MASK & XXH3_64bits(buf, strlen(buf));
}

int main(void)
{
    roaring64_bitmap_t **maps = malloc(NUM_MAPS * sizeof *maps);
    if (maps == NULL)
        return 1;

    for (int i = 0; i < NUM_MAPS; i++)
        maps[i] = roaring64_bitmap_create();

    /* Fill each bitmap with masked hashes of z * (i + 1). */
    for (int i = 0; i < NUM_MAPS; i++)
    {
        for (long z = 0; z < VALUES_PER_MAP; z++)
        {
            uint64_t hashed = masked_hash(z * (i + 1));
            roaring64_bitmap_add(maps[i], hashed);

            if (!roaring64_bitmap_contains(maps[i], hashed))
                fprintf(stderr, "Oops not in the map\n");
        }
        fprintf(stderr, "Cardinality for map %d:%" PRIu64 "\n", i,
                roaring64_bitmap_get_cardinality(maps[i]));
    }

    /* Merge bitmaps 1..15 into bitmap 0 with the in-place OR. */
    for (int i = 1; i < NUM_MAPS; i++)
    {
        roaring64_bitmap_or_inplace(maps[0], maps[i]);
        roaring64_bitmap_free(maps[i]);
    }

    fprintf(stderr, "Total Cardinality for merged map %" PRIu64 "\n",
            roaring64_bitmap_get_cardinality(maps[0]));
    size_t misses = 0;
    size_t total = 0;

    /* Regenerate every value; each one must be present in the merged bitmap. */
    for (int i = 0; i < NUM_MAPS; i++)
    {
        for (long z = 0; z < VALUES_PER_MAP; z++)
        {
            if (!roaring64_bitmap_contains(maps[0], masked_hash(z * (i + 1))))
                misses++;
            total++;
        }
    }
    fprintf(stderr, "Expected bitmap hits %zu, Actual bitmap hits %zu\n", total, total - misses);

    roaring64_bitmap_free(maps[0]);
    free(maps);
    return 0;
}
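
As a side note on the masks mentioned in the report, the sketch below shows how many low bits each one keeps. The helper names `mask_width` and `high32` are hypothetical and not part of CRoaring or xxhash. It only illustrates that the 32-bit mask keeps every value below 2^32, while the wider masks let values reach into the upper 32 bits; it is not a diagnosis of the bug.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low bits kept by a contiguous low-bit mask such as 0xFFFFFFFF.
   (Hypothetical helper, for illustration only.) */
static unsigned mask_width(uint64_t mask)
{
    unsigned bits = 0;
    while (mask & 1u) {
        bits++;
        mask >>= 1;
    }
    return bits;
}

/* Upper 32 bits of a 64-bit value. (Hypothetical helper.) */
static uint32_t high32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```

With these helpers, `mask_width(0xFFFFFFFF)` is 32, `mask_width(0xFFFFFFFFF)` is 36, and `mask_width(0xFFFFFFFFFF)` is 40, so only the first mask guarantees that `high32` of a masked value is zero.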


@blazer2x would you mind checking whether #563 fixes this issue?

@SLieve #563 resolves the issue and the above test now runs correctly. I have tried adjusting the mask and am getting the expected hits in the merged bitmap. Thank you for the fix; I'll keep testing the ART64 implementation and see if anything else turns up.
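
For anyone continuing this kind of testing, the property the reproduction checks is that every value added to any input must be present after an in-place OR. The sketch below states that property against a toy fixed-size bitset, which could serve as a simple reference model. All `toy_*` names are hypothetical and are not CRoaring API; the toy wraps values modulo its size, which a real 64-bit bitmap does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS 4096u
#define TOY_WORDS (TOY_BITS / 64u)

/* Toy fixed-size bitset; a stand-in reference model, not CRoaring's API. */
typedef struct { uint64_t words[TOY_WORDS]; } toy_bitset;

static void toy_add(toy_bitset *b, uint32_t v)
{
    v %= TOY_BITS; /* wrap into the toy's range */
    b->words[v / 64u] |= UINT64_C(1) << (v % 64u);
}

static bool toy_contains(const toy_bitset *b, uint32_t v)
{
    v %= TOY_BITS;
    return ((b->words[v / 64u] >> (v % 64u)) & 1u) != 0;
}

/* In-place union: dst becomes dst OR src. */
static void toy_or_inplace(toy_bitset *dst, const toy_bitset *src)
{
    for (size_t i = 0; i < TOY_WORDS; i++)
        dst->words[i] |= src->words[i];
}

/* Counts values added to either input that are missing from the union.
   A correct in-place OR must make this return 0. */
static size_t toy_union_misses(void)
{
    toy_bitset a = {{0}}, b = {{0}};
    for (uint32_t z = 0; z < 1000; z++) {
        toy_add(&a, z * 3u);
        toy_add(&b, z * 7u);
    }
    toy_or_inplace(&a, &b);

    size_t misses = 0;
    for (uint32_t z = 0; z < 1000; z++) {
        if (!toy_contains(&a, z * 3u)) misses++;
        if (!toy_contains(&a, z * 7u)) misses++;
    }
    return misses;
}
```

The same loop shape (add to each input, merge, then re-derive every value and count misses) is what the reproduction above applies to `roaring64_bitmap_or_inplace`.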