RoaringBitmap / CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

Home Page: http://roaringbitmap.org/


Invalid sets and segfaults after add_offset

SamHames opened this issue

I ran into a couple of issues with shift operations in pyroaring, which might be a bug in croaring? Or we're holding it wrong :)

The first issue seems to be related to serialising and deserialising a bitmap after shifting, and it results in very weird behaviour: the resulting bitmap seems to have the correct cardinality, but is definitely no longer a valid set. I think the Python corresponds to the following C, based on the v2.1.0 amalgamated release.

#include <stdio.h>
#include <stdlib.h>
#include "roaring.c"
int main() {

  roaring_bitmap_t *dense = roaring_bitmap_create();

  // Make a bitmap with enough entries to need a bitset container
  for (int k = 0; k < 4500; ++k) {
    roaring_bitmap_add(dense, 2 * k);
  }

  // Shift it to partly overlap with the next container.
  roaring_bitmap_t *dense_shift = roaring_bitmap_add_offset(dense, 64000);

  // Serialise and deserialise
  size_t buffer_size = roaring_bitmap_portable_size_in_bytes(dense_shift);

  char *arr = malloc(buffer_size);

  roaring_bitmap_portable_serialize(dense_shift, arr);

  roaring_bitmap_t *deserialized = roaring_bitmap_portable_deserialize(arr);

  // Iterate through the deserialised bitmap - This should be the same set as before
  // just shifted...
  roaring_uint32_iterator_t *iterator = roaring_create_iterator(deserialized);

  while (iterator->has_value)
  {
      printf("value = %d\n", (int) iterator->current_value);
      roaring_advance_uint32_iterator(iterator);
  }

  roaring_free_uint32_iterator(iterator);

  printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(dense_shift));
  printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(deserialized));

  roaring_bitmap_free(dense);
  roaring_bitmap_free(dense_shift);
  roaring_bitmap_free(deserialized);
  free(arr);

}

In the second example, the segfault happens in the call to roaring_bitmap_equals and, as far as I can tell, only when copy_on_write is on.

Please update to the latest version of CRoaring. It should fix the issue. Thanks for reporting it.

Just confirming that the first issue, with serialisation, is fixed in 2.1.1. The second persists and might be a different issue (I was hoping they'd be the same). I'm not sure the add_offset is actually necessary, but I can only find failing examples in pyroaring with specific data/shift amount combinations.

Here's the most minimal example of a segfault I could make (on v2.1.1) by transcribing the specific calls pyroaring is making. The segfault specifically requires copy_on_write to be on and the roaring_bitmap_copy operation has to happen before the add_offset operation.

#include <stdio.h>
#include <stdlib.h>
#include "roaring.c"
int main() {

  // bm_before = cls(values, copy_on_write=cow)
  int shift = -65536;

  roaring_bitmap_t *toshift = roaring_bitmap_from_range(131074, 131876, 1);
  roaring_bitmap_set_copy_on_write(toshift, 1);

  // bm_copy = cls(bm_before)
  
  // This copy on a copy_on_write bitmap is necessary *before* the shift for the
  // segfault to happen
  roaring_bitmap_t *toshift_copy = roaring_bitmap_copy(toshift);

  // bm_after = bm_before.shift(offset)
  roaring_bitmap_t *shifted = roaring_bitmap_add_offset(toshift, shift);

  // self.assertEqual(bm_before, bm_copy)
  roaring_bitmap_equals(toshift, toshift_copy);

  // expected = cls([val+offset for val in values if val+offset in range(0, 2**32)], copy_on_write=cow)
  roaring_bitmap_t *expected = roaring_bitmap_from_range(131074 + shift, 131876 + shift, 1);
  roaring_bitmap_set_copy_on_write(expected, 1);

  printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(toshift));
  printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(expected));

  // self.assertEqual(bm_after, expected)
  roaring_bitmap_equals(shifted, expected);

  // Also segfaults
  printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(shifted));

  roaring_bitmap_free(toshift);
  roaring_bitmap_free(shifted);
  roaring_bitmap_free(expected);
}

Can reproduce. Adding a roaring_bitmap_internal_validate(shifted, &reason) call before the segfault outputs a "container is NULL" reason, and running under lldb shows that shifted has a null pointer at containers[0] but a size of 1 (and a typecode of 4, i.e. SHARED_CONTAINER_TYPE).

Yes. I have the fix. Will release soon.

I introduced a leak in the last release.

Ah. No. The leak is in @SamHames's code. Ok.

Can confirm that's all working now, thanks a lot!