Invalid sets and segfaults after add_offset
SamHames opened this issue · comments
I ran into a couple of issues with shift operations in pyroaring, which might be a bug in croaring? Or we're holding it wrong :)
The first issue seems to somehow be related to serialising/deserialising a bitmap after shifting, and results in very weird behaviour - the resulting bitmap seems to have the correct cardinality, but is definitely no longer a valid set. I think the Python corresponds to the following C, based on the v2.1.0 amalgamated release.
#include <stdio.h>
#include <stdlib.h>
#include "roaring.c"

int main() {
    roaring_bitmap_t *dense = roaring_bitmap_create();
    // Make a bitmap with enough entries to need a bitset container
    for (int k = 0; k < 4500; ++k) {
        roaring_bitmap_add(dense, 2 * k);
    }
    // Shift it to partly overlap with the next container.
    roaring_bitmap_t *dense_shift = roaring_bitmap_add_offset(dense, 64000);
    // Serialise and deserialise
    int buffer_size = roaring_bitmap_portable_size_in_bytes(dense_shift);
    char *arr = NULL;
    arr = malloc(buffer_size * sizeof(char));
    roaring_bitmap_portable_serialize(dense_shift, arr);
    roaring_bitmap_t *deserialized = roaring_bitmap_portable_deserialize(arr);
    // Iterate through the deserialised bitmap - This should be the same set as before
    // just shifted...
    roaring_uint32_iterator_t *iterator = roaring_create_iterator(deserialized);
    while (iterator->has_value) {
        printf("value = %d\n", (int) iterator->current_value);
        roaring_advance_uint32_iterator(iterator);
    }
    roaring_free_uint32_iterator(iterator);
    printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(dense_shift));
    printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(deserialized));
    roaring_bitmap_free(dense);
    roaring_bitmap_free(dense_shift);
    roaring_bitmap_free(deserialized);
}
For the second example, the segfault happens in the call to roaring_bitmap_equals, but as far as I can tell it only happens with copy_on_write on.
Please update to the latest version of CRoaring. It should fix the issue. Thanks for reporting it.
Just confirming that the first issue with serialisation is fixed with 2.1.1, but the second persists and might be a different issue (I was hoping they'd be the same). I'm not sure if the add_offset is actually necessary, but I can only find failing examples in pyroaring with specific data/shift amount combinations.
Here's the most minimal example of a segfault I could make (on v2.1.1) by transcribing the specific calls pyroaring is making. The segfault specifically requires copy_on_write to be on, and the roaring_bitmap_copy operation has to happen before the add_offset operation.
#include <stdio.h>
#include <stdlib.h>
#include "roaring.c"

int main() {
    // bm_before = cls(values, copy_on_write=cow)
    int shift = -65536;
    roaring_bitmap_t *toshift = roaring_bitmap_from_range(131074, 131876, 1);
    roaring_bitmap_set_copy_on_write(toshift, 1);
    // bm_copy = cls(bm_before)
    // This copy on a copy_on_write bitmap is necessary *before* the shift for the
    // segfault to happen
    roaring_bitmap_t *toshift_copy = roaring_bitmap_copy(toshift);
    // bm_after = bm_before.shift(offset)
    roaring_bitmap_t *shifted = roaring_bitmap_add_offset(toshift, shift);
    // self.assertEqual(bm_before, bm_copy)
    roaring_bitmap_equals(toshift, toshift_copy);
    // expected = cls([val+offset for val in values if val+offset in range(0, 2**32)], copy_on_write=cow)
    roaring_bitmap_t *expected = roaring_bitmap_from_range(131074 + shift, 131876 + shift, 1);
    roaring_bitmap_set_copy_on_write(expected, 1);
    printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(toshift));
    printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(expected));
    // self.assertEqual(bm_after, expected)
    roaring_bitmap_equals(shifted, expected);
    // Also segfaults
    printf("cardinality = %d\n", (int) roaring_bitmap_get_cardinality(shifted));
    roaring_bitmap_free(toshift);
    roaring_bitmap_free(shifted);
    roaring_bitmap_free(expected);
}
Can reproduce. Adding a roaring_bitmap_internal_validate(shifted, &reason) call before the segfault outputs a "container is NULL" reason, and running under lldb shows that shifted has null at containers[0], but a size of 1 (and a typecode of 4 - SHARED_CONTAINER_TYPE).
Yes. I have the fix. Will release soon.
Please update: https://github.com/RoaringBitmap/CRoaring/releases/tag/v2.1.2
I introduced a leak in the last release.
Ah. No. The leak is in @SamHames's code. Ok.
See #541
Can confirm that's all working now, thanks a lot!