panicked at 'too large index value provided to Idx::new'
nemosupremo opened this issue · comments
When you use a large integer (greater than 2^63) as an ID, faiss-rs will panic. This is a bit unexpected as it's not mentioned anywhere in the documentation, nor do I think faiss has this limitation. If the intent is to represent null ids, isn't Option<faiss::idx_t> a better choice?
For what it's worth, this behavior is documented. Idx::new was designed to ensure that the returned index is intended to refer to an item in the index, rather than null/void.
Panic
Panics if the ID is too large (>= 2^63)
One would use Idx::none to create a void index (internally represented by -1).
What one can do to improve this is:
- Extend the documentation of these functions to further clarify that Idx::new really expects a non-negative idx.
- Provide a new_unchecked variant which admits any idx value.
For what it's worth, this behavior is documented
My mistake, I missed that.
Idx::new was designed to ensure that the returned index is intended to refer to an item in the index, rather than null/void.
I'm new to faiss so I could be misunderstanding something. Is it the case that faiss requires Idx to be non-negative? My use case is I'm adding vectors with add_with_ids, and the ids I'm using are completely arbitrary and random; so when trying to add a vector with a given id I can generate a negative idx. The thing is the "void index" thing seems to be a faiss-rs thing, as I can't find any indication that you can't use a negative idx with faiss. If faiss allows you to use negative indexes then isn't an Option better here?
Just catching up with older issues.
Is it the case that faiss requires Idx to be non-negative?
That much is the assumption made in this API, but it does not stray far from what users of the native Faiss API would expect. An index entry ID of -1 in a search result means an empty/missing entry, which can happen in some index implementations.
My use case is I'm adding vectors with add_with_ids, and the ids I'm using are completely arbitrary and random; so when trying to add a vector with a given id I can generate a negative idx. The thing is the "void index" thing seems to be a faiss-rs thing, as I can't find any indication that you can't use a negative idx with faiss.
As explained above, this concept of an empty ID does exist in the native library. Even if this API were relaxed to allow negative numbers, it should not allow an ID value of -1, because that collides with the meaning of a missing vector in the search result list. You would not be able to tell whether a search result item is empty or is the vector with ID -1. Sounds like a very nasty source of bugs to me.
All of this can be circumvented by converting an i64 into an Idx via From<i64>, but for this to be sound, you would need to adjust the ID generation logic in any case, so as to avoid generating -1.
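As a rough illustration of what "adjusting the ID generation logic" could look like, the sketch below shows two options for arbitrary 64-bit IDs. Both helper names (clear_sign_bit and reject_sentinel) are hypothetical and not part of faiss-rs; the code only uses the standard library and does not call into the crate.

```rust
/// Hypothetical helper: maps an arbitrary 64-bit value into 0..=i64::MAX
/// by clearing the top bit, so the result never reaches 2^63.
/// Caveat: two raw IDs that differ only in the top bit map to the same value.
pub fn clear_sign_bit(raw: u64) -> u64 {
    raw & (i64::MAX as u64)
}

/// Hypothetical helper: accepts any signed ID except -1, which is the value
/// used for an empty/missing entry in search results.
pub fn reject_sentinel(raw: i64) -> Option<i64> {
    if raw == -1 {
        None
    } else {
        Some(raw)
    }
}
```

The first option keeps every ID within the range that Idx::new accepts, at the cost of one bit of ID space. The second keeps negative IDs, which would then have to go through the From<i64> conversion mentioned above, and the caller still has to regenerate any ID that is rejected.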
If faiss allows you to use negative indexes then isn't an Option better here?
It would be an interesting design option, but without an integer type that is not allowed to be -1, that would bloat the size of the index type, making it terribly inefficient. Right now, idx_t values are mapped 1:1 to Idx without representation changes. Forcing us to output a vector of Option<Idx> as the result would require a copy onto a representation that can hold the 64-bit integer plus the Option discriminant.
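The size argument can be checked with std::mem::size_of. The sketch below uses a hypothetical Label newtype as a stand-in for Idx (assuming, as stated above, that it wraps the 64-bit idx_t without changing its representation). The standard library's NonZeroU64 is included only as a contrast: it excludes 0 rather than -1, so it does not solve the problem discussed here.

```rust
use std::mem::size_of;
use std::num::NonZeroU64;

/// Hypothetical stand-in for `Idx`: a transparent wrapper over a 64-bit integer.
#[repr(transparent)]
pub struct Label(pub i64);

/// Returns the sizes in bytes of:
/// (i64, Label, Option<i64>, Option<NonZeroU64>).
pub fn sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<i64>(),
        size_of::<Label>(),
        size_of::<Option<i64>>(),
        size_of::<Option<NonZeroU64>>(),
    )
}
```

The wrapper is the same size as the raw integer, while Option<i64> is strictly larger because every bit pattern of i64 is a valid value and the discriminant needs its own storage (typically 16 bytes in total on 64-bit targets). Option<NonZeroU64> stays at 8 bytes only because the excluded value 0 can encode None.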
An index entry ID of -1 in a search result means an empty/missing entry, which can happen in some index implementations.
I missed this; this would complicate things