Enet4 / faiss-rs

Rust language bindings for Faiss

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Support binary indexes

aalekhpatel07 opened this issue · comments

Hi there!

Thanks for maintaining these bindings!

So, iiuc faiss has had support for binary indexes since v1.6.2 and faiss-rs is built against v1.7.2 so I naturally assumed I'd have faiss::read_index_binary, etc methods available but that doesn't seem to be the case.

I'm not very well-versed with bindings and the whole C++-Rust interop so correct me if I'm wrong but I believe we're currently just not generating bindings for the binary indexes.

If getting this to work might not be a tall order, then would you be interested in guiding me so that I can contribute?

Thanks!

Thank you for your interest in these bindings!

While this crate builds against a version of Faiss with support for binary indexes, it is worth remembering that it targets the C API exclusively. Faiss did not support binary indexes when I first worked on the C API, and this API is not updated in lockstep with the rest of the library. In other words, the work on extending faiss-rs to support binary indexes needs to start at the main Faiss repository. I happen to have found a tracking issue. Should you be interested in fulfilling this, I can provide some assistance there.

Oh, my mistake, turns out that Faiss already has its C API extended to support binary indexes. So at this end, what we need to do is:

  1. Checkout a more recent version of the Faiss library;
  2. Regenerate the low-level bindings in faiss-sys;
  3. Design a high level API here in the faiss crate.

Thanks so much for initiating this work in #77. I would love to be able to help out with this endeavour. I'll base my changes on top of change/faiss-v1.8.0, and try to design the unsafe boundaries following patterns from src/index/*.rs.

Not sure if I'm tall enough to write unsafe Rust but I'll give it a shot anyway. Your reviews and guidance will always be appreciated!

Okay! There's some careful copypasta work in #79 that covers the base BinaryIndexImpl and all the core *Index* traits and macros are replicated and adapted for BinaryIndexes. Not so sure about the naming convention we wanna go ahead with but my current assumption is to prefix the concrete impls with Binary (i.e. IndexFlat -> BinaryIndexFlat).

PS: This is my first foray into getting C++ to behave nicely with Rust so some things could be broken but I have to say this project is so well written/maintained that suprisingly, I'm having a pleasant time getting it all working together (barring my broken cublas install on Fedora 38 which forces me to generate bindings on a nvidia/cuda:12.3.2-devel-ubi8 docker container). Also, cargo build seems to be happy with these changes so maybe I'm headed in the right direction after all?

On a second thought, I'll keep the naming convention the same as the C++ api (i.e. IndexFlat -> IndexBinaryFlat)