ashrielbrian / vers

In-memory vector database in Rust

vers

Lightweight, simple, single instance, local in-memory vector database written in Rust.

Currently supports the following indexing strategies:

  1. IVFFlat (k-means for partitioning)
  2. Locality-sensitive hashing (LSH), heavily inspired by fennel.ai's blog post.
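
For intuition on the LSH strategy, here is a minimal sketch of random-hyperplane hashing, one common LSH family for angular similarity. It is not taken from vers: `XorShift`, `random_hyperplanes`, `lsh_hash` and `build_buckets` are hypothetical names, and whether vers hashes vectors exactly this way is an assumption.

```rust
use std::collections::HashMap;

/// Tiny deterministic xorshift64 generator, used only so this sketch
/// needs no external crates. The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    /// Next pseudo-random value in [-1.0, 1.0).
    fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Keep the top 24 bits and rescale them from [0, 2^24) to [-1, 1).
        ((self.0 >> 40) as f32 / (1u64 << 23) as f32) - 1.0
    }
}

/// Draw `num_planes` random hyperplane normals of dimension `dim`.
fn random_hyperplanes(num_planes: usize, dim: usize, seed: u64) -> Vec<Vec<f32>> {
    let mut rng = XorShift(seed.max(1));
    (0..num_planes)
        .map(|_| (0..dim).map(|_| rng.next_f32()).collect())
        .collect()
}

/// Hash a vector to a bucket key: bit `i` is set when the vector lies on
/// the non-negative side of hyperplane `i`. Supports at most 64 planes.
fn lsh_hash(planes: &[Vec<f32>], v: &[f32]) -> u64 {
    planes.iter().enumerate().fold(0u64, |key, (i, plane)| {
        let dot: f32 = plane.iter().zip(v).map(|(a, b)| a * b).sum();
        if dot >= 0.0 { key | (1u64 << i) } else { key }
    })
}

/// Group vector ids (their positions in `vectors`) by bucket key.
fn build_buckets(planes: &[Vec<f32>], vectors: &[Vec<f32>]) -> HashMap<u64, Vec<usize>> {
    let mut buckets: HashMap<u64, Vec<usize>> = HashMap::new();
    for (id, v) in vectors.iter().enumerate() {
        buckets.entry(lsh_hash(planes, v)).or_default().push(id);
    }
    buckets
}
```

At query time, the query vector is hashed with the same planes and only the ids in its bucket are scored, which is what makes the search approximate: a true neighbour that falls on the other side of a single plane is missed.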

Getting Started

Like any sensible package, the API aims to be dead simple.

  1. Import, obviously:
    use vers::indexes::base::{Index, Vector};
    use vers::indexes::ivfflat::IVFFlatIndex;
  2. Build an index:
    let mut index = IVFFlatIndex::build_index(
        num_clusters,
        num_attempts,
        max_iterations,
        &vectors
    );
  3. Add an embedding vector into the index:
    index.add(Vector(*emb), emb_unique_id);
  4. Persist the index to disk:
    let _ = index.save_index("wiki.index");
  5. Load the index from disk:
    let index = match IVFFlatIndex::load_index("wiki.index") {
        Ok(index) => index,
        Err(e) => panic!("Failed to load index! {}", e),
    };
  6. And of course, actually search the index:
    let results = index.search_approximate(
        embs.get("king"),   // query vector
        10                  // top_k
    ); // kings, queen, monarch, ...

That said, the API is unstable and subject to change. In particular, I really dislike having to pass the unique vector ID into search_approximate.
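
The `num_clusters`, `num_attempts` and `max_iterations` arguments to `build_index` suggest k-means with restarts. The sketch below shows what such a routine could look like. It assumes that `num_attempts` means independent k-means runs and that `max_iterations` caps each run, which the text above does not confirm. `sq_dist`, `nearest`, `lloyd` and `kmeans` are hypothetical helpers, not part of the vers API.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `v`.
fn nearest(centroids: &[Vec<f32>], v: &[f32]) -> usize {
    let mut best = 0;
    for i in 1..centroids.len() {
        if sq_dist(&centroids[i], v) < sq_dist(&centroids[best], v) {
            best = i;
        }
    }
    best
}

/// One run of Lloyd's algorithm from the given starting centroids.
/// Returns the final centroids and the total squared distance (inertia).
fn lloyd(
    mut centroids: Vec<Vec<f32>>,
    data: &[Vec<f32>],
    max_iterations: usize,
) -> (Vec<Vec<f32>>, f32) {
    let dim = data[0].len();
    for _ in 0..max_iterations {
        // Accumulate per-cluster sums and counts for the mean update.
        let mut sums = vec![vec![0.0f32; dim]; centroids.len()];
        let mut counts = vec![0usize; centroids.len()];
        for v in data {
            let c = nearest(&centroids, v);
            counts[c] += 1;
            for (s, x) in sums[c].iter_mut().zip(v) {
                *s += x;
            }
        }
        let mut changed = false;
        for (c, (sum, &n)) in centroids.iter_mut().zip(sums.iter().zip(&counts)) {
            if n == 0 {
                continue; // leave the centroid of an empty cluster untouched
            }
            let mean: Vec<f32> = sum.iter().map(|s| s / n as f32).collect();
            changed |= mean != *c;
            *c = mean;
        }
        if !changed {
            break; // converged before hitting the iteration cap
        }
    }
    let inertia: f32 = data
        .iter()
        .map(|v| sq_dist(&centroids[nearest(&centroids, v)], v))
        .sum();
    (centroids, inertia)
}

/// Run `num_attempts` independent k-means runs and keep the lowest-inertia one.
/// Assumes `data` is non-empty and `1 <= num_clusters <= data.len()`.
fn kmeans(
    num_clusters: usize,
    num_attempts: usize,
    max_iterations: usize,
    data: &[Vec<f32>],
) -> Vec<Vec<f32>> {
    let mut best: Option<(Vec<Vec<f32>>, f32)> = None;
    for attempt in 0..num_attempts {
        // Deterministic stand-in for random initialisation: evenly strided
        // points, starting at a different offset on each attempt.
        let init: Vec<Vec<f32>> = (0..num_clusters)
            .map(|k| data[(attempt + k * data.len() / num_clusters) % data.len()].clone())
            .collect();
        let run = lloyd(init, data, max_iterations);
        if best.as_ref().map_or(true, |b| run.1 < b.1) {
            best = Some(run);
        }
    }
    best.expect("num_attempts must be at least 1").0
}
```

In an IVFFlat index the resulting centroids define the partitions: each stored vector goes into the list of its nearest centroid, and a query scans only the list (or few lists) closest to it.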

Coming soon

  1. Python bindings
  2. Performance improvements (building the IVFFlat index is slow; vectorization)
  3. Benchmarks (comparisons with popular ANN search indexes, e.g. faiss, and exhaustive searches)
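
Until benchmarks land, an exhaustive search is the usual ground truth for measuring how much an approximate index misses. Below is a minimal sketch, assuming squared Euclidean distance (the metric vers uses is not stated here). `exhaustive_search` and `recall_at_k` are hypothetical helpers, not part of vers.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact top-k by scanning every vector; returns ids sorted nearest first.
/// Ids are positions in `data`.
fn exhaustive_search(data: &[Vec<f32>], query: &[f32], top_k: usize) -> Vec<usize> {
    let mut ids: Vec<usize> = (0..data.len()).collect();
    ids.sort_by(|&a, &b| sq_dist(&data[a], query).total_cmp(&sq_dist(&data[b], query)));
    ids.truncate(top_k);
    ids
}

/// Fraction of the exact top-k ids that the approximate result also returned.
fn recall_at_k(exact: &[usize], approximate: &[usize]) -> f32 {
    if exact.is_empty() {
        return 1.0;
    }
    let hits = exact.iter().filter(|id| approximate.contains(id)).count();
    hits as f32 / exact.len() as f32
}
```

Comparing the ids returned by an approximate index against `exhaustive_search` for the same query and `top_k` gives a recall figure that can sit alongside query latency in a benchmark table.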

Contributions are welcome.

License: MIT License

Language: Rust 100.0%