CUDA error when searching a Faiss index with too-small vectors
hayj opened this issue · comments
Describe the bug
I get a CUDA error when searching a Faiss index whose vectors are too small (low dimension):
Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::ivfInterleavedScanImpl_32_(faiss::gpu::Tensor<float, 2, true>&, faiss::gpu::Tensor<long int, 2, true>&, faiss::gpu::DeviceVector<void*>&, faiss::gpu::DeviceVector<void*>&, faiss::gpu::IndicesOptions, faiss::gpu::DeviceVector<int>&, int, faiss::MetricType, bool, faiss::gpu::Tensor<float, 3, true>&, faiss::gpu::GpuScalarQuantizer*, faiss::gpu::Tensor<float, 2, true>&, faiss::gpu::Tensor<long int, 2, true>&, faiss::gpu::GpuResources*) at /project/faiss/faiss/gpu/impl/scan/IVFInterleaved32.cu:13; details: CUDA error 9 invalid configuration argument
Aborted (core dumped)
or
Faiss assertion 'err__ == cudaSuccess' failed in int faiss::gpu::getNumDevices() at /project/faiss/faiss/gpu/utils/DeviceUtils.cu:36; details: CUDA error 401 the operation cannot be performed in the present state
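For triage, the CUDA error code can be pulled out of a Faiss assertion message like the two above. This is a minimal sketch that assumes the message keeps the `details: CUDA error <code> <description>` shape seen in these logs; `parse_cuda_error` is a hypothetical helper, not part of Faiss.

```python
import re
from typing import Optional, Tuple

# Pattern assumed from the two messages quoted in this report.
_CUDA_ERROR_RE = re.compile(r"details: CUDA error (\d+) (.+)")


def parse_cuda_error(message: str) -> Optional[Tuple[int, str]]:
    """Return (code, description) from a Faiss assertion message, or None."""
    match = _CUDA_ERROR_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


log = (
    "Faiss assertion 'err__ == cudaSuccess' failed in int "
    "faiss::gpu::getNumDevices() at /project/faiss/faiss/gpu/utils/"
    "DeviceUtils.cu:36; details: CUDA error 401 the operation cannot be "
    "performed in the present state"
)
print(parse_cuda_error(log))
```

The numeric code is convenient for searching existing issues, since the description text can differ between CUDA versions.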
To Reproduce
import faiss
import numpy as np
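nb_vectors = 80_000
dim = 128  # the search fails with this dimension
k = 5
nlist = 10
nprobe = 2
# Note: np.random.rand returns float64 values.
vectors = np.random.rand(nb_vectors, dim)
index = faiss.index_factory(dim, "IVF16384,Flat")
# Overwrite nlist and nprobe on the index built from the factory string.
index.nlist = nlist
index.nprobe = nprobe
# Shard the index across all visible GPUs with a common IVF quantizer.
options = faiss.GpuMultipleClonerOptions()
options.shard = True
options.common_ivf_quantizer = True
index = faiss.index_cpu_to_all_gpus(index, options)
index.train(vectors)
index.add(vectors)
# The CUDA error is raised during this search.
results = index.search(vectors, k)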
This code works with dim = 1024.
I also tried different index types and different parameters (nlist, etc.), but the search always fails at a certain vector size and stops failing when the size is increased.
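Because the failure aborts the whole Python process (`Aborted (core dumped)`), finding the dimension at which the search starts to fail is easier if each trial runs in its own child process. Here is a rough sketch using only the standard library. `find_failing_dims` and the `{dim}` placeholder convention are hypothetical; the script you pass in would be the reproduction code above with `dim = {dim}`.

```python
import subprocess
import sys
from typing import Dict, Iterable


def find_failing_dims(script_template: str, dims: Iterable[int]) -> Dict[int, int]:
    """Run the script once per dimension, each time in a fresh child process.

    Returns a mapping dim -> exit code. A non-zero code marks a failure; on
    POSIX it is negative when the child was killed by a signal such as SIGABRT.
    """
    results = {}
    for dim in dims:
        # str.replace instead of str.format, so braces elsewhere in the
        # script body do not need escaping.
        script = script_template.replace("{dim}", str(dim))
        completed = subprocess.run(
            [sys.executable, "-c", script],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[dim] = completed.returncode
    return results


# Stand-in script for illustration only: it exits with 1 below an arbitrary
# threshold instead of calling Faiss, so the sketch runs without a GPU.
demo_template = "import sys\ndim = {dim}\nsys.exit(1 if dim < 1024 else 0)"
print(find_failing_dims(demo_template, [128, 512, 1024]))
```

Running each trial in a separate process also avoids a crashed CUDA context from one trial affecting the next.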
When I try to install different versions of Faiss (a nightly build and an older release), I run into incompatibility errors such as:
AttributeError: module 'faiss._swigfaiss' has no attribute 'delete_ParameterRangeVector'
or
TypeError: in method 'GpuIndexIVFFlat_train', argument 3 of type 'float const *'
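Errors like the `AttributeError` above can appear when the Python wrapper and the compiled extension come from different Faiss builds, for example when several Faiss packages are installed in the same environment. That is an assumption about the cause here, not something confirmed in this report. A quick check, sketched with the standard library (`installed_faiss_packages` is a hypothetical helper), lists every installed distribution whose name starts with `faiss`:

```python
from importlib import metadata
from typing import List, Tuple


def installed_faiss_packages() -> List[Tuple[str, str]]:
    """Return (name, version) for each installed distribution named faiss*."""
    found = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Skip distributions with missing metadata.
        if name and name.lower().startswith("faiss"):
            found.append((name, dist.version))
    return sorted(found)


print(installed_faiss_packages())
```

More than one entry (say, both `faiss-cpu` and `faiss-gpu`) would be worth cleaning up before retrying.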
Desktop:
- OS: Linux Ubuntu 20.04.6 LTS
- GPU: Tesla V100 / Tesla A100
- Architecture: x86_64
- Python: 3.9.17
- Version: v1.7.3 (faiss-gpu @ https://github.com/kyamagu/faiss-wheels/releases/download/v1.7.3/faiss_gpu-1.7.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=1a9755132bb81dc3daecd16e0b5471ddf0246e555a889ea311813f867fdcca88)
Solved. Please refer to facebookresearch/faiss#3062.