training detail about IVFPQ with GPU
Hardcandies opened this issue
Summary
According to https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU#using-multiple-gpus, I understand that Faiss replicates the dataset to all GPUs by default. But I would like to know more about IVFPQ training:
- When the data is replicated to all GPUs, which GPU does Faiss use for training? If every GPU trains its own copy, which index is used in the end?
- When the data is divided equally among the GPUs, does clustering happen only on the part of the data held by each GPU? And again, which index is used in the end?
Platform
OS: Linux
Faiss version: faiss-1.6.2
Running on:
- GPU
Interface:
- Python
On multiple GPUs, the indexes are independent, so the training is duplicated: each GPU runs the same training on the same data.
There are ways to avoid that, but that is the default behavior.
So when the data is divided among the GPUs, is each index trained only on its own part of the data?
Nope, all indexes get the same training data. Since training is reproducible, they will get the same training results.
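The reproducibility point can be illustrated without any GPU: clustering with a fixed seed and the same training data yields bit-identical centroids on every run, so independent replicas end up with the same trained state. A minimal sketch with a toy NumPy k-means (not Faiss's own implementation) standing in for the coarse-quantizer training:

```python
import numpy as np

def kmeans(data, k, n_iter=10, seed=1234):
    # Deterministic init: a seeded RNG picks the initial centroids,
    # so repeated runs on the same data take identical steps.
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dist = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        # Update centroids (keep the old one if a cluster is empty).
        for j in range(k):
            pts = data[assign == j]
            if len(pts):
                centroids[j] = pts.mean(0)
    return centroids

rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 16)).astype("float32")

c1 = kmeans(xb, 8)  # "GPU 0" training
c2 = kmeans(xb, 8)  # "GPU 1" training on the same data
assert np.array_equal(c1, c2)  # identical training results
```

Because each replica sees the same training vectors and the same seed, it does not matter which trained index you pick: they are all equal.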
So even when I use "IndexShards" mode, all indexes still get the same training data?