una-dinosauria / Rayuela.jl

Code for my PhD thesis. Library of quantization-based methods for fast similarity search in high dimensions. Presented at ECCV 18.


LSQ++ in 16x4 (nbits=4) - Does NOT scale up to large training sets

k-amara opened this issue · comments

Hello,

I have run LSQ++ with M=16 codebooks (number of subspaces) and codes encoded with nbits=4 (16 entries per codebook) on BigANN1M and Deep1M. When I increase the size of the training set, I observe a drop in recall (@1, @10, @100) on both datasets. Please find attached plots that illustrate the problem.
[Attached screenshots (2021-08-12): recall vs. training-set size for BigANN1M and Deep1M]

For LSQ++ I used the FAISS implementation (faiss.LocalSearchQuantizer(d, M, nbits)). @mdouze
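
Roughly, the setup looks like the sketch below (synthetic data stands in for BigANN1M/Deep1M and the sizes are scaled down; it assumes a recent FAISS build where faiss.LocalSearchQuantizer exposes train / compute_codes / decode through the Python wrappers):

```python
import numpy as np
import faiss

d, M, nbits = 128, 16, 4  # the 16x4 setting: M=16 codebooks, 4 bits (16 entries) each

rng = np.random.default_rng(0)
xt = rng.standard_normal((100_000, d), dtype=np.float32)  # training set (vary its size)
xb = rng.standard_normal((50_000, d), dtype=np.float32)   # database
xq = rng.standard_normal((1_000, d), dtype=np.float32)    # queries

# Exact ground truth on the raw database.
flat = faiss.IndexFlatL2(d)
flat.add(xb)
_, gt = flat.search(xq, 1)  # true nearest neighbor of each query

lsq = faiss.LocalSearchQuantizer(d, M, nbits)
lsq.train(xt)

# Encode the database, reconstruct it, and search the reconstructions exactly
# (a brute-force stand-in for ADC search).
codes = lsq.compute_codes(xb)
xb_rec = lsq.decode(codes)

index = faiss.IndexFlatL2(d)
index.add(xb_rec)
_, I = index.search(xq, 100)

def recall_at(k):
    # fraction of queries whose true nearest neighbor appears in the top-k results
    return float((I[:, :k] == gt).any(axis=1).mean())

for k in (1, 10, 100):
    print(f"recall@{k}: {recall_at(k):.4f}")
```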

Have you experienced this issue when testing LSQ++16x4?
I did a grid search on niter_train and niter_ils_train but observed no difference in the drop...
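
The grid search was essentially of the form sketched below, reusing d, M, nbits, the data and the recall evaluation from the snippet above; train_iters and train_ils_iters are my guess at the FAISS attribute names corresponding to niter_train / niter_ils_train, and the exact names may differ across FAISS versions:

```python
# Hypothetical grid over the LSQ++ training-iteration parameters.
for n_train in (10, 25, 50):
    for n_ils in (4, 8, 16):
        lsq = faiss.LocalSearchQuantizer(d, M, nbits)
        lsq.train_iters = n_train      # assumed attribute name (niter_train above)
        lsq.train_ils_iters = n_ils    # assumed attribute name (niter_ils_train above)
        lsq.train(xt)
        codes = lsq.compute_codes(xb)
        # ... re-run the reconstruction + exact search + recall@k evaluation here ...
```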

Cheers
@k-amara

Hello @k-amara,

This seems very strange. I remember trying out larger training sets during my PhD, and I did not observe drops in recall -- and definitely not dramatic ones like the ones you've shared.

Does this happen with the implementation in Rayuela too? If it doesn't, then it's probably a bug in the FAISS implementation.

Cheers,