xinyandai / similarity-search

A framework for index-based similarity search.

similarity-search

License: MIT

A general framework for similarity search.

To reproduce the results of our NIPS paper, please refer to MIPS.

make all

git clone https://github.com/xinyandai/similarity-search.git
cd similarity-search/src
mkdir build
cd build
cmake ..
make all -j
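
The run commands in this README read their vectors from `.fvecs` files. As a rough sketch for inspecting such files, and assuming they follow the widely used layout in which every record is a little-endian 32-bit integer dimension followed by that many 32-bit floats, a small Python reader could look like the following. The helper names `read_fvecs` and `write_fvecs` are hypothetical and are not part of this repository.

```python
import struct


def read_fvecs(path):
    """Read an .fvecs file into a list of float lists.

    Assumed record layout: int32 dimension d, then d float32 values,
    all little-endian.
    """
    vectors = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            if dim <= 0:
                raise ValueError("non-positive dimension in header")
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors


def write_fvecs(path, vectors):
    """Write vectors in the same assumed layout (handy for small test files)."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack("<i", len(v)))
            f.write(struct.pack(f"<{len(v)}f", *v))
```

Writing a tiny file with `write_fvecs` and reading it back with `read_fvecs` is a quick way to check the layout before pointing the reader at a real dataset.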

Run PQ-IMI

./pq \
    -t ../../data/audio/audio_base.fvecs  \
    -b ../../data/audio/audio_base.fvecs  \
    -q ../../data/audio/audio_query.fvecs \
    -g ../../data/audio/20_audio_euclid_groundtruth.lshbox \
    --iteration 20 \
    --kmeans_centers 20 \
    --num_codebook 3
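
For readers unfamiliar with product quantization (PQ), the sketch below illustrates the general idea: each vector is split into sub-vectors, every sub-space gets its own k-means codebook, and a vector is stored as one centroid index per sub-space. The parameter names mirror the flags above, but that mapping is an assumption, and the code is not taken from this repository. It covers only PQ encoding and asymmetric distance computation, not the inverted multi-index (IMI) traversal.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, num_codebook):
    """Split vec into num_codebook contiguous sub-vectors of near-equal length."""
    n = len(vec)
    bounds = [i * n // num_codebook for i in range(num_codebook + 1)]
    return [vec[bounds[i]:bounds[i + 1]] for i in range(num_codebook)]


def kmeans(points, k, iterations, rng):
    """Plain Lloyd iterations; centers start from k randomly sampled points."""
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centers


def train_pq(train, num_codebook, kmeans_centers, iteration, seed=0):
    """Train one codebook per sub-space."""
    rng = random.Random(seed)
    parts = [split(v, num_codebook) for v in train]
    return [
        kmeans([p[m] for p in parts], kmeans_centers, iteration, rng)
        for m in range(num_codebook)
    ]


def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = split(vec, len(codebooks))
    return [
        min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
        for sub, cb in zip(subs, codebooks)
    ]


def adc_distance(query, code, codebooks):
    """Asymmetric distance: exact query sub-vectors vs. quantized base vector."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(s, cb[c]) for s, cb, c in zip(subs, codebooks, code))
```

A search then encodes every base vector once and ranks the codes by `adc_distance` to the query, so the base vectors themselves never need to be kept in memory.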

Run Sign Random Projection

./srp \
    -t ../../data/audio/audio_base.fvecs  \
    -b ../../data/audio/audio_base.fvecs  \
    -q ../../data/audio/audio_query.fvecs \
    -g ../../data/audio/20_angular_euclid_groundtruth.lshbox \
    --num_bit 32
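
As a minimal illustration of sign random projection, the sketch below draws one Gaussian direction per bit and records whether the input falls on the non-negative side of each hyperplane. It is not this repository's implementation; the function names are hypothetical, and reading `--num_bit` as the number of hyperplanes is an assumption.

```python
import random


def make_srp(dim, num_bit, seed=0):
    """Draw num_bit random Gaussian directions of length dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bit)]


def srp_hash(vec, planes):
    """Pack one sign bit per hyperplane into a single integer code."""
    code = 0
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")
```

Because only the sign of each projection is kept, the code depends on the direction of a vector and not on its length, which is why this family is typically paired with angular similarity. Candidates are ranked by `hamming` distance between codes.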

Run Cross Polytope LSH

./cross_polytope \
    -t ../../data/audio/audio_base.fvecs  \
    -b ../../data/audio/audio_base.fvecs  \
    -q ../../data/audio/audio_query.fvecs \
    -g ../../data/audio/20_angular_euclid_groundtruth.lshbox \
    --kmeans_centers 32 \
    --num_bit 1
# kmeans_centers represents d' and num_bit represents the number of hash tables
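
To make the roles of d' and the number of hash tables concrete, here is a hedged sketch of cross-polytope hashing. It assumes d' is the dimension a vector is projected to before hashing, uses a dense Gaussian projection, and maps the projected vector to the closest signed standard basis vector. The function names are hypothetical and the repository's code may differ, for example by using fast pseudo-random rotations.

```python
import random


def make_tables(dim, d_prime, num_tables, seed=0):
    """One d_prime x dim Gaussian projection matrix per hash table."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(d_prime)]
        for _ in range(num_tables)
    ]


def cp_hash(vec, matrix):
    """Hash to the nearest vertex of the cross-polytope {+e_i, -e_i}.

    The nearest vertex is given by the coordinate of largest magnitude
    after projection, together with the sign of that coordinate.
    Returns an integer in [0, 2 * d_prime).
    """
    projected = [sum(a * x for a, x in zip(row, vec)) for row in matrix]
    i = max(range(len(projected)), key=lambda j: abs(projected[j]))
    return 2 * i + (1 if projected[i] < 0 else 0)


def cp_hashes(vec, tables):
    """One bucket id per hash table."""
    return tuple(cp_hash(vec, m) for m in tables)
```

Normalizing the projected vector is unnecessary here because the index of the largest-magnitude coordinate does not change under positive scaling.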

Run Iterative Quantization

./itq \
    -t ../../data/audio/audio_base.fvecs  \
    -b ../../data/audio/audio_base.fvecs  \
    -q ../../data/audio/audio_query.fvecs \
    -g ../../data/audio/20_audio_euclid_groundtruth.lshbox \
    --num_bit 32

Run E2LSH

./e2lsh \
    -t ../../data/audio/audio_base.fvecs  \
    -b ../../data/audio/audio_base.fvecs  \
    -q ../../data/audio/audio_query.fvecs \
    -g ../../data/audio/20_audio_euclid_groundtruth.lshbox \
    --num_bit 32
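
For reference, the classic E2LSH hash function for Euclidean distance is h(x) = floor((a·x + b) / w), where a is a Gaussian vector, b is uniform in [0, w), and w is the bucket width. The sketch below concatenates several such functions. Treating `--num_bit` as the number of concatenated functions is an assumption, the bucket width `w` is a hypothetical parameter not shown in the command above, and the code is illustrative only.

```python
import math
import random


def make_e2lsh(dim, num_hash, w, seed=0):
    """Draw num_hash (direction, offset) pairs for bucket width w."""
    rng = random.Random(seed)
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
        for _ in range(num_hash)
    ]


def e2lsh_hash(vec, funcs, w):
    """Evaluate floor((a . x + b) / w) for every (a, b) pair."""
    return tuple(
        math.floor((sum(a * x for a, x in zip(direction, vec)) + offset) / w)
        for direction, offset in funcs
    )
```

A larger `w` makes nearby points more likely to share a bucket at the cost of more false candidates, so it is usually tuned per dataset.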

Reference

PQ based method for gradient quantization

PQ based method for similarity search

Norm-Ranging LSH for Maximum Inner Product Search

Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)
