yuanli2333 / Hadamard-Matrix-for-hashing

CVPR2020: Central Similarity Quantization/Hashing for Efficient Image and Video Retrieval


Code for the paper: Central Similarity Quantization for Efficient Image and Video Retrieval, arXiv

We release all code and configurations for image hashing.

Update: The video hashing code is now available here

Prerequisites

Ubuntu 16.04

NVIDIA GPU + CUDA and the corresponding PyTorch version (v0.4.1)

Python 3.6

Datasets

  1. Download the database file for the ImageNet retrieval list from the anonymous link here, and put database.txt in 'data/imagenet/'.

  2. Download MS COCO, ImageNet2012, and NUS_WIDE from their official websites: COCO, ImageNet, NUS_WIDE. Unzip all data and put it in 'data/dataset_name/' (the expected layout is sketched below).
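After these two steps, the data directory should look roughly like the sketch below (only the files and folders mentioned in this README are shown; the contents of each dataset folder follow the official downloads):

```
data/
├── imagenet/
│   ├── database.txt
│   ├── hash_centers/
│   └── models/        (created during training)
├── coco/
│   └── models/
└── nus_wide/
    └── models/
```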

Hash center (target)

We provide the hash centers used for ImageNet in 'data/imagenet/hash_centers'. The method to generate hash centers is given in the tutorial: Tutorial_hash_center_generation.ipynb
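For intuition, the construction takes rows of a Hadamard matrix (and their negations) as hash centers, since those rows are mutually separated by a Hamming distance of K/2. The sketch below is an illustrative re-implementation of that idea, not the notebook's exact code; it assumes SciPy is available, that the bit length is a power of two, and uses Bernoulli(0.5) sampling when more centers are needed than the Hadamard rows provide.

```python
import numpy as np
from scipy.linalg import hadamard  # requires n_bits to be a power of 2

def generate_hash_centers(n_classes, n_bits, seed=0):
    """Illustrative sketch: build {0,1}-valued hash centers for n_classes classes.

    Rows of a Hadamard matrix and their negations give up to 2 * n_bits
    candidates that are mutually separated by a Hamming distance of n_bits / 2;
    any remaining centers are sampled from a Bernoulli(0.5) distribution.
    """
    H = hadamard(n_bits)                      # entries in {+1, -1}
    candidates = np.vstack([H, -H])           # up to 2 * n_bits candidate rows
    rng = np.random.default_rng(seed)
    if n_classes <= len(candidates):
        centers = candidates[:n_classes]
    else:
        extra = rng.choice([-1, 1], size=(n_classes - len(candidates), n_bits))
        centers = np.vstack([candidates, extra])
    return (centers + 1) // 2                 # map {-1, +1} -> {0, 1}

# e.g. 100 classes with 64-bit codes
centers = generate_hash_centers(100, 64)
```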

Test

Pretrained models are available on Google Drive, or you can download them directly from the release.

Generating hash codes for the database will take a long time because of the database's large size.
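Concretely, the database codes come from a forward pass of the trained network over every database image, followed by binarization with the sign function. A minimal sketch of that step (the function name and data loader are illustrative, not taken from test.py) is:

```python
import torch

@torch.no_grad()
def compute_binary_codes(model, data_loader, device="cuda"):
    """Illustrative sketch: run the hashing network over a dataset and
    binarize its outputs with the sign function."""
    model.eval()
    codes, labels = [], []
    for images, targets in data_loader:
        outputs = model(images.to(device))     # K-dimensional continuous codes
        codes.append(torch.sign(outputs).cpu())
        labels.append(targets)
    return torch.cat(codes), torch.cat(labels)
```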

Test for imagenet:

Download the pre-trained model 'imagenet_64bit_0.8734_resnet50.pkl', put it in 'data/imagenet/', then run:

python test.py --data_name imagenet --gpus 0,1  --R 1000  --model_name 'imagenet_64bit_0.8734_resnet50.pkl' 

Test for coco:

Download the pre-trained model 'coco_64bit_0.8612_resnet50.pkl', put it in 'data/coco/', then run:

python test.py --data_name coco --gpus 0,1  --R 5000  --model_name 'coco_64bit_0.8612_resnet50.pkl' 

Test for nus_wide:

Download the pre-trained model 'nus_wide_64bit_0.8391_resnet50.pkl', put it in 'data/nus_wide/', then run:

python test.py --data_name nus_wide --gpus 0,1  --R 5000  --model_name 'nus_wide_64bit_0.8391_resnet50.pkl' 

The retrieval MAP on the three datasets is shown below:

| Dataset  | MAP(16bit) | MAP(32bit) | MAP(64bit) |
| -------- | ---------- | ---------- | ---------- |
| ImageNet | 0.851      | 0.865      | 0.873      |
| MS COCO  | 0.796      | 0.838      | 0.861      |
| NUS_WIDE | 0.810      | 0.825      | 0.839      |
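Here MAP is computed over the top R retrieved items per query (--R 1000 for ImageNet and --R 5000 for COCO and NUS_WIDE above). The NumPy sketch below illustrates MAP@R for single-label data with ±1 codes; it is written for exposition and is not copied from test.py (multi-label datasets such as COCO and NUS_WIDE instead count an item as relevant if it shares at least one label with the query):

```python
import numpy as np

def mean_average_precision(query_codes, db_codes, query_labels, db_labels, R):
    """Illustrative MAP@R: rank the database by Hamming distance for each query
    and average the precision at the positions of relevant items in the top R."""
    aps = []
    for q_code, q_label in zip(query_codes, query_labels):
        # Hamming distance for +/-1 codes: (K - <q, d>) / 2
        dists = 0.5 * (db_codes.shape[1] - db_codes @ q_code)
        top_r = np.argsort(dists)[:R]
        relevant = (db_labels[top_r] == q_label).astype(np.float32)
        if relevant.sum() == 0:
            continue
        precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        aps.append((precision_at_k * relevant).sum() / relevant.sum())
    return float(np.mean(aps))
```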

Train

Train on ImageNet, hash bits: 64

The trained model will be saved in 'data/imagenet/models/'.

python train.py --data_name imagenet --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05  --R 1000

Train on COCO, hash bits: 64

The trained model will be saved in 'data/coco/models/'.

python train.py --data_name coco --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05 --multi_lr 0.05  --R 5000

Train on NUS_WIDE, hash bits: 64

The trained model will be saved in 'data/nus_wide/models/'.

python train.py --data_name nus_wide --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05  --multi_lr 0.05 --R 5000
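The training objective is a central similarity loss that pulls each K-bit output toward its class's hash center, plus regularization weighted by --lambda1 and --lambda2; exactly which term each flag weights is defined in train.py, so the PyTorch sketch below is only a hedged illustration of this kind of objective (binary cross-entropy against the 0/1 center plus a quantization penalty), with illustrative names:

```python
import torch
import torch.nn.functional as F

def central_similarity_loss(outputs, centers, lambda_q=0.05):
    """Hedged sketch of a CSQ-style objective (not copied from train.py).

    outputs: (B, K) tanh-activated hash-layer outputs in (-1, 1)
    centers: (B, K) float tensor of hash centers in {0, 1}, one per sample's class
    lambda_q: weight of the quantization penalty (cf. --lambda1/--lambda2)
    """
    # Central similarity term: binary cross-entropy between the rescaled
    # outputs and the 0/1 hash centers.
    probs = (0.5 * (outputs + 1.0)).clamp(1e-6, 1.0 - 1e-6)
    central = F.binary_cross_entropy(probs, centers)
    # Quantization term: push every coordinate toward exactly +1 or -1.
    quant = (outputs.abs() - 1.0).pow(2).mean()
    return central + lambda_q * quant
```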

AlexNet as backbone.

Pretrained AlexNet models are available here. Pretrained models for COCO will be released in the future.

The retrieval MAP on ImageNet and NUS_WIDE is shown below:

| Dataset  | MAP(16bit) | MAP(32bit) | MAP(64bit) |
| -------- | ---------- | ---------- | ---------- |
| ImageNet | 0.601      | 0.653      | 0.695      |
| NUS_WIDE | 0.744      | 0.785      | 0.789      |

Train on ImageNet, 16bit

python train.py --data_name imagenet --hash_bit 16 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on ImageNet, 32bit

python train.py --data_name imagenet --hash_bit 32 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on ImageNet, 64bit

python train.py --data_name imagenet --hash_bit 64 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.0001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 16bit

python train.py --data_name nus_wide --hash_bit 16 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 32bit

python train.py --data_name nus_wide --hash_bit 32 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 64bit

python train.py --data_name nus_wide --hash_bit 64 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Reference

If you find this repo useful, please consider citing:

@inproceedings{yuan2020central,
  title={Central Similarity Quantization for Efficient Image and Video Retrieval},
  author={Yuan, Li and Wang, Tao and Zhang, Xiaopeng and Tay, Francis EH and Jie, Zequn and Liu, Wei and Feng, Jiashi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3083--3092},
  year={2020}
}


License: MIT

