Introduction

This is the official implementation of the BMVC 2020 oral paper "Weakly supervised cross-domain alignment with optimal transport".

Code largely borrowed from code.

Requirements and Installation

We recommend the following dependencies.

Python 3.6
PyTorch 0.4.0
NumPy (>1.12.1)
TensorBoard
Punkt Sentence Tokenizer:

import nltk
nltk.download()
> d punkt

nltk stopword package
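
As a rough sketch, the dependency list above can be checked from Python before training. The import names used here (in particular `tensorboard`) are assumptions and may differ from what the scripts actually import. `download_nltk_resources` uses `nltk.download` with explicit resource names to fetch the Punkt tokenizer and the stopword list without the interactive prompt.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def download_nltk_resources():
    """Fetch the Punkt tokenizer and the stopword list non-interactively."""
    import nltk  # imported lazily so missing_modules works even if nltk is absent

    for resource in ("punkt", "stopwords"):
        nltk.download(resource)


if __name__ == "__main__":
    # Assumed import names for the packages listed above.
    missing = missing_modules(["torch", "numpy", "nltk", "tensorboard"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Once `nltk` is installed, calling `download_nltk_resources()` once should be enough; the resources are cached in the local NLTK data directory.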

Download data

The precomputed image features of MS-COCO are from here. The precomputed image features of Flickr30K are extracted from the raw Flickr30K images using the bottom-up attention model from here. All the data needed for reproducing the experiments in the paper, including image features and vocabularies, can be downloaded from:

wget https://scanproject.blob.core.windows.net/scan-data/data.zip
wget https://scanproject.blob.core.windows.net/scan-data/vocab.zip

We refer to the path of the files extracted from data.zip as $DATA_PATH. Extract the files from vocab.zip into the ./vocab directory; the training commands pass this location as $VOCAB_PATH. Alternatively, you can run vocab.py to produce the vocabulary files. For example:

python vocab.py --data_path data --data_name f30k_precomp
python vocab.py --data_path data --data_name coco_precomp
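
The extraction step described above can also be done from Python with the standard `zipfile` module. This is a minimal sketch: the archive locations and target directories are placeholders, and the internal layout of data.zip and vocab.zip is not assumed, so check where the files land before setting $DATA_PATH.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the sorted member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Hypothetical locations: adjust to wherever wget saved the archives.
    for archive_name, target_dir in (("data.zip", "."), ("vocab.zip", ".")):
        if Path(archive_name).exists():
            members = extract_archive(archive_name, target_dir)
            print(f"{archive_name}: extracted {len(members)} entries into {target_dir}")
```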

Training new models

Run train_OT.py:

python train_OT.py --data_path "$DATA_PATH" --data_name f30k_precomp --vocab_path "$VOCAB_PATH" --logger_name runs/coco_scan/log --model_name runs/OT/log --max_violation --bi_gru --margin=0.12 --alpha=1.5 --data_type=full --learning_rate=0.0002 --num_epochs=30 --lr_update=15
python train_OT.py --data_path "$DATA_PATH" --data_name coco_precomp --vocab_path "$VOCAB_PATH" --logger_name runs/coco_scan/log --model_name runs/OT/log --max_violation --bi_gru --margin=0.05 --alpha=0.1 --data_type=full
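
When launching several runs, it can help to assemble the command line from a dictionary instead of editing a long shell line. The sketch below rebuilds the Flickr30K command above; `build_train_command` is a hypothetical helper, the `data` and `vocab` paths are placeholders, and it assumes the script accepts `--key value` as well as `--key=value` (as argparse-based scripts do).

```python
import shlex


def build_train_command(script, options, flags=()):
    """Assemble an argument list: script, then --key value pairs, then bare flags."""
    command = ["python", script]
    for key, value in options.items():
        command += [f"--{key}", str(value)]
    command += [f"--{flag}" for flag in flags]
    return command


# Values copied from the Flickr30K command above; the two paths are placeholders.
f30k_options = {
    "data_path": "data",
    "data_name": "f30k_precomp",
    "vocab_path": "vocab",
    "logger_name": "runs/coco_scan/log",
    "model_name": "runs/OT/log",
    "margin": 0.12,
    "alpha": 1.5,
    "data_type": "full",
    "learning_rate": 0.0002,
    "num_epochs": 30,
    "lr_update": 15,
}
f30k_command = build_train_command(
    "train_OT.py", f30k_options, flags=("max_violation", "bi_gru")
)

if __name__ == "__main__":
    # Print a shell-ready line; pass f30k_command to subprocess.run to launch it.
    print(" ".join(shlex.quote(part) for part in f30k_command))
```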

Evaluate trained models

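from vocab import Vocabulary
import evaluation
# Replace $RUN_PATH and $DATA_PATH with real paths: Python does not expand shell variables in strings.
evaluation.evalrank_dot_OT("$RUN_PATH/coco_dot/model_best.pth.tar", data_path="$DATA_PATH", split="test")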

To do cross-validation on MS-COCO, pass fold5=True to the evaluation call, using a model trained with --data_name coco_precomp.
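
Because the evaluation snippet is Python rather than shell, the `$RUN_PATH` and `$DATA_PATH` placeholders have to be resolved explicitly. One possible sketch reads them from environment variables; `resolve_eval_paths` and `evaluate` are hypothetical helpers, and the `coco_dot/model_best.pth.tar` location is taken from the example call above.

```python
import os


def resolve_eval_paths(run_path_var="RUN_PATH", data_path_var="DATA_PATH"):
    """Build the checkpoint and data paths from environment variables."""
    run_path = os.environ[run_path_var]  # raises KeyError if the variable is unset
    data_path = os.environ[data_path_var]
    checkpoint = os.path.join(run_path, "coco_dot", "model_best.pth.tar")
    return checkpoint, data_path


def evaluate(fold5=False):
    """Call the repository's evaluation entry point with the resolved paths."""
    # Repository modules, imported here so this file loads outside the repo too.
    from vocab import Vocabulary  # noqa: F401  (imported as in the snippet above)
    import evaluation

    checkpoint, data_path = resolve_eval_paths()
    evaluation.evalrank_dot_OT(
        checkpoint, data_path=data_path, split="test", fold5=fold5
    )
```

With both variables exported in the shell, `evaluate()` runs the standard test split and `evaluate(fold5=True)` runs the MS-COCO cross-validation.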

Reference

Please consider citing our paper if you refer to this code in your research.

@article{yuan2020weakly,
  title={Weakly supervised cross-domain alignment with optimal transport},
  author={Yuan, Siyang and Bai, Ke and Chen, Liqun and Zhang, Yizhe and Tao, Chenyang and Li, Chunyuan and Wang, Guoyin and Henao, Ricardo and Carin, Lawrence},
  journal={arXiv preprint arXiv:2008.06597},
  year={2020}
}

About

Apache License 2.0
