dami23 / WREG_Self_KD

PyTorch code for "Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation". The paper has been submitted to IROS 2023.

Preliminary

Please refer to MattNet to install mask-faster-rcnn, REFER and refer-parser2, then follow Steps 1 and 2 of its Training section to prepare the data and features.
Please follow the steps in DTWREG to obtain the parsed discriminative triads.

The experiments were conducted on a single GPU (NVIDIA RTX A6000).

python == 3.7.13
pytorch == 1.10
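
The version pins above can be checked before training with a short script. This is a minimal sketch, not part of the repository: the names version_matches and check_environment are hypothetical, and it only compares the interpreter and the installed torch package against the two versions listed above.

```python
import sys

EXPECTED_PYTHON = (3, 7)   # from "python == 3.7.13" above
EXPECTED_TORCH = "1.10"    # from "pytorch == 1.10" above


def version_matches(installed, expected_prefix):
    """Return True if `installed` has the same leading components as `expected_prefix`.

    Local build tags such as "+cu113" are ignored, so "1.10.0+cu113" matches "1.10".
    """
    parts = installed.split("+")[0].split(".")
    want = expected_prefix.split(".")
    return parts[:len(want)] == want


def check_environment():
    """Collect human-readable mismatches; an empty list means the pins are met."""
    problems = []
    if sys.version_info[:2] != EXPECTED_PYTHON:
        found = ".".join(str(n) for n in sys.version_info[:3])
        problems.append("python %s found, 3.7.x expected" % found)
    try:
        import torch  # imported lazily so the check also works without torch
    except ImportError:
        problems.append("torch is not installed")
    else:
        if not version_matches(torch.__version__, EXPECTED_TORCH):
            problems.append("torch %s found, %s expected" % (torch.__version__, EXPECTED_TORCH))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the listed versions"]:
        print(line)
```

A mismatch is reported rather than raised, since other versions may still work; the README only states which versions were used.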

Feature Encoding

Follow the feature extraction procedure in MattNet.
Extract the ann_pool5 and ann_fc7 features using Python 2.7 and PyTorch 0.4.1:

CUDA_VISIBLE_DEVICES={GPU_ID} python ./tools/extract_mrcn_ann_fc7_feats.py --dataset {DATASET} --splitBy {SPLITBY}

Training and evaluation

training

CUDA_VISIBLE_DEVICES={GPU_ID} python ./tools/train_skd.py --dataset {DATASET} --splitBy {SPLITBY} --exp_id {EXP_ID}

evaluation

CUDA_VISIBLE_DEVICES={GPU_ID} python ./tools/eval.py --dataset {DATASET} --splitBy {SPLITBY} --split {SPLIT} --id {EXP_ID}

{DATASET} is one of refcoco, refcoco+ or refcocog. {SPLITBY} is unc for refcoco and refcoco+, and google for refcocog.
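
The dataset/splitBy pairing above is easy to get wrong when typing commands by hand. The following is a minimal sketch of a helper that fills in the command templates from this README. The helper name build_command, the SPLIT_BY table and the experiment id "my_exp" are assumptions made for illustration; only the script paths and flags are taken from the commands above. It builds the command strings without running them.

```python
import shlex

# Pairing stated in this README: unc for refcoco and refcoco+, google for refcocog.
SPLIT_BY = {"refcoco": "unc", "refcoco+": "unc", "refcocog": "google"}


def build_command(script, dataset, gpu_id=0, **flags):
    """Return a shell command string for one of the ./tools scripts.

    `flags` are appended as "--name value" in the order given,
    e.g. exp_id="my_exp" becomes "--exp_id my_exp".
    """
    if dataset not in SPLIT_BY:
        raise ValueError("unknown dataset: %r" % dataset)
    args = ["python", script, "--dataset", dataset, "--splitBy", SPLIT_BY[dataset]]
    for name, value in flags.items():
        args += ["--" + name, str(value)]
    # Quote each argument so unusual experiment ids stay shell-safe.
    command = " ".join(shlex.quote(arg) for arg in args)
    return "CUDA_VISIBLE_DEVICES=%s %s" % (gpu_id, command)


if __name__ == "__main__":
    # "my_exp" is a hypothetical experiment id.
    for dataset in SPLIT_BY:
        print(build_command("./tools/train_skd.py", dataset, gpu_id=0, exp_id="my_exp"))
```

The same helper covers feature extraction (./tools/extract_mrcn_ann_fc7_feats.py with no extra flags) and evaluation (./tools/eval.py with split and id passed as keyword arguments).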

The results obtained under different settings are listed in output/easy_results.txt.

Pretrained Models

All models trained with the proposed approach can be downloaded here.

Acknowledgement

The code is based on DTWREG.

About

MIT License
