
Spatially Conditioned Graphs

Official PyTorch implementation for our paper Spatially Conditioned Graphs for Detecting Human-Object Interactions

[Figures: bipartite graph, zoom-in, multi-branch fusion]

Citation

If you find this repository useful for your research, please kindly cite our paper:

@article{zhang2020,
	author = {Frederic Z. Zhang and Dylan Campbell and Stephen Gould},
	title = {Spatially Conditioned Graphs for Detecting Human-Object Interactions},
	journal = {arXiv preprint arXiv:2012.06060},
	year = {2020}
}

Table of Contents

- Prerequisites
- Data Utilities
- Testing
- Training
- Contact

Prerequisites

  1. Download the repository with git clone https://github.com/fredzzhang/spatially-conditioned-graphs
  2. Install the lightweight deep learning library Pocket
  3. Make sure the environment you created for Pocket is activated. You are good to go!
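As a quick sanity check, the following minimal sketch (assuming Pocket is importable as pocket) should run without errors in the activated environment:

import torch
import pocket  # the library installed in step 2; a successful import confirms the setup

print('PyTorch:', torch.__version__)
print('CUDA available:', torch.cuda.is_available())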

Data Utilities

The HICO-DET and V-COCO repos have been incorporated as submodules for convenience. To download relevant data utilities, run the following commands.

cd /path/to/spatially-conditioned-graphs
git submodule init
git submodule update
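If in doubt, a quick check (run from the repository root) confirms the submodule directories are populated:

from pathlib import Path

# Empty directories usually mean `git submodule update` has not been run
for sub in ('hicodet', 'vcoco'):
    print(sub, 'populated:', any(Path(sub).iterdir()))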

HICO-DET

  1. Download the HICO-DET dataset
    1. If you have not downloaded the dataset before, run the following script
    cd /path/to/spatially-conditioned-graphs/hicodet
    bash download.sh
    2. If you have previously downloaded the dataset, simply create a soft link
    cd /path/to/spatially-conditioned-graphs/hicodet
    ln -s /path/to/hico_20160224_det ./hico_20160224_det
  2. Run a Faster R-CNN pre-trained on MS COCO to generate detections
cd /path/to/spatially-conditioned-graphs/hicodet/detections
python preprocessing.py --partition train2015
python preprocessing.py --partition test2015
  3. Generate ground truth detections (optional)
cd /path/to/spatially-conditioned-graphs/hicodet/detections
python generate_gt_detections.py --partition test2015
  4. Download fine-tuned detections (optional)
cd /path/to/spatially-conditioned-graphs/download
bash download_finetuned_detections.sh

To attempt fine-tuning yourself, refer to the instructions in the HICO-DET repository. The checkpoint of our fine-tuned detector can be found here.
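The preprocessing scripts above store detection results per image. As a minimal sketch for inspecting one file (the JSON layout and the boxes, labels and scores fields are assumptions; verify against the actual output on your machine):

import json
from pathlib import Path

# Assumed output location of preprocessing.py --partition test2015
det_dir = Path('hicodet/detections/test2015')
sample = next(det_dir.glob('*.json'))

with open(sample) as f:
    det = json.load(f)

# Assumed fields: boxes, labels, scores
print(sample.name, {k: len(v) for k, v in det.items()})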

V-COCO

  1. Download the train2014 and val2014 partitions of the COCO dataset
    1. If you have not downloaded the dataset before, run the following script
    cd /path/to/spatially-conditioned-graphs/vcoco
    bash download.sh
    2. If you have previously downloaded the dataset, simply create a soft link. Note that the soft link must be named mscoco2014.
    cd /path/to/spatially-conditioned-graphs/vcoco
    ln -s /path/to/coco ./mscoco2014
  2. Run a Faster R-CNN pre-trained on MS COCO to generate detections
cd /path/to/spatially-conditioned-graphs/vcoco/detections
python preprocessing.py --partition trainval
python preprocessing.py --partition test
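To verify the images are in place before preprocessing, a minimal sketch (assuming the standard COCO layout with train2014 and val2014 sub-directories under the soft link):

from pathlib import Path

root = Path('vcoco/mscoco2014')
for split in ('train2014', 'val2014'):
    n_images = len(list((root / split).glob('*.jpg')))
    print(f'{split}: {n_images} images')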

Testing

HICO-DET

  1. Download the checkpoint of our trained model
cd /path/to/spatially-conditioned-graphs/download
bash download_checkpoint.sh
  2. Test a model
cd /path/to/spatially-conditioned-graphs
CUDA_VISIBLE_DEVICES=0 python test.py --model-path checkpoints/scg_1e-4_b32h16e7_hicodet_e2e.pt

By default, detections from a pre-trained detector are used. To change the source of detections, use the argument --detection-dir, e.g. --detection-dir hicodet/detections/test2015_gt to select ground truth detections. Fine-tuned detections (if you downloaded them) are available under hicodet/detections.

  3. Cache detections for MATLAB evaluation following HO-RCNN (optional)
cd /path/to/spatially-conditioned-graphs
CUDA_VISIBLE_DEVICES=0 python cache.py --model-path checkpoints/scg_1e-4_b32h16e7_hicodet_e2e.pt

By default, 80 .mat files, one for each object class, will be cached in a directory named matlab. Use the --cache-dir argument to change the cache directory. To change sources of detections, refer to the use of --detection-dir in the previous section.
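To spot-check a cached file, a minimal sketch (the file name is hypothetical and the variable names inside are not documented here, so inspect the keys first):

from scipy.io import loadmat

# Hypothetical file name; one .mat file is cached per object class
mat = loadmat('matlab/detections_01.mat')
print([k for k in mat if not k.startswith('__')])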

As a reference, the performance of the provided model is shown in the table below. Each triplet is the mAP (%) on the full, rare and non-rare sets respectively.

Detections                   | Default Setting        | Known Object Setting
Pre-trained on MS COCO       | (21.85, 18.11, 22.97)  | (25.53, 21.79, 26.64)
Fine-tuned on HICO-DET (DRG) | (31.33, 24.72, 33.31)  | (34.37, 27.18, 36.52)
Ground truth detections      | (51.53, 41.02, 54.67)  | (51.75, 41.40, 54.84)

V-COCO

We did not implement evaluation utilities for V-COCO; instead, we use the utilities provided by Gupta. To generate the required pickle file, run the following script, specifying the path to a trained model with --model-path.

cd /path/to/spatially-conditioned-graphs
CUDA_VISIBLE_DEVICES=0 python cache.py --dataset vcoco --data-root vcoco \
    --detection-dir vcoco/detections/test \
    --cache-dir vcoco_cache --partition test \
    --model-path /path/to/a/model

This will generate a file named vcoco_results.pkl under vcoco_cache in the current directory. Please refer to the v-coco repo (not to be confused with vcoco, the submodule) for further instructions. Note that loading the pickle file requires a particular class CacheTemplate, which is shown below in its entirety.

from collections import defaultdict

class CacheTemplate(defaultdict):
    """A template for VCOCO cached results."""
    def __init__(self, **kwargs):
        super().__init__()
        for k, v in kwargs.items():
            self[k] = v

    def __missing__(self, k):
        seg = k.split('_')
        # Assign zero score to missing actions
        if seg[-1] == 'agent':
            return 0.
        # Assign zero score and a tiny box to missing <action,role> pairs
        else:
            return [0., 0., .1, .1, 0.]

You can either add it into the evaluation code or save it as a separate file to import from.
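A minimal loading sketch, assuming the loading script defines (or imports) CacheTemplate exactly as above:

import pickle

# CacheTemplate must be in scope before unpickling
with open('vcoco_cache/vcoco_results.pkl', 'rb') as f:
    results = pickle.load(f)

print(type(results))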

Training

HICO-DET

cd /path/to/spatially-conditioned-graphs
python main.py --world-size 8 &>log &

Specify the number of GPUs to use with the argument --world-size. The default sub-batch size is 4 (per GPU). The provided model was trained with 8 GPUs, for an effective batch size of 32. Reducing the effective batch size could result in slightly inferior performance. The default learning rate for a batch size of 32 is 0.0001. As a rule of thumb, scale the learning rate proportionally when changing the batch size, e.g. 0.00005 for a batch size of 16 (see the sketch below).

It is recommended to redirect stdout and stderr to a file to save the training log (as indicated by &>log). To check the progress, run cat log | grep mAP, or go through the log with vim log. Note that the logged mAP follows a slightly different protocol: it does NOT necessarily correlate with the mAP that the community reports and only serves as a diagnostic tool. The true performance of the model requires running a separate test as shown in the previous section.

By default, checkpoints will be saved under checkpoints in the current directory. For more arguments, run python main.py --help. We follow the early-stopping strategy and have concluded, using a validation set split from the training set, that the model at epoch 7 should be picked. Training on 8 GeForce GTX TITAN X devices takes about 5 hours.
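As an illustration of the learning-rate scaling rule mentioned above (scaled_lr is a hypothetical helper, not part of the repository):

def scaled_lr(batch_size, base_lr=1e-4, base_batch=32):
    """Scale the learning rate linearly with the effective batch size."""
    return base_lr * batch_size / base_batch

print(scaled_lr(32))  # 0.0001, the default
print(scaled_lr(16))  # 5e-05, matching the example above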

V-COCO

Code for V-COCO is being cleaned up at the moment. Instructions will be released soon.

Contact

If you have any questions regarding our paper or the repo, please post them in discussions. If you run into issues related to the code, feel free to open an issue. Alternatively, you can contact me at frederic.zhang@anu.edu.au.


License

BSD 3-Clause "New" or "Revised" License

