Extending the MOT Neural Solver to Exploit Body Joints for MOT
General Setup
For the general setup (data download, ReID weights, etc.), please refer to the original MOT Neural Solver (here).
Setup
Shell Environments
In the following, multiple shell environments are used to set up the project; each command prompt starts with a different character:
- $ - only the normal environment (any user) is needed
- # - the root environment is needed, e.g. with sudo
- > - the virtual environment is needed; it can be entered using $ pipenv shell
Virtual Environment
We have used pipenv to set up and manage a virtual environment. To install it, make sure you have pip or pip3 installed and type:
$ which pip3
$ pip3 install pipenv
This code has been tested with python3 version 3.7+. To make sure that the python versions match, use pyenv.
$ which pyenv
If you do not have it, please install it. The installation process differs from OS to OS; e.g., for Arch Linux use:
# pacman -S pyenv
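With pyenv installed, you can, for example, install and select a matching interpreter before creating the environment (the exact patch version, 3.7.9 here, is only an example):
$ pyenv install 3.7.9
$ pyenv local 3.7.9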
Now, setup the virtual environment and install its dependencies using:
$ pipenv install --python 3.7
This installs all dependencies for you except pytorch, pytorch_lightning and pytorch_geometric, since these dependencies are hardware dependent and differ from machine to machine. In our case, we used cuda version 10.1 and pytorch version 1.7.0.
Pytorch
To install pytorch, switch to the virtual environment using:
$ pipenv shell
In the following, please match your pytorch and cuda versions. This example is for cuda version 10.1 and pytorch version 1.7.0. Change to a different cuda version by modifying cu101, and to a different pytorch version by modifying torch-1.7.0.
To install pytorch use:
> pip install torch==1.7.0+cu101 torchvision==0.8.1+cu101 torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
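Optionally, you can verify from within the virtual environment that the installed versions match your setup (a minimal check, assuming a working CUDA driver):
> python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"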
To install pytorch_lightning use:
> pip install pytorch_lightning
To install pytorch_geometric use the following commands:
> pip install torch-scatter==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.7.0.html
> pip install torch-sparse==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.7.0.html
> pip install torch-cluster==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.7.0.html
> pip install torch-spline-conv==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.7.0.html
> pip install torch-geometric
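As an optional sanity check, you can confirm that the geometric packages import correctly inside the virtual environment:
> python -c "import torch_geometric; print(torch_geometric.__version__)"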
Libraries
Our work relies mainly on TODO:CITATION and TODO:CITATION, so we use their code as libraries. Both have been added as git submodules in the ./lib directory.
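If the ./lib directory is empty (e.g. because the repository was cloned without --recursive), fetch the submodules first, then install both packages in editable mode:
$ git submodule update --init --recursive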
> pip install -e lib/tracking_wo_bnw
> pip install -e lib/mot_neural_solver
Reproducing Results
We made two main contributions. First, a trained baseline model, which concatenates the bounding box features with the coordinates and visibility scores from our joint detector to construct the graph. Our main contribution is the separation of the two feature sources (bounding boxes and joints) into a combined graph, made up of two subgraphs (one bounding box graph and one joint graph), in which each feature source has its own nodes. We adapted a novel message passing scheme between the subgraphs.
Baseline Model
To train the baseline model, simply run:
> python setup/train.py
Combined Graph Model
Similarly, the combined graph model can be trained using the configuration in configs/tracking_cfg_combined.yaml, e.g. by changing CONFIG_FILE = "configs/tracking_cfg_combined.yaml" in the train.py file:
> python setup/train.py
Combined Graph Model with joint ReID
To train with joint ReID features, uncomment these lines in configs/tracking_cfg_combined.yaml:
# joint_features:
# - emb
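After uncommenting, the configuration should contain:
joint_features:
- emb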
> python setup/train.py
Learning a Neural Solver for Multiple Object Tracking
This is the official implementation of our CVPR 2020 (oral) paper Learning a Neural Solver for Multiple Object Tracking (Guillem Brasó, Laura Leal-Taixé).
[Paper][Youtube][CVPR Daily]
Updates
- (November 2020) Added support for MOT20 (including Tracktor object detector fine-tuning) and processing long sequences, solved issues with OOM errors.
- (June 2020) Code release.
Setup
1. Clone and enter this repository:
   git clone --recursive https://github.com/dvl-tum/mot_neural_solver.git
   cd mot_neural_solver
2. Create an Anaconda environment for this project:
   conda env create -f environment.yaml
   conda activate mot_neural_solver
   pip install -e tracking_wo_bnw
   pip install -e .
3. (OPTIONAL) Modify the variables DATA_PATH and OUTPUT_PATH in src/mot_neural_solver/path_cfg.py so that they are set to your preferred locations for storing datasets and output results, respectively. By default, these paths will be in this project's root under folders named data and output, respectively.
4. Download the MOTChallenge data by running:
   bash scripts/setup/download_motcha.sh
5. Download our reid network, Tracktor's object detector, and our trained models:
   bash scripts/setup/download_models.sh
6. (OPTIONAL) For convenience, we provide the preprocessed detection files. You can download them by running:
   bash scripts/setup/download_prepr_dets.sh
7. (NEW) If you are going to be working with MOT20, run the following to download the dataset, preprocessed detections, and pretrained models:
   bash scripts/setup/download_mot20.sh
Running Experiments
We use Sacred to configure our experiments and Pytorch Lightning to structure our training code. We recommend reading these libraries' documentation for an overview.
You can configure training and evaluation experiments by modifying the options in configs/tracking_cfg.yaml. As for preprocessing, all available options can be found in configs/preprocessing_cfg.yaml.
Note that you can also use Sacred's command line interface
to modify configuration entries. We show some examples in the sections below.
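For example, a single configuration entry can be overridden directly from the command line (the parameter below is the same one used in the training examples later in this README):
python scripts/train.py with train_params.num_epochs=6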
For every training/evaluation experiment you can specify a run_id string. This, together with the execution date, will be used to create an identifier for the experiment being run. A folder named after this identifier, containing model checkpoints, logs and output files, will be created at $OUTPUT_PATH/experiments (OUTPUT_PATH is specified at src/mot_neural_solver/path_cfg.py).
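For example, to set the run identifier explicitly (the value my_experiment is just a placeholder):
python scripts/train.py with run_id=my_experiment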
Preprocessing Detections
NOTE: You can skip this step if you will only be working with the MOT15, MOT16, MOT17 and MOT20 datasets, and run steps 6 and 7 of Setup.
As explained in the paper, we preprocess public detections by either running Tracktor (with no ReID) on them (1) or filtering false positives and refining box coordinates with a pretrained object detector (2).
On the MOT15, MOT16 and MOT17 datasets you can run the first preprocessing scheme (1) with:
python scripts/preprocess_detects.py
To run (1) on the MOT20 dataset, run instead:
python scripts/preprocess_detects.py with configs/mot20/preprocessing_cfg.yaml
If you want to use the alternative scheme (2), run the following:
python scripts/preprocess_detects.py with prepr_w_tracktor=False
All these scripts will store the preprocessed detections in the right locations within $DATA_PATH.
If you use the second option (2), make sure to add the named configuration configs/no_tracktor_cfg.yaml to your training and evaluation experiments by adding with configs/no_tracktor_cfg.yaml after your python command.
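For example:
python scripts/train.py with configs/no_tracktor_cfg.yaml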
Fine-Tuning Tracktor
In order to obtain results for MOT20, we fine-tuned Tracktor on it. To do so, we borrowed all code from this Colab Notebook, which was made public in Tracktor's repository, and organized it under obj_detect and a python script. You can reproduce the fine-tuning of the model we provide in step 7 of Setup by running:
python scripts/train_obj_detect.py
As a sanity check, we made a submission to the MOT20 test dataset with this model, and obtained the following results. For reference, we include the comparison with the results made public by Tracktor's authors on the CVPR19 Tracking Challenge, and our MOT Neural Solver built on top of the fine-tuned Tracktor.
Dataset | Method | MOTA | IDF1 | MT | ML |
---|---|---|---|---|---|
CVPR19 Challenge | Tracktor++ | 51.3 | 47.6 | 313 (24.9%) | 326 (26.0%) |
MOT20 | Tracktor with no ReID, fine-tuned by us | 52.1 | 44.0 | 362 (29.1%) | 332 (26.7%) |
MOT20 | Ours (MPNTrack) | 57.6 | 59.1 | 474 (38.2%) | 279 (22.5%) |
Training
You can train a model by running:
python scripts/train.py
By default, sequences MOT17-04 and MOT17-11 will be used for validation, and all remaining sequences in the MOT15 and MOT17 datasets will be used for training. You can use other validation sets by modifying the parameters data_splits.train and data_splits.val, or use several splits and perform cross-validation.
In order to train with all available sequences, and reproduce the training of the MOT17 model we provide, run the following:
python scripts/train.py with data_splits.train=all_train train_params.save_every_epoch=True train_params.num_epochs=6
For training a model on the MOT20 dataset, you need to use its named configuration configs/mot20/tracking_cfg.yaml. For instance, to reproduce the training of the MOT20 model we provide, run the following:
python scripts/train.py with configs/mot20/tracking_cfg.yaml train_params.save_every_epoch=True train_params.num_epochs=22
NOTE: The first time you use a sequence for training or testing, it will need to be processed. This means that ground truth boxes (if available) will be assigned to detection boxes, detection files will be stored with sequence metainformation, and (possibly) reid embeddings will be computed and stored. This process should take ~30 mins for the train/test sets of MOT15 and MOT17 and only needs to be performed once per set of detections. Computing reid embeddings in advance is optional for testing but required for training. Doing so speeds up training significantly and substantially reduces the training memory requirements. As explained in our paper, we observed no significant performance boost from training CNN layers.
The reid network was trained with torchreid, using ResNet50's default configuration with images resized to 128 x 56, adding two fully connected layers (see resnet50_fc256 in src/mot_neural_solver/models/resnet.py), and training for 232 epochs. The training script will be provided in a future release.
Evaluation
You can evaluate a trained model on a set of sequences by running:
python scripts/evaluate.py
The weights used and sequences tested are determined by the parameters ckpt_path and data_splits.test, respectively. By default, the weights from the model we provide will be used and the MOT15 and MOT17 test sequences will be evaluated. The resulting output files yield the following MOT17 metrics on the train/test set:
MOT17 | MOTA | IDF1 | FP | FN | IDs | MT | ML |
---|---|---|---|---|---|---|---|
Train | 64.4 | 70.8 | 5087 | 114460 | 504 | 636 (38.8%) | 362 (22.1%) |
Test | 58.4 | 62.1 | 17836 | 214869 | 1146 | 655 (27.8%) | 793 (33.7%) |
Note that these results show a slight difference with respect to the ones reported in the paper. Specifically, IDF1 has improved by 0.5 points, and MOTA has decreased by 0.4 points. This change is due to using a newer pytorch version and small code differences introduced while cleaning up.
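To evaluate your own checkpoint instead, you can override ckpt_path on the command line (the path below is only a placeholder):
python scripts/evaluate.py with ckpt_path=path/to/your_checkpoint.ckpt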
In order to evaluate a model on the MOT20 dataset, run the following:
python scripts/evaluate.py with configs/mot20/tracking_cfg.yaml
The resulting output files yield the following MOT20 train and test performance:
MOT20 | MOTA | IDF1 | FP | FN | IDs | MT | ML |
---|---|---|---|---|---|---|---|
Train | 70.1 | 66.9 | 38260 | 299110 | 1821 | 1073 (48.4%) | 362 (10.8%) |
Test | 57.6 | 59.1 | 16953 | 201384 | 1210 | 474 (38.2%) | 279 (22.5%) |
Cross-Validation
As explained in the paper, we perform cross-validation to report the metrics of ablation experiments.
To do so, we divide the MOT17 sequences into 3 sets of train/val splits. For every configuration, we then run 3 trainings, one per validation split, and report the overall metrics.
You can train and evaluate models in this manner by running:
RUN_ID=your_config_name
python scripts/train.py with run_id=$RUN_ID cross_val_split=1
python scripts/train.py with run_id=$RUN_ID cross_val_split=2
python scripts/train.py with run_id=$RUN_ID cross_val_split=3
python scripts/cross_validation.py with run_id=$RUN_ID
By setting cross_val_split to 1, 2 or 3, the training and validation sequences corresponding to the splits we used in the paper will be set automatically (see src/mot_neural_solver/data/splits.py).
The last script will gather the stored metrics from each training run and compute overall MOT17 metrics with them. This is done by searching for output files containing $RUN_ID, so it's important that this tag is unique.
Split | MOTA | IDF1 | FP | FN | IDs | MT | ML |
---|---|---|---|---|---|---|---|
Cross-Val | 64.3 | 70.5 | 5610 | 114284 | 531 | 643 (39.3%) | 363 (22.2%) |
Citation
If you use our work in your research, please cite our publication:
@InProceedings{braso_2020_CVPR,
author={Guillem Brasó and Laura Leal-Taixé},
title={Learning a Neural Solver for Multiple Object Tracking},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
Please, also consider citing Tracktor if you use it for preprocessing detections:
@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}