[Project] [Paper] [YouTube] [Bilibili] [Poster] [Supp]
Joint Discriminative and Generative Learning for Person Re-identification, CVPR 2019 (Oral)
Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, Jan Kautz
- 08/24/2019: We added the direct transfer learning results of DG-Net here.
- 08/01/2019: We added support for multi-GPU training:
python train.py --config configs/latest.yaml --gpu_ids 0,1
We currently support:
- Multi-GPU training (fp32)
- APEX to save GPU memory (fp16/fp32)
- Multi-query evaluation
- Random erasing
- Visualize training curves
- Generate all figures in the paper
- Python 3.6
- GPU memory >= 15G (fp32)
- GPU memory >= 10G (fp16/fp32)
- NumPy
- PyTorch 1.0+
- [Optional] APEX (fp16/fp32)
- Install PyTorch
- Install torchvision from the source:
git clone https://github.com/pytorch/vision
cd vision
python setup.py install
- [Optional] Install APEX from the source (you may skip this step):
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
- Clone this repo:
git clone https://github.com/NVlabs/DG-Net.git
cd DG-Net/
Our code is tested on PyTorch 1.0.0+ and torchvision 0.2.1+.
Download the dataset Market-1501 [Google Drive] [Baidu Disk]
Preparation: put the images with the same id in one folder. You may use
python prepare-market.py # for Market-1501
Remember to change the dataset path in the script to your own path.
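The preparation step above (one folder per identity) can be sketched as follows. This is a minimal illustration and not the contents of prepare-market.py: it assumes Market-1501 style file names such as `0001_c1s1_001051_00.jpg`, where the digits before the first underscore are the person id, and the function name `group_by_identity` is hypothetical.

```python
import re
import shutil
from pathlib import Path

# Assumed naming rule: "<person id>_c<camera>s<sequence>_<frame>_<index>.jpg".
ID_PATTERN = re.compile(r"^(-?\d+)_")


def group_by_identity(src_dir, dst_dir):
    """Copy every .jpg in src_dir into dst_dir/<person id>/ and return image counts per id."""
    src, dst = Path(src_dir), Path(dst_dir)
    counts = {}
    for image in sorted(src.glob("*.jpg")):
        match = ID_PATTERN.match(image.name)
        if match is None:
            continue  # skip files that do not follow the assumed naming rule
        person_id = match.group(1)
        target = dst / person_id
        target.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(image, target / image.name)
        counts[person_id] = counts.get(person_id, 0) + 1
    return counts
```

Copying instead of moving leaves the downloaded dataset untouched, so the step can be repeated safely if the naming rule needs to be adjusted.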
We provide our trained model. You may download it from Google Drive (or Baidu Disk, password: rqvf) and move it to the outputs folder.
├── outputs/
│ ├── E0.5new_reid0.5_w30000
├── models
│ ├── best/
- Supervised learning
Metric | Market-1501 | DukeMTMC-reID | MSMT17 | CUHK03-NP |
---|---|---|---|---|
Rank@1 | 94.8% | 86.6% | 77.2% | 65.6% |
mAP | 86.0% | 74.8% | 52.3% | 61.1% |
- Direct transfer learning
To verify the generalizability of DG-Net, we train the model on dataset A and directly test the model on dataset B (with no adaptation). We denote the direct transfer learning protocol as A→B.
Metric | Market→Duke | Duke→Market | Market→MSMT | MSMT→Market | Duke→MSMT | MSMT→Duke |
---|---|---|---|---|---|---|
Rank@1 | 42.62% | 56.12% | 17.11% | 61.76% | 20.59% | 61.89% |
Rank@5 | 58.57% | 72.18% | 26.66% | 77.67% | 31.67% | 75.81% |
Rank@10 | 64.63% | 78.12% | 31.62% | 83.25% | 37.04% | 80.34% |
mAP | 24.25% | 26.83% | 5.41% | 33.62% | 6.35% | 40.69% |
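For readers unfamiliar with the metrics in these tables, the sketch below shows how Rank@k and mAP are commonly computed from per-query rankings. It is an illustration under assumptions, not the evaluation code of this repository: it uses the plain (non-interpolated) definition of average precision, it omits the usual re-id filtering of same-camera and junk gallery images, and the function names are hypothetical.

```python
def rank_k_hit(ranked_labels, query_label, k):
    """Return 1.0 if any of the top-k gallery labels matches the query, else 0.0."""
    return float(query_label in ranked_labels[:k])


def average_precision(ranked_labels, query_label):
    """Mean of the precision values at each rank where a correct match occurs."""
    hits = 0
    precisions = []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(rankings, ks=(1, 5, 10)):
    """rankings: list of (query_label, gallery labels sorted best match first)."""
    n = len(rankings)
    scores = {
        f"Rank@{k}": sum(rank_k_hit(ranked, query, k) for query, ranked in rankings) / n
        for k in ks
    }
    scores["mAP"] = sum(average_precision(ranked, query) for query, ranked in rankings) / n
    return scores
```

Rank@k only asks whether a correct match appears among the first k results, while mAP rewards placing all correct matches early. That is why mAP can be much lower than Rank@1 in the transfer setting.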
Please check the README.md in ./visual_tools.
You may use ./visual_tools/test_folder.py to generate a large number of images and then run the evaluation. The only thing you need to modify is the data path in the SSIM and FID scripts.
You may directly download our trained teacher model from Google Drive (or Baidu Disk password: rqvf).
If you want to train it yourself, please check the person re-id baseline repository to train a teacher model, then copy it into ./models.
├── models/
│ ├── best/ /* teacher model for Market-1501
│ ├── net_last.pth /* model file
│ ├── ...
- Set up the yaml file. Check out configs/latest.yaml and change the data_root field to the path of your prepared folder-based dataset, e.g. ../Market-1501/pytorch.
- Start training
python train.py --config configs/latest.yaml
Or train with low precision (fp16)
python train.py --config configs/latest-fp16.yaml
Intermediate image outputs and model binary files are saved in outputs/latest.
- Check the loss log
tensorboard --logdir logs/latest
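Before launching a long run, it can help to sanity-check the folder that data_root points to. The sketch below counts identity folders and images in one split of a folder-based dataset. The helper name `summarize_split` and the `train_all` sub-folder used in the usage note are assumptions for illustration; adjust them to your prepared layout.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_split(split_dir):
    """Return (number of identity folders, number of images) for one folder-based split."""
    identity_dirs = [d for d in Path(split_dir).iterdir() if d.is_dir()]
    num_images = sum(
        1
        for d in identity_dirs
        for f in d.iterdir()
        if f.suffix.lower() in IMAGE_SUFFIXES
    )
    return len(identity_dirs), num_images
```

For example, `summarize_split("../Market-1501/pytorch/train_all")` could be compared against the Market-1501 training statistics reported in this document (751 identities, 12,936 images), assuming that split holds the full training set.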
We provide our generated images as a large-scale synthetic dataset called DG-Market. This dataset is generated by DG-Net and consists of 128,307 images (613MB), about 10 times larger than the training set of the original Market-1501 (many more images can be generated with DG-Net). It can be used as a source of unlabeled training data for semi-supervised learning. You may download the dataset from Google Drive (or Baidu Disk, password: qxyh).
Statistic | DG-Market | Market-1501 (training) |
---|---|---|
#identity | - | 751 |
#images | 128,307 | 12,936 |
Note the format of camera id and number of cameras. For some datasets (e.g., MSMT17), there are more than 10 cameras. You need to modify the preparation and evaluation code to read the double-digit camera id. For some vehicle re-id datasets (e.g., VeRi) having different naming rules, you also need to modify the preparation and evaluation code.
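As an illustration of the camera id issue, the sketch below parses the person id and camera id with a regular expression, so the camera id is not limited to a single digit. It assumes a Market-1501 style name of the form `<person id>_c<camera id>s<sequence>_...`. The double-digit example name is hypothetical, and datasets with other naming rules (such as MSMT17 or VeRi) would need a different pattern.

```python
import re

# Assumed naming rule: "<person id>_c<camera id>s<sequence>_..." (Market-1501 style).
NAME_PATTERN = re.compile(r"^(-?\d+)_c(\d+)")


def parse_id_and_camera(filename):
    """Return (person_id, camera_id) as integers, accepting multi-digit camera ids."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return int(match.group(1)), int(match.group(2))
```

Reading only the first character after `c` would turn camera 12 into camera 1, which silently breaks the same-camera filtering during evaluation. Matching `\d+` avoids that.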
Please cite this paper if it helps your research:
@inproceedings{zheng2019joint,
title={Joint discriminative and generative learning for person re-identification},
author={Zheng, Zhedong and Yang, Xiaodong and Yu, Zhiding and Zheng, Liang and Yang, Yi and Kautz, Jan},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2019}
}
Other GAN-based methods compared in the paper include LSGAN, FDGAN and PG2GAN. We forked their code and made some changes for evaluation; we thank the authors for their great work. We would also like to thank the great projects person re-id baseline, MUNIT and DRIT.
Copyright (C) 2019 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International). The code is released for academic research use only. For commercial use, please contact researchinquiries@nvidia.com.