Frank-ZY-Dou / TORE

TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer

Home Page: https://frank-zy-dou.github.io/projects/Tore/index.html

[Project Page][Paper][Code]

This is the official PyTorch implementation of TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer (ICCV 2023).

Installation

Follow the FastMETRO installation instructions (CUDA 10.1). CUDA 11.1 is not currently supported because OpenDR does not support it.

We recommend creating a new conda environment for this project.

# Create a conda environment, activate the environment and install PyTorch via conda
conda create --name tore python=3.8
conda activate tore
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

# Install OpenDR
pip install git+https://gitlab.eecs.umich.edu/ngv-python-modules/opendr.git

# Clone and install TORE
git clone --recursive https://github.com/Frank-ZY-Dou/TORE.git
cd TORE
python setup.py build develop

# Install requirements
pip install -r requirements.txt

# Install manopth
pip install ./manopth/.

We also provide a docker container for easy environment installation.
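Once the installation steps have run, a quick import check can confirm that the environment is usable before any model files are downloaded. The sketch below is not part of the repository: the helper name `find_missing_modules` is hypothetical, and the listed module names are assumed to be the import names of the packages installed above.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = ["torch", "torchvision", "opendr", "manopth"]


def find_missing_modules(names):
    """Return the top-level modules in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated `tore` environment; any name it reports points to the installation step that needs to be repeated.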

Beyond setting up the environment, our repository needs additional files to work.

Please download the models folder from this link and set up TORE/models according to the following file tree:

models
├── efficientnet
│   └── efficientnet-b0-355c32eb.pth
├── hrnet
│   ├── cls_hrnet_w64_sgd_lr5e-2_wd1e-4_bs32_x100.yaml
│   └── hrnetv2_w64_imagenet_pretrained.pth
└── resnet
    └── resnet50-0676ba61.pth

Please also download the data folder from this link and set up TORE/tore/modeling/data according to the following file tree:

tore/modeling/data
├── J_regressor_extra.npy
├── J_regressor_h36m_correct.npy
├── MANO_RIGHT.pkl
├── README.md
├── basicModel_neutral_lbs_10_207_0_v1.0.0.pkl
├── config.py
├── mano_195_adjmat_indices.pt
├── mano_195_adjmat_size.pt
├── mano_195_adjmat_values.pt
├── mano_downsampling.npz
├── mesh_downsampling.npz
├── smpl_431_adjmat_indices.pt
├── smpl_431_adjmat_size.pt
├── smpl_431_adjmat_values.pt
└── smpl_431_faces.npy
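Because a single missing file typically surfaces only later as a load error, it can help to verify both file trees up front. The following is a minimal sketch, not a script shipped with the repository; the helper name `missing_files` is hypothetical, and the file list is copied from the two trees above with paths taken relative to the TORE root.

```python
from pathlib import Path

# Expected files, copied from the two file trees above (relative to the TORE root).
EXPECTED_FILES = [
    "models/efficientnet/efficientnet-b0-355c32eb.pth",
    "models/hrnet/cls_hrnet_w64_sgd_lr5e-2_wd1e-4_bs32_x100.yaml",
    "models/hrnet/hrnetv2_w64_imagenet_pretrained.pth",
    "models/resnet/resnet50-0676ba61.pth",
    "tore/modeling/data/J_regressor_extra.npy",
    "tore/modeling/data/J_regressor_h36m_correct.npy",
    "tore/modeling/data/MANO_RIGHT.pkl",
    "tore/modeling/data/README.md",
    "tore/modeling/data/basicModel_neutral_lbs_10_207_0_v1.0.0.pkl",
    "tore/modeling/data/config.py",
    "tore/modeling/data/mano_195_adjmat_indices.pt",
    "tore/modeling/data/mano_195_adjmat_size.pt",
    "tore/modeling/data/mano_195_adjmat_values.pt",
    "tore/modeling/data/mano_downsampling.npz",
    "tore/modeling/data/mesh_downsampling.npz",
    "tore/modeling/data/smpl_431_adjmat_indices.pt",
    "tore/modeling/data/smpl_431_adjmat_size.pt",
    "tore/modeling/data/smpl_431_adjmat_values.pt",
    "tore/modeling/data/smpl_431_faces.npy",
]


def missing_files(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the TORE root directory.
    for rel in missing_files("."):
        print("missing:", rel)
```

An empty output means every file named in the trees is in place.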

Checkpoints

We provide various pre-trained checkpoints for inference and fine-tuning.

Human3.6M

| Name | PA-MPJPE (mm) | GFLOPs | Link |
| --- | --- | --- | --- |
| FastMETRO + HRNet-w64 + TORE (@20%) | 36.4 | 30.2 | Google Drive |
| FastMETRO + ResNet50 + TORE (@20%) | 40.5 | 5.4 | Google Drive |
| FastMETRO + EfficientNet-b0 + TORE (@20%) | 43.9 | 1.7 | Google Drive |
| METRO + HRNet-w64 + TORE | 37.1 | 30.2 | Google Drive |
| METRO + ResNet50 + TORE | 42.0 | 5.4 | Google Drive |
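The Human3.6M numbers describe an accuracy/compute trade-off: lower PA-MPJPE costs more GFLOPs. As an illustration only, the sketch below picks the most accurate checkpoint that fits a given compute budget. The values are copied from the table above; the helper name `best_under_budget` and the example budgets are hypothetical.

```python
# (name, PA-MPJPE, GFLOPs), copied from the Human3.6M table above.
CHECKPOINTS = [
    ("FastMETRO + HRNet-w64 + TORE (@20%)", 36.4, 30.2),
    ("FastMETRO + ResNet50 + TORE (@20%)", 40.5, 5.4),
    ("FastMETRO + EfficientNet-b0 + TORE (@20%)", 43.9, 1.7),
    ("METRO + HRNet-w64 + TORE", 37.1, 30.2),
    ("METRO + ResNet50 + TORE", 42.0, 5.4),
]


def best_under_budget(max_gflops, checkpoints=CHECKPOINTS):
    """Return the lowest-error checkpoint within the GFLOPs budget, or None."""
    affordable = [c for c in checkpoints if c[2] <= max_gflops]
    # Lower PA-MPJPE is better, so take the minimum over the error column.
    return min(affordable, key=lambda c: c[1]) if affordable else None


if __name__ == "__main__":
    for budget in (2.0, 10.0, 50.0):  # example budgets in GFLOPs
        print(budget, best_under_budget(budget))
```

With a 10 GFLOPs budget, for example, this selects the FastMETRO + ResNet50 variant, since both HRNet-w64 variants exceed the budget.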

3DPW

| Name | PA-MPJPE (mm) | GFLOPs | Link |
| --- | --- | --- | --- |
| FastMETRO + HRNet-w64 + TORE (@20%) | 44.4 | 30.2 | Google Drive |

Inference

Use the following shell command for inference.

python ./tore/tools/tore_inference_fm.py \
       --resume_checkpoint [your_checkpoint.bin] \
       --image_file_or_path [image folder or image file]

A template is provided in inference_tore_fm.sh. We recommend the FastMETRO + HRNet-w64 + TORE (@20%) checkpoint because it generalizes well to in-the-wild images.
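If inference needs to be driven from Python rather than from the shell, the same command can be assembled programmatically. This is a minimal sketch that only uses the script path and the two flags shown above; the function names and the placeholder paths are hypothetical, and it assumes the working directory is the TORE root.

```python
import subprocess
import sys


def build_inference_cmd(checkpoint, image_path):
    """Assemble the inference command shown above as an argument list."""
    return [
        sys.executable,
        "./tore/tools/tore_inference_fm.py",
        "--resume_checkpoint", str(checkpoint),
        "--image_file_or_path", str(image_path),
    ]


def run_inference(checkpoint, image_path):
    """Launch inference and raise CalledProcessError if the script fails."""
    subprocess.run(build_inference_cmd(checkpoint, image_path), check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with a real checkpoint and image folder.
    print(" ".join(build_inference_cmd("your_checkpoint.bin", "./your_images")))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.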

Experiments

To train the TORE model, we need to download additional datasets. Please follow Part 5 in DOWNLOAD.md of METRO to download the datasets.

Then, use the following shell command for training TORE with FastMETRO:

python setup.py build develop
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node=4 --master_port=44444 \
       tore/tools/run_tore_fm_bodymesh.py \
       --train_yaml your_dataset_folder/Tax-H36m-coco40k-Muco-UP-Mpii/train.yaml \
       --val_yaml your_dataset_folder/human3.6m/valid.protocol2.yaml \
       --num_workers 4 \
       --per_gpu_train_batch_size 16 \
       --per_gpu_eval_batch_size 16 \
       --lr 1e-4 \
       --arch efficientnet-b0 \
       --num_train_epochs 60 \
       --output_dir your_output_folder \
       --keep_ratio 0.8 \
       --model_name 'FastMETRO_L' \
       --itp_loss_weight 1e-3 \
       --edge_and_normal_vector_loss "false"

An example is provided in train_tore_fm.sh.

Use the following shell command for training TORE with METRO:

python setup.py build develop
python -m torch.distributed.launch --nproc_per_node=8 \
       tore/tools/run_tore_m_bodymesh.py \
       --train_yaml your_dataset_folder/Tax-H36m-coco40k-Muco-UP-Mpii/train.yaml \
       --val_yaml your_dataset_folder/human3.6m/valid.protocol2.yaml \
       --arch resnet50 \
       --num_workers 4 \
       --per_gpu_train_batch_size 32 \
       --per_gpu_eval_batch_size 32 \
       --num_hidden_layers 4 \
       --num_attention_heads 4 \
       --lr 1e-4 \
       --num_train_epochs 200 \
       --input_feat_dim 2051,512,128 \
       --hidden_feat_dim 1024,256,128 \
       --output_dir your_output_folder

Use --arch=hrnet-w64 for HRNet-W64 backbone, --arch=resnet50 for ResNet50 backbone, and --arch=efficientnet-b0 for EfficientNet-b0 backbone.
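When adapting these commands to a different number of GPUs, keep in mind that the batch size flags are per GPU. The sketch below works out the global batch size for the two commands above, assuming a single node with one process per GPU and no gradient accumulation; the helper name `global_batch_size` is hypothetical.

```python
def global_batch_size(nproc_per_node, per_gpu_batch_size):
    """Samples processed per optimizer step across all processes on one node."""
    return nproc_per_node * per_gpu_batch_size


if __name__ == "__main__":
    # FastMETRO command: --nproc_per_node=4, --per_gpu_train_batch_size 16
    print("TORE + FastMETRO:", global_batch_size(4, 16))  # 64
    # METRO command: --nproc_per_node=8, --per_gpu_train_batch_size 32
    print("TORE + METRO:", global_batch_size(8, 32))  # 256
```

Running with fewer GPUs at the same per-GPU batch size shrinks the global batch, which may change training behavior at the same --lr.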

Contributing

Note that mesh quality can be improved by applying a SMPL parameter regressor.

We welcome contributions and suggestions.

Citations

If you find our work useful in your research, please consider citing:

@InProceedings{Dou_2023_ICCV,
    author    = {Dou, Zhiyang and Wu, Qingxuan and Lin, Cheng and Cao, Zeyu and Wu, Qiangqiang and Wan, Weilin and Komura, Taku and Wang, Wenping},
    title     = {TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {15143-15155}
}

License

Our research code is released under the MIT license.

We use submodules from third parties, such as huggingface/transformers and hassony2/manopth. Please see NOTICE for details.

Any use of the SMPL and MANO models is subject to the Software Copyright License for non-commercial scientific research purposes. See the SMPL-Model License and the MANO License for details.

Acknowledgments

Our implementation and experiments are built on top of open-source GitHub repositories. We thank all the authors who made their code public, which greatly accelerated our project. If you find these works helpful, please consider citing them as well.

huggingface/transformers

HRNet/HRNet-Image-Classification

nkolot/GraphCMR

akanazawa/hmr

MandyMo/pytorch_HMR

hassony2/manopth

hongsukchoi/Pose2Mesh_RELEASE

mks0601/I2L-MeshNet_RELEASE

open-mmlab/mmdetection

microsoft/MeshTransformer

postech-ami/FastMETRO

lukemelas/EfficientNet-PyTorch
