
Official implementation of the IROS 2023 paper "DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model"

DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model

(Demo: visualization of the S11 Walking sequence)

Environment

This code was developed and tested in the following environment:

  • Python 3.8.13
  • PyTorch 1.12.1
  • CUDA 11.2

You can create and activate the conda environment as follows:

conda env create -f requirements.yaml
conda activate diffupose
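
To quickly check that the installation matches the versions listed above, you can run a short snippet like the one below. This is not part of the repository, and the printed CUDA build may differ from your system toolkit version.

import torch

# Print the installed PyTorch version and CUDA status (expected: 1.12.1, CUDA available).
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)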

Dataset setup

You can set up the Human3.6M and HumanEva-I datasets by following the instructions linked in VideoPose3D.

Alternatively, you can download the data directly from the Google Drive link.

Download and unzip the .zip file into the ./data folder.

Make sure that your repository ends up with './data/data_3d_h36m.npz', './data/data_2d_h36m_hr.npz', and './data/data_2d_h36m_gt.npz'.
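
If you want to verify the download, a small sanity check like the following (not part of the repository) can confirm that the three archives load; the key names inside each .npz file are not shown here and may vary.

import numpy as np

# Try to open each expected dataset file and list the arrays it contains.
for name in ["data_3d_h36m.npz", "data_2d_h36m_hr.npz", "data_2d_h36m_gt.npz"]:
    with np.load("./data/" + name, allow_pickle=True) as f:
        print(name, "->", f.files)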

Training from scratch

If you want to train our model from scratch using HR-Net detection, please run

python run.py -k hr -b 1024

Otherwise, if you want to train with 2D ground-truth keypoints, please run

python run.py -k gt -b 1024

Evaluating pre-trained model

We provide our pre-trained 384-dimensional model in the results folder (taking HR-Net detected 2D poses as input). To evaluate the model, please run

python run.py -k hr --test-load best_model.pt

which will result in 50.0 mm error (MPJPE).

To obtain the best result with 10 samples, run

python run.py -k hr --test-load best_model.pt --num-sample 10

which will result in 49.4 mm error (MPJPE).
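
The multi-sample evaluation draws several 3D poses per input from the diffusion model and aggregates them before scoring. The sketch below illustrates that idea with a simple mean over samples and a standard MPJPE computation; the shapes and the aggregation strategy are assumptions and may differ from what run.py actually does.

import numpy as np

def mpjpe(pred, gt):
    # Mean per-joint position error, in the same units as the inputs (e.g. mm).
    return np.linalg.norm(pred - gt, axis=-1).mean()

def multi_sample_mpjpe(samples, gt):
    # samples: (K, frames, joints, 3) poses drawn from the diffusion model
    # gt:      (frames, joints, 3) ground-truth 3D poses
    pose = samples.mean(axis=0)  # aggregate the K samples (here: a simple mean)
    return mpjpe(pose, gt)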

Visualization

Our code is compatible with VideoPose3D. Please refer to their GitHub page for detailed instructions.
