Exploiting Spatio-Temporal Representation for 3D Human Action Recognition from Depth Map Sequences
This repository is an implementation of "Exploiting Spatio-Temporal Representation for 3D Human Action Recognition from Depth Map Sequences".
Installation
- Clone the repository:
git clone https://github.com/xp-ji/DOGV-ST3D
- Install Python dependencies
- Create conda environment with dependencies:
conda env create -f environment.yml
- Activate the environment:
conda activate dogv
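- (Optional) Verify the installation. A quick check, assuming environment.yml installs PyTorch with CUDA support (not confirmed here):
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"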
Quick Start
- Download the required dataset, for example NTU RGB+D from the ROSE Lab website. Your dataset directory structure should look like the following (a layout-check snippet follows these steps):
ntu_rgb+d/
    depth/
        ...
        S001C003P008R001A045/
            Depth-00000001.png
            ...
            Depth-00000054.png
        ...
    depth_masked/
        ...
        S001C003P008R001A045/
            MDepth-00000001.png
            ...
            MDepth-00000054.png
        ...
- Download the model weights from Google Drive and put the *.pth files in weights/.
- Run the command
python test_model.py --arch Resnet3d --dataset NTU_RGBD --intype DOGV --exp X-Sub --weights "weights/NTU_RGBD_Resnet3d_X-Sub_DOGV.pth"
The test result will be saved in logs/Test_NTU_RGBD_Resnet3d_X-Sub_DOGV.
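Before testing on your own copy of the data, it can help to sanity-check the dataset layout against the structure shown above. The loop below is only a sketch (the ntu_rgb+d/depth_masked root is taken from the example layout; this is not a script shipped with the repository). It prints each sequence folder and its frame count:
for seq in ntu_rgb+d/depth_masked/*/; do
    echo "$seq $(ls "$seq" | wc -l) frames"
done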
Re-implementation
Prepare datasets
Download all the needed datasets, i.e., NTU RGB+D, PKU-MMD, and UOW_combined3D.
Training the model from scratch
python train_model.py --dataset NTU_RGBD --arch Resnet3d --intype RDMs --exp X-Sub
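To train every configuration in one go, a simple shell loop over the flags shown above can be used. This is only a sketch: RDMs, DOGV, and X-Sub all appear in the commands in this README, while X-View is assumed here to be the name of the cross-view protocol and may differ in this code:
for intype in RDMs DOGV; do
    for exp in X-Sub X-View; do
        python train_model.py --dataset NTU_RGBD --arch Resnet3d --intype "$intype" --exp "$exp"
    done
done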
Citation
If you find this code useful for your research, please use the following BibTeX entry.
@article{ji2021exploiting,
title={Exploiting spatio-temporal representation for 3D human action recognition from depth map sequences},
author={Ji, Xiaopeng and Zhao, Qingsong and Cheng, Jun and Ma, Chenfei},
journal={Knowledge-Based Systems},
pages={107040},
year={2021},
publisher={Elsevier}
}