MV-IGNet

The Official PyTorch implementation of "Learning Multi-View Interactional Skeleton Graph for Action Recognition" [IEEEXplore] in TPAMI 2020. The arXiv version of our paper is coming soon.

Contents

  1. Current Status
  2. Overview and Advantages
  3. Requirements
  4. Installation
  5. Data Preparation
  6. Training
  7. Evaluation
  8. Results
  9. Citation
  10. Acknowledgement

Current Status

  • NTU-RGB+D
    • Data Preparation
    • Models
      • SPGNet
      • HPGNet
      • MV-HPGNet
  • NTU-RGB+D 120

Overview and Advantages

  • Lighter Network with Higher Accuracy

    • Smaller Model
      • Only about 2.5 MB is needed to store our model
    • Faster Inference (on a single NVIDIA 2080 Ti)
      • 8 s inference time on the Cross-View validation set of NTU-RGB+D
      • 1 min 20 s training time per epoch on the Cross-View training set of NTU-RGB+D
    • Higher Accuracy
  • Efficient Unit: SPGConv for Richer Context Modeling

    • The key code of SPGConv (a self-contained sketch is given after this list)
      # set graph: mask the learnable depth-wise weights with the adjacency matrix A
      dw_gcn_weight = self.dw_gcn_weight.mul(self.A)
      # depth-wise conv: aggregate features over the joint (vertex) dimension, per channel
      x = torch.einsum('nctv,cvw->nctw', (x, dw_gcn_weight))
      # point-wise conv: mix channels (a 1x1 convolution over the channel dimension)
      x = torch.einsum('nctw,cd->ndtw', (x, self.pw_gcn_weight))
    • Illustration
  • Unified Framework: Easy to Implement
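
The key code above omits the module around it. Below is a minimal, self-contained sketch of an SPGConv-style layer built from that snippet; the tensor layout (N, C, T, V), the module structure, and the class name SPGConvSketch are assumptions for illustration, not the repository's exact implementation.

  import torch
  import torch.nn as nn

  class SPGConvSketch(nn.Module):
      """Sketch only: depth-wise graph conv (per-channel, masked by the skeleton
      graph A) followed by a point-wise (1x1) channel-mixing conv."""
      def __init__(self, in_channels, out_channels, A):
          super().__init__()
          # A: (V, V) adjacency matrix of the skeleton graph
          self.register_buffer('A', A)
          V = A.size(0)
          # one (V, V) graph weight per input channel (depth-wise part)
          self.dw_gcn_weight = nn.Parameter(torch.randn(in_channels, V, V))
          # channel-mixing weight (point-wise part)
          self.pw_gcn_weight = nn.Parameter(torch.randn(in_channels, out_channels))

      def forward(self, x):
          # x: (N, C, T, V) = (batch, channels, frames, joints)
          dw_gcn_weight = self.dw_gcn_weight.mul(self.A)              # set graph
          x = torch.einsum('nctv,cvw->nctw', (x, dw_gcn_weight))      # depth-wise conv
          x = torch.einsum('nctw,cd->ndtw', (x, self.pw_gcn_weight))  # point-wise conv
          return x

  # usage (identity matrix as a placeholder adjacency; use the real skeleton graph in practice):
  # layer = SPGConvSketch(64, 128, A=torch.eye(25))
  # out = layer(torch.randn(8, 64, 300, 25))  # -> (8, 128, 300, 25)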


Requirements

We have only tested our code in the following environment:

  • Python == 3.7
  • PyTorch == 1.2.0 (our code runs slowly with PyTorch >= 1.4.0)
  • CUDA == 10.0 or 10.1

Installation

# Install python environment
$ conda create -n mvignet python=3.7
$ conda activate mvignet

# Install Pytorch 1.2.0 with CUDA 10.0 or 10.1
$ pip install torch==1.2.0 torchvision==0.4.0

# Download our code
$ git clone https://github.com/niais/mv-ignet
$ cd mv-ignet

# Install torchlight
$ cd torchlight; python setup.py install; cd ..

# Install other python libraries
$ pip install -r requirements.txt

Data Preparation

  • NTU RGB+D: only the 3D skeleton modality (5.8 GB) is required for our experiments. Put the raw data in the directory <path to nturgbd+d_skeletons> and build the database as follows:

    # generate raw database
    $ python tools/ntu_gendata.py --data_path <path to nturgbd+d_skeletons>
    
    # process the above raw data for our method
    $ python feeder/preprocess_ntu.py

Training

  • Example of training MV-HPGNet on ntu-xview. You can train other models using the .yaml files in the config/ folder.

    # train hpgnet with physical graph
    $ python main.py rec_stream --config config/mv-ignet/ntu-xview/train_hpgnet_simple.yaml --device 0 1
    # train hpgnet with complement graph
    $ python main.py rec_stream --config config/mv-ignet/ntu-xview/train_hpgnet-complement_simple.yaml --device 2 3
  • About multi-view training: you need to train two models as above, one with the skeleton graph and one with its complement graph. We provide the complement graph in complement_graph_1.npz for convenience, but you can also compute it yourself (a short sketch is given after this list):

    • how to use complement_graph_1.npz:
      import numpy as np
      saved_graph = np.load('complement_graph_1.npz')
      # 'cA' is the adjacency matrix and 'norm_cA' is its normalization
      cA, norm_cA = saved_graph['a'], saved_graph['na']
    • how to compute it yourself:
      # given the adjacency matrix A, its complement cA can be computed by:
      cA = 1.0 - A
  • Trained Models: we provide our checkpoints on the NTU-RGB+D dataset in the weights/ folder:

    # checkpoints on NTU-RGB+D dataset
    weights
      ├── xsub
      │    ├── xsub_HPGNet_epoch120_model.pt
      │    └── xsub_HPGNet-complement_epoch120_model.pt
      └── xview
           ├── xview_HPGNet_epoch120_model.pt
           └── xview_HPGNet-complement_epoch120_model.pt
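
As mentioned above, the complement graph can be rebuilt rather than loaded from the released file. Below is a hedged sketch of doing so; the degree normalization and the output file name my_complement_graph.npz are assumptions for illustration (the released complement_graph_1.npz may use a different normalization), while the key names 'a' and 'na' follow the loading snippet above.

  import numpy as np

  def degree_normalize(A, eps=1e-6):
      # divide each row of the adjacency matrix by its node degree
      return A / (A.sum(axis=1, keepdims=True) + eps)

  V = 25                              # number of joints in NTU-RGB+D
  A = np.eye(V, dtype=np.float32)     # placeholder; use the real skeleton adjacency matrix
  cA = 1.0 - A                        # complement graph, as described above
  norm_cA = degree_normalize(cA)      # its normalization (assumed form)
  np.savez('my_complement_graph.npz', a=cA, na=norm_cA)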

Evaluation

  • Example of single-model evaluation (HPGNet model):
    # evaluate hpgnet model with physical graph
    $ python main.py rec_stream --phase test --config config/mv-ignet/ntu-xview/train_hpgnet_simple.yaml --weights <path to weights>
    # evaluate hpgnet model with complement graph
    $ python main.py rec_stream --phase test --config config/mv-ignet/ntu-xview/train_hpgnet-complement_simple.yaml --weights <path to weights>
  • Example of multi-model evaluation (MV-HPGNet model):
    # we provide the 'eval_ensemble.sh' script to simplify this step
    $ python main.py rec_ensemble \
             --config config/mv-ignet/ntu-xview/test_hpgnet_ensemble_simple.yaml \
             --weights <path to model-1 weights> \
             --weights2 <path to model-2 weights>
    Note that before evaluating the multi-view model, you must have trained the two models, one with the skeleton graph and one with its complement graph, as described in Training (a fusion sketch is given after this list).
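
Combining the two models is handled by rec_ensemble; as a rough mental model, the sketch below shows plain score-level fusion (averaging the per-class scores of the two models). This fusion rule is an assumption, not the repository's exact code, and the score file names in the usage comment are hypothetical.

  import numpy as np

  def ensemble_predict(scores_1, scores_2, alpha=0.5):
      # scores_*: (N, num_classes) arrays of per-sample class scores from the two models
      fused = alpha * scores_1 + (1.0 - alpha) * scores_2
      return fused.argmax(axis=1)

  # usage (hypothetical file names):
  # scores_1 = np.load('xview_HPGNet_scores.npy')
  # scores_2 = np.load('xview_HPGNet-complement_scores.npy')
  # preds = ensemble_predict(scores_1, scores_2)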

Results

The expected Top-1 accuracy results on the NTU-RGB+D 60 dataset are shown below:

Model       Cross View (%)   Cross Subject (%)
ST-GCN      88.8             81.6
SPGNet      94.3             86.8
HPGNet      94.7             87.2
MV-HPGNet   95.8             88.6

Citation

Please cite our paper if you find this repo useful in your research:

@article{wang2020learning,
  title={Learning Multi-View Interactional Skeleton Graph for Action Recognition},
  author={Wang, Minsi and Ni, Bingbing and Yang, Xiaokang},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2020},
  publisher={IEEE}
}

Acknowledgement

The framework of the current code is based on the old version of ST-GCN (its new version is MMSkeleton).

License

BSD 2-Clause "Simplified" License

