zhe-si / gesture_interaction

A gesture-recognition-based human-computer interaction system. It combines the Motion Fused Frames (MFF) deep learning method with traditional vision-based gesture recognition algorithms to achieve fairly accurate gesture recognition, enabling contact-free, motion-sensing interaction.

Motion Fused Frames (MFFs)

PyTorch implementation of the paper "Motion Fused Frames: Data Level Fusion Strategy for Hand Gesture Recognition"

- Update: The code has been updated for PyTorch 1.5.0 and CUDA 10.2

Installation

  • Clone the repo with the following command:
git clone https://github.com/okankop/MFF-pytorch.git
  • Set up a virtual environment and install the requirements:
conda create -n MFF python=3.7.4
conda activate MFF
pip install -r requirements.txt

Dataset Preparation

Download the jester dataset, the NVIDIA dynamic hand gestures dataset, or the ChaLearn LAP IsoGD dataset. Decompress them into the same folder and use process_dataset.py to generate the index files for the train, val, and test splits. Properly set up the train, validation, and category meta files in datasets_video.py. Finally, use the scripts in the flow_computation directory to compute the optical flow images with the Brox method.

After downloading a dataset into a single folder, preparation takes three steps:

  1. Generate the train, validation, and test index files with process_dataset.py
  2. Set up the train, validation, and category meta files in datasets_video.py
  3. Compute the optical flow images with the scripts in flow_computation
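As a rough illustration of step 1, the sketch below builds an index file in the TSN-style `sample_dir num_frames label_id` line format. This format and the `build_index` helper are assumptions for illustration; check the actual output of process_dataset.py for the exact layout.

```python
import os

def build_index(rgb_root, labels, out_path):
    """Write one 'sample_dir num_frames label_id' line per video sample.

    Hypothetical sketch of what process_dataset.py produces; the real
    script also consumes the datasets' official split/label CSV files.
    """
    with open(out_path, "w") as out:
        for sample in sorted(os.listdir(rgb_root)):
            sample_dir = os.path.join(rgb_root, sample)
            if not os.path.isdir(sample_dir):
                continue  # skip stray files next to the sample directories
            # Count the extracted color frames for this video sample.
            num_frames = len([f for f in os.listdir(sample_dir)
                              if f.endswith(".jpg")])
            label_id = labels.get(sample, 0)  # hypothetical label lookup
            out.write(f"{sample} {num_frames} {label_id}\n")
```

The resulting text file is what the data loader later consumes to locate each sample and its frame count.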

Assume the structure of the data directories is the following (resulting directory layout after preprocessing):

~/MFF-pytorch/
   datasets/
      jester/
         rgb/
            .../ (directories of video samples)
                .../ (jpg color frames)
         flow/
            u/
               .../ (directories of video samples)
                  .../ (jpg optical-flow-u frames)
            v/
               .../ (directories of video samples)
                  .../ (jpg optical-flow-v frames)
   model/
      .../ (saved models for the last checkpoint and best model)
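Before training, it can be worth verifying that every RGB sample has matching flow directories. A minimal sanity check, assuming the layout above (`check_layout` is a hypothetical helper, not part of the repo):

```python
import os

def check_layout(dataset_root):
    """Report RGB samples missing their flow/u or flow/v counterpart.

    Assumes the directory layout shown above and returns a list of
    (sample, component) pairs for every missing flow directory.
    """
    rgb = os.path.join(dataset_root, "rgb")
    missing = []
    for sample in sorted(os.listdir(rgb)):
        for comp in ("u", "v"):
            if not os.path.isdir(os.path.join(dataset_root, "flow", comp, sample)):
                missing.append((sample, comp))
    return missing
```

An empty result means every sample has both optical-flow components; anything else points at samples the flow computation skipped.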

Running the Code

The following are example training commands for different scenarios:

  • Train a 4-segment network with 3 flow and 1 color frame (4-MFFs-3f1c architecture, training from scratch):
python main_old.py jester RGBFlow --arch BNInception --num_segments 4 --consensus_type MLP --num_motion 3  --batch-size 32
  • Resume training from the last checkpoint (4-MFFs-3f1c architecture):
python main_old.py jester RGBFlow --resume=<path-to-last-checkpoint> --arch BNInception --consensus_type MLP --num_segments 4 --num_motion 3  --batch-size 32
  • Test trained models (4-MFFs-3f1c architecture). Pretrained models are provided under pretrained_models:
python test_models.py jester RGBFlow pretrained_models/MFF_jester_RGBFlow_BNInception_segment4_3f1c_best.pth.tar --arch BNInception --consensus_type MLP --test_crops 1 --num_motion 3 --test_segments 4

By default, training uses all available GPUs. To restrict training to a subset of GPUs, set CUDA_VISIBLE_DEVICES=..., e.g. CUDA_VISIBLE_DEVICES=0,1.
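The same restriction can be applied from Python, as long as it happens before any CUDA context is created. A minimal sketch (the choice of devices 0 and 1 is hypothetical):

```python
import os

# Expose only GPUs 0 and 1 to this process. This must run before the
# first CUDA call, e.g. before importing torch, and is equivalent to
# prefixing the command with CUDA_VISIBLE_DEVICES=0,1 on the shell.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
```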

Citation

If you use this code or pre-trained models, please cite the following:

@InProceedings{Kopuklu_2018_CVPR_Workshops,
author = {Kopuklu, Okan and Kose, Neslihan and Rigoll, Gerhard},
title = {Motion Fused Frames: Data Level Fusion Strategy for Hand Gesture Recognition},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}

Acknowledgement

This project is built on top of the TSN-pytorch codebase. We thank Yuanjun Xiong for releasing TSN-pytorch, on which our work builds. We also thank Bolei Zhou for the inspirational work Temporal Segment Networks, from which we imported process_dataset.py into our project.
