Event-AHU / CeleX-HAR

Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms



🎯 Abstract

Human Action Recognition (HAR) is a pivotal research domain in both computer vision and artificial intelligence, with RGB cameras dominating as the preferred sensor for investigation and innovation in this field. However, in real-world applications, RGB cameras face numerous challenges, including adverse lighting conditions, fast motion, and privacy concerns. Consequently, bio-inspired event cameras have attracted increasing attention due to their advantages such as low energy consumption and high dynamic range. Nevertheless, most existing event-based HAR datasets are of low resolution ($346 \times 260$). In this paper, we propose a large-scale, high-definition ($1280 \times 800$) human action recognition dataset recorded with the CeleX-V event camera, termed CeleX-HAR. It covers 150 commonly occurring action categories and comprises a total of 124,625 video sequences. Various factors, such as multi-view recording, illumination, action speed, and occlusion, were considered when collecting the data. To build a more comprehensive benchmark, we report results for over 20 mainstream HAR models for future works to compare against. In addition, we propose a novel Mamba vision backbone for event stream based HAR, termed EVMamba, which is equipped with a multi-directional spatial-plane scanning mechanism and a novel voxel temporal scanning mechanism. By encoding and mining the spatio-temporal information of event streams, EVMamba achieves favorable results on multiple datasets. Both the dataset and the source code will be released upon acceptance.

💥 Update Log

📤 Demo Video

A demo video can be found by clicking the image below:

DemoVideo

🔨 Environment

A Spatial-Temporal Scanning framework for Event Stream-based Human Action Recognition.

Install the environment:

conda create -n evmamba python=3.10.13
conda activate evmamba
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
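
Before building the selective-scan kernels, it may help to sanity-check that this PyTorch build sees your GPU. The following minimal snippet only uses standard PyTorch/TorchVision calls:

import torch
import torchvision

print(torch.__version__, torchvision.__version__)    # expect 2.0.0 and 0.15.1
print("CUDA available:", torch.cuda.is_available())  # should be True for the cu118 wheels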

Please refer to VMamba to build and install the selective-scan kernel:

cd kernels/selective_scan && pip install .

Install the required packages for mmaction:

pip install -U openmim
mim install mmengine
pip install mmcv==2.0.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.0/index.html
pip install -v -e .

Download the pre-trained checkpoint vssmbase_dp06_ckpt_epoch_241 and put it under $/pretrained_models.

Then, put the CeleX-HAR dataset in ./data.

You can modify the paths of the pre-trained model and the dataset by editing:

EVMamba\mmaction\models\backbones\classification\config_b.py
EVMamba\configs\recognition\EVMamba\CeleX-HAR.py
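
For reference, the path-related entries you edit typically look like the hypothetical sketch below. The field names follow common MMAction2/mmengine config conventions, and the annotation file name and checkpoint extension are assumptions, so adapt them to the actual files in this repo:

data_root = 'data/CeleX-HAR/CeleX_HR/rawframes'                   # raw event frames (assumed location under ./data)
ann_file_train = 'data/CeleX-HAR/train_annotations.txt'           # hypothetical annotation file name
load_from = 'pretrained_models/vssmbase_dp06_ckpt_epoch_241.pth'  # pre-trained checkpoint (extension assumed)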

πŸƒ Train & Test

# train & test
bash train.sh
bash dist_train.sh    # for distributed training
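
Before launching a long run, you can optionally load the config with the standard mmengine API and confirm that the checkpoint and data paths resolve. This is a minimal sketch; the printed keys are assumptions about what the config defines:

from mmengine.config import Config

cfg = Config.fromfile('configs/recognition/EVMamba/CeleX-HAR.py')
print(cfg.get('load_from'))   # pre-trained checkpoint picked up at launch, if set
print(cfg.get('data_root'))   # dataset root, if defined at the top level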

📖 Download the CeleX-HAR dataset

  • BaiduYun:

💾 Baidu Netdisk link: https://pan.baidu.com/s/1yvJje7DqIn8qT9fmQMbeeQ?pwd=wsad (extraction code: wsad)

The directory should have the following format:

├── CeleX-HAR (124,625 videos: 99,642 training; 24,983 testing)
    ├── CeleX_HR (377.32GB)
        ├── rawframes
            ├── action_001_pull_up_weeds
                ├── action_001_20220221_110904108_EI_70M
                    ├── 0000.png
                    ├── 0001.png
                    ├── 0002.png
                    ├── ...
                ├── action_001_20220221_110910254_EI_70M
                ├── ...
            ├── action_002_take_somebody's_pulse
            ├── action_003_move_the_chair
            ├── ...
    ├── celex_voxel (15.55GB)
        ├── action_001_pull_up_weeds
            ├── action_001_20220221_110904108_EI_70M.mat
            ├── action_001_20220221_110910254_EI_70M.mat
            ├── ...
        ├── action_002_take_somebody's_pulse
        ├── action_003_move_the_chair
        ├── ...
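
After extracting the archives, a quick way to verify the layout is to count one sequence's frames and inspect the variable names inside a voxel file. This is a minimal sketch assuming scipy is installed (e.g., via requirements.txt) and the dataset sits under ./data as described above; the contents of the .mat files are not documented here, so print the keys first:

import glob
import scipy.io as sio

seq = 'data/CeleX-HAR/CeleX_HR/rawframes/action_001_pull_up_weeds/action_001_20220221_110904108_EI_70M'
frames = sorted(glob.glob(seq + '/*.png'))
print(len(frames), 'frames in', seq)

voxel = sio.loadmat('data/CeleX-HAR/celex_voxel/action_001_pull_up_weeds/action_001_20220221_110904108_EI_70M.mat')
print([k for k in voxel if not k.startswith('__')])  # voxel variable names stored in the .mat file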

💕 Citation

If you have any questions about this work, please open an issue. Please also give us a star if you find this work helpful for your research.

@article{wang2024celexhar,
  title={Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms},
  author={Wang, Xiao and Wang, Shiao and Shao, Pengpeng and Jiang, Bo and Zhu, Lin and Tian, Yonghong},
  journal={arXiv preprint arXiv:2408.09764},
  year={2024}
}
