Important!

Warning

The main branch of this repo is under active development and has not been fully tested, so expect bugs.

Note

To reproduce the results in the paper, check out the svtas-paper branch.

Paper List

Streaming Video Temporal Action Segmentation In Real Time, paper, status: under review

Abstract

Temporal action segmentation (TAS) is a critical step toward long-term video understanding. Recent studies follow a pattern that builds models based on features instead of raw video picture information. However, we claim those models are trained complicatedly and limit application scenarios. It is hard for them to segment human actions of video in real time because they must work after the full video features are extracted. As the real-time action segmentation task is different from TAS task, we define it as streaming video real-time temporal action segmentation (SVTAS) task.

Framework Feature

Environment Preparation

Linux Ubuntu 20.04+
Python 3.8+
PyTorch 1.11+
CUDA 11.3+
cuDNN 8.2+
Pillow-SIMD (optional): install it with the following commands.

conda uninstall -y --force pillow pil jpeg libtiff libjpeg-turbo
pip   uninstall -y         pillow pil jpeg libtiff libjpeg-turbo
conda install -yc conda-forge libjpeg-turbo
CFLAGS="${CFLAGS} -mavx2" pip install --upgrade --no-cache-dir --force-reinstall --no-binary :all: --compile pillow-simd
conda install -y jpeg libtiff

Create a conda environment and install the dependencies with pip

conda create -n torch python=3.8
# activate the new environment so pip installs into it
conda activate torch
python -m pip install --upgrade pip
pip install -r requirements.txt

# export the installed packages (overwrites requirements.txt)
pip freeze > requirements.txt
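
Once the packages are installed, the version requirements listed above can be checked with a short script. The following is only a sketch, not part of this repo: the helper names (parse_version, meets_minimum, collect_checks) are hypothetical, the minimum versions are copied from the list above, and the cuDNN threshold of 8200 assumes the integer encoding that PyTorch reports for cuDNN 8.2.0.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a string such as '1.11.0+cu113' into a tuple of ints, here (1, 11, 0)."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(found, minimum):
    """Compare two version strings, padding the shorter one with zeros."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def collect_checks():
    """Return (name, found, minimum, ok) tuples for every component that can be inspected."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    checks = [("Python", python_version, "3.8", meets_minimum(python_version, "3.8"))]
    if importlib.util.find_spec("torch") is None:
        return checks  # PyTorch is not installed, nothing more to inspect
    import torch

    checks.append(("PyTorch", torch.__version__, "1.11",
                   meets_minimum(torch.__version__, "1.11")))
    cuda = torch.version.cuda  # None on builds without CUDA support
    if cuda is not None:
        checks.append(("CUDA", cuda, "11.3", meets_minimum(cuda, "11.3")))
    cudnn = torch.backends.cudnn.version()  # an integer, or None if unavailable
    if cudnn is not None:
        # Assumption: 8200 is the integer PyTorch reports for cuDNN 8.2.0.
        checks.append(("cuDNN", str(cudnn), "8200", cudnn >= 8200))
    return checks


if __name__ == "__main__":
    for name, found, minimum, ok in collect_checks():
        print(f"{name}: found {found}, need {minimum}+ -> {'OK' if ok else 'TOO OLD'}")
```

Components that cannot be inspected (for example CUDA on a CPU-only build) are simply left out of the report rather than marked as failures.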

If an error reports that the correlation_cuda package cannot be found, read the Install doc.
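
To find out ahead of time whether this applies to your environment, you can test whether the module is importable before launching anything. This is a minimal sketch using only the standard library; the function name extension_available is hypothetical, and the only thing taken from this repo is the module name correlation_cuda.

```python
import importlib.util


def extension_available(module_name="correlation_cuda"):
    """Return True if Python can locate the named top-level module (it is not imported)."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extension_available():
        print("correlation_cuda is available")
    else:
        print("correlation_cuda was not found; see the Install doc")
```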

Prepare Data

Read the doc Prepare Dataset

Usage

Read the doc Usage

Citation

@misc{2209.13808,
Author = {Wujun Wen and Yunheng Li and Zhuben Dong and Lin Feng and Wanxiao Yang and Shenlan Liu},
Title = {Streaming Video Temporal Action Segmentation In Real Time},
Year = {2022},
Eprint = {arXiv:2209.13808},
}

Acknowledgement

This repo borrows code from many great open-source libraries. Thanks again to their authors for their selfless dedication.

About

End to End Streaming Video Temporal Segmentation

Apache License 2.0
