temporal-action-segmentation end-to-end lightweight temporal-action-localization temporal-action-detection

Important!

Warning

The main branch of this repo is under active development and has not been fully tested, so expect bugs.

Note

To reproduce the results of the papers in the paper list, check out the svtas-paper branch.

Paper List

Streaming Video Temporal Action Segmentation In Real Time, status: accepted by ISKE2023
End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning, status: under review

Streaming Video Temporal Action Segmentation

SVTAS integrates training, inference, and deployment services for streaming video temporal action segmentation, with the goal of providing an AI infrastructure framework for this task.
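To illustrate what "streaming" means here, the following is a minimal sketch, not SVTAS code: frames arrive one at a time, are grouped into short clips, and each clip is labeled as soon as it is complete, so the full video never has to be held in memory. The names `iter_clips`, `segment_stream`, and `toy_predict` are hypothetical and exist only for this example; a real model would replace `toy_predict`.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Frame = Sequence[float]  # placeholder for a decoded video frame


def iter_clips(frames: Iterable, clip_len: int) -> Iterator[list]:
    """Yield consecutive, non-overlapping clips of at most clip_len frames."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    clip: list = []
    for frame in frames:
        clip.append(frame)
        if len(clip) == clip_len:
            yield clip
            clip = []
    if clip:  # final clip may be shorter than clip_len
        yield clip


def segment_stream(
    frames: Iterable[Frame],
    clip_len: int,
    predict_clip: Callable[[List[Frame]], List[int]],
) -> List[int]:
    """Label a frame stream clip by clip and return one label per frame."""
    labels: List[int] = []
    for clip in iter_clips(frames, clip_len):
        clip_labels = predict_clip(clip)
        if len(clip_labels) != len(clip):
            raise ValueError("predict_clip must return one label per frame")
        labels.extend(clip_labels)
    return labels


def toy_predict(clip: List[Frame]) -> List[int]:
    """Hypothetical stand-in for a model: label 1 if mean brightness > 0.5."""
    return [1 if sum(frame) / len(frame) > 0.5 else 0 for frame in clip]


if __name__ == "__main__":
    stream = [[0.1], [0.2], [0.9], [0.8], [0.3]]
    print(segment_stream(stream, clip_len=2, predict_clip=toy_predict))
    # prints [0, 0, 1, 1, 0]
```

Because `iter_clips` is a generator, only one clip is buffered at a time, which is the property that distinguishes streaming segmentation from processing a whole video at once.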

Installation

See the SVTAS installation guide to install from pip, or build from source.

To install the current release:

python setup.py install

To update SVTAS to the latest version, add the --upgrade flag to the pip install command.

Framework Features

	Training	Inference	Serving
Supports	Autotuning, Hyper-parameter search, Multi-trial, Architecture search, NAS, Evaluation (FLOPs, Param, Performance)	Quantization (Todo), Pruning (Todo), Evaluation (FLOPs Compare, Param, Precision comparison)	Server (Todo), Client (Todo), Evaluation (Throughput, Latency, Resource)
	Model Zoo	Tutorials	Services Component
Algorithms	MS-TCN, ASFormer, ASRF, C2F-TCN, Transeger, Diffact, More...	CAM Visualization, DeepSpeed, Triton (Todo)	Local Machine, Distributed Machine, Assemble, Pytest Framework, Tensorboard, ONNXRuntime, TensorRT
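The serving evaluation metrics named in the table (throughput and latency) can be estimated for any callable with a simple timing loop. The sketch below is an illustration only and does not use the SVTAS evaluation API; the function `benchmark` and its parameters are assumptions made for this example.

```python
import time
from statistics import mean
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> Dict[str, float]:
    """Time repeated calls to fn and report latency and throughput."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return {
        "mean_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        # calls completed per second of measured time
        "throughput_per_s": runs / total if total > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Hypothetical workload standing in for one inference call.
    stats = benchmark(lambda: sum(range(100_000)))
    for name, value in stats.items():
        print(f"{name}: {value:.6f}")
```

When timing a model on a GPU, CUDA work is launched asynchronously, so the callable should synchronize the device (for example with `torch.cuda.synchronize()`) before returning; otherwise the measured latency only covers the kernel launch.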

Environment Preparation

Linux Ubuntu 22.04+
Python 3.10+
PyTorch 2.1.0+
CUDA 12.2+
Pillow-SIMD (optional): install it with the script below.
FFmpeg 4.3.1+ (optional): for extracting flow and visualizing video CAM

conda uninstall -y --force pillow pil jpeg libtiff libjpeg-turbo
pip   uninstall -y         pillow pil jpeg libtiff libjpeg-turbo
conda install -yc conda-forge libjpeg-turbo
CFLAGS="${CFLAGS} -mavx2" pip install --upgrade --no-cache-dir --force-reinstall --no-binary :all: --compile pillow-simd
conda install -y jpeg libtiff

Use pip to set up the environment:

conda create -n torch python=3.10
conda activate torch
python -m pip install --upgrade pip
pip install -r requirements/requirements_base.txt

If you get an error that the correlation_cuda package is not found, see the Install document.
If you want to extract motion vectors and residual images from video, install ffmpeg; for example, on Ubuntu: sudo apt install ffmpeg
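The minimum versions listed in this section can be checked before training with a short script. This is a sketch under stated assumptions, not part of SVTAS: the helper names are hypothetical, only the Python and PyTorch minimums are checked, and PyTorch is looked up under the distribution name `torch` without being imported.

```python
import re
import sys
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric part, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    have, need = version_tuple(version), version_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment() -> None:
    """Print whether the Python and PyTorch minimums from this README are met."""
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    ok = meets_minimum(python_version, "3.10")
    print(f"Python {python_version}: {'ok' if ok else 'needs 3.10+'}")

    torch_version = installed_version("torch")
    if torch_version is None:
        print("PyTorch: not installed")
    else:
        ok = meets_minimum(torch_version, "2.1.0")
        print(f"PyTorch {torch_version}: {'ok' if ok else 'needs 2.1.0+'}")


if __name__ == "__main__":
    check_environment()
```

The CUDA and FFmpeg requirements are not covered by this sketch and still need to be verified separately.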

Document Dictionary

Citation

@misc{2209.13808,
Author = {Wujun Wen and Yunheng Li and Zhuben Dong and Lin Feng and Wanxiao Yang and Shenglan Liu},
Title = {Streaming Video Temporal Action Segmentation In Real Time},
Year = {2022},
Eprint = {arXiv:2209.13808},
}

@article{wen2023end,
  title={End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning},
  author={Wen, Wujun and Zhang, Jinrong and Liu, Shenglan and Li, Yunheng and Li, Qifeng and Feng, Lin},
  journal={arXiv preprint arXiv:2309.15683},
  year={2023}
}

Acknowledgement

This repo borrows code from many great open-source libraries; we thank their authors for their selfless dedication.

License

The entire codebase is released under the Apache 2.0 license.
