We are releasing our code and dataset for Portrait-Mode Video Recognition research. The videos are sourced from the Douyin platform. We distribute the video content as links only; users are responsible for downloading the videos themselves.
The videos were filtered by human annotators for quality and cover human activities across a wide range of categories.
Thanks for the support from the community. Please check the issue here for cached videos on OneDrive.
Please check the annotations at Uniformer/data_list/PMV/ and the text descriptions of the categories at data/class_name_mapping.csv
Please check our released taxonomy here. There is also an interactive demo of the taxonomy here.
We assume two directories for this project: {CODE_DIR} for the code repository, and {PROJ_DIR} for the model logs, checkpoints and dataset.
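As a minimal sketch, the two placeholders can be kept as shell variables so that later commands can refer to them. The default locations below are assumptions chosen for illustration, not paths required by the project.

```shell
# Hypothetical default locations; override by exporting the variables first.
CODE_DIR="${CODE_DIR:-$PWD/Portrait-Mode-Video}"   # code repository
PROJ_DIR="${PROJ_DIR:-$PWD/pmv_project}"           # logs, checkpoints, dataset
export CODE_DIR PROJ_DIR

# Create the project directory; CODE_DIR is created by git clone.
mkdir -p "$PROJ_DIR"
echo "CODE_DIR=$CODE_DIR"
echo "PROJ_DIR=$PROJ_DIR"
```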
To start, please clone our code from GitHub:
git clone https://github.com/bytedance/Portrait-Mode-Video.git {CODE_DIR}
We trained our models with Python 3.7.3 and PyTorch 1.10.0. First install PyTorch following the official instructions, then install the remaining packages with
pip3 install -r requirements.txt
Please refer to DATA.md for data downloading. We assume the videos are stored under {PROJ_DIR}/PMV_dataset. Category IDs for the released videos are under {CODE_DIR}/MViT/data_list/PMV and {CODE_DIR}/Uniformer/data_list/PMV.
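Before training, it can help to confirm that the layout described above is in place. The following is a small sketch, not part of the released code: check_dir is a hypothetical helper, and the fallback values for CODE_DIR and PROJ_DIR are placeholders.

```shell
# Report whether a directory exists; returns non-zero when it is missing.
check_dir() {
    if [ -d "$1" ]; then
        echo "ok:      $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# CODE_DIR and PROJ_DIR are assumed to be set in the shell already;
# the fallbacks here are placeholders for illustration only.
CODE_DIR="${CODE_DIR:-$PWD}"
PROJ_DIR="${PROJ_DIR:-$PWD}"

status=0
for d in \
    "$PROJ_DIR/PMV_dataset" \
    "$CODE_DIR/MViT/data_list/PMV" \
    "$CODE_DIR/Uniformer/data_list/PMV"
do
    check_dir "$d" || status=1
done
echo "layout check exit status: $status"
```

A status of 1 means at least one of the three directories is missing.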
We provide bash scripts for training models on our PMV-400 data under exps/PMV/. An example run is
bash exps/PMV/run_MViT_PMV.sh
For each model, e.g., MViT, we provide the different training recipes in a single bash script, e.g., exps/PMV/run_MViT_PMV.sh. Please choose the recipe that suits your purpose.
Note that you need to set some environment variables in the bash scripts, such as WORKER_0_HOST, WORKER_NUM and WORKER_ID in run_SlowFast_MViTv2_S_16x4_PMV_release.sh, and PROJ_DIR in run_{model}_PMV.sh.
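For a single-machine run, the variables might be filled in as sketched below. All values are assumptions for illustration (one worker on the local host, zero-based worker indices); this README does not specify them. Judging by the variable names, a multi-node run would need the address of the first worker and a distinct WORKER_ID on each machine.

```shell
# Assumed values for one worker on the local machine; adjust for your cluster.
export WORKER_0_HOST="127.0.0.1"   # address of worker 0 (assumption)
export WORKER_NUM=1                # total number of workers (assumption)
export WORKER_ID=0                 # index of this worker (assumption)

# PROJ_DIR is the project directory for logs, checkpoints and dataset.
export PROJ_DIR="${PROJ_DIR:-$PWD/pmv_project}"   # hypothetical default

# Sanity check, assuming zero-based indices: 0 <= WORKER_ID < WORKER_NUM.
if [ "$WORKER_ID" -ge 0 ] && [ "$WORKER_ID" -lt "$WORKER_NUM" ]; then
    echo "worker $WORKER_ID of $WORKER_NUM, worker 0 at $WORKER_0_HOST"
else
    echo "WORKER_ID=$WORKER_ID is out of range for WORKER_NUM=$WORKER_NUM" >&2
fi
```

Since the variables are set inside the bash scripts, these values may need to be copied into the scripts themselves; whether the scripts pick up values exported in the calling shell is not stated here.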
We provide inference scripts for reproducing the results reported in our paper. We also provide the trained model checkpoints.
Our code is licensed under an Apache 2.0 License. Our data is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. The data is released for non-commercial research purposes only.
By engaging in the downloading process, users are considered to have agreed to comply with our distribution license terms and conditions.
We would like to thank the teams behind the SlowFast code repository, 3Massiv, Kinetics and Uniformer. Our work builds upon their valuable contributions. Please acknowledge these resources in your work.