pengbo807 / ConditionVideo

Training-Free Condition-Guided Text-to-Video Generation


ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation (AAAI 2024)

Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao

This is the official PyTorch implementation of the paper "ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation".

Our model generates realistic dynamic videos from random noise or from given scene videos, guided by the provided conditions. Currently, we support OpenPose keypoint, canny edge, depth, and segmentation conditions.

Canny / segmentation / depth conditions:

- A dog, comicbook style
- A red jellyfish, pastel colours.
- A horse under a blue sky.

Pose / customized pose conditions:

- The Astronaut, brown background
- Ironman in the sea

Setup

To set up the environment, create a conda environment:

conda create -n tune-control python=3.10

Check your CUDA version and install the matching PyTorch build (note that pytorch==2.0.0 is required), then install the remaining dependencies:

pip install -r requirements.txt
conda install xformers -c xformers

You may also need to download the model checkpoints manually from Hugging Face.

Usage

To generate videos, run:

accelerate launch --num_processes 1 conditionvideo.py --config="configs//config.yaml"

To use different generation settings, edit the configuration in config.yaml.
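The exact keys in config.yaml are defined by the repository; purely as an illustration, a condition-guided generation config typically specifies the base model, prompt, condition type, and sampling parameters along these lines (every field name below is hypothetical, not taken from the repo):

```yaml
# Hypothetical sketch -- consult the repository's configs/ for the real keys
pretrained_model_path: "./checkpoints/stable-diffusion-v1-5"
prompt: "A horse under a blue sky."
condition: "canny"        # one of: openpose, canny, depth, segment
video_length: 24          # number of frames to generate
width: 512
height: 512
num_inference_steps: 50
guidance_scale: 12.5
seed: 42
```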

Citation

@misc{peng2023conditionvideo,
      title={ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation}, 
      author={Bo Peng and Xinyuan Chen and Yaohui Wang and Chaochao Lu and Yu Qiao},
      year={2023},
      eprint={2310.07697},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}


License: GNU General Public License v3.0

