LeDSantos / DanceRevolution

License: MIT · Python 3.7

Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning

********* June 19, 2020 *********
The code and data are going through internal review and will be released later!

********* August 26, 2020 *********
The dataset is still under internal review; please wait.

********* September 7, 2020 *********
The code & pose data are released!

Introduction

This repo is the PyTorch implementation of "Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning". Our proposed approach significantly outperforms existing methods and achieves state-of-the-art performance in extensive experiments. From input music clips, it can generate creative long dance sequences, e.g., about one minute long at 15 FPS, that are smooth, natural-looking, diverse, style-consistent and beat-matched with the music. This technique can be used to drive various 3D character models via 3D reconstruction and animation driving, and has great potential for virtual ad video generation on social media platforms such as TikTok.
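
As a rough illustration of the curriculum learning named in the title, the sketch below shows a decoder whose inputs shift gradually from ground-truth poses (teacher forcing) to its own predictions as training progresses, which eases the train/test mismatch on long sequences. The linear schedule, the model.decode_step interface, and the data layout are illustrative assumptions, not the paper's exact formulation.

import random

def curriculum_step(model, music_feats, target_poses, epoch, total_epochs):
    """One training pass over a sequence with a curriculum on decoder inputs.

    Early in training the decoder is fed ground-truth poses (teacher
    forcing); as training progresses it is increasingly fed its own
    predictions. The linear schedule below is an illustrative assumption.
    """
    # Probability of feeding the model's own prediction grows linearly.
    p_model = min(1.0, epoch / total_epochs)

    prev_pose = target_poses[0]
    predictions = []
    for t in range(1, len(target_poses)):
        # model.decode_step is a hypothetical one-step decoding interface.
        pred = model.decode_step(music_feats[t], prev_pose)
        predictions.append(pred)
        # Choose the next decoder input according to the curriculum.
        prev_pose = pred if random.random() < p_model else target_poses[t]
    return predictions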

Paper

Ruozi Huang*, Huang Hu*, Wei Wu, Kei Sawada, Mi Zhang and Daxin Jiang
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
[arXiv] [YouTube] [Project]

Requirements

  • Python 3.7
  • PyTorch 0.4.1

Dataset and Installation

  • We have released the dance pose data and the corresponding audio data here. The pose sequences were extracted from the original dance videos at 30 FPS, while the audio data is in m4a format. Note that we developed a simple interpolation algorithm to fill in missing key joints and reduce the noise in the pose data introduced by imperfect OpenPose extraction; a minimal sketch of such an interpolation follows this list.

  • If you plan to train the model on your own dance data, please install OpenPose for human pose extraction.
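
A minimal sketch of the kind of interpolation described above, assuming poses are stored as a (T, J, 2) array with per-joint OpenPose confidence scores; the confidence threshold and array layout are assumptions, not the repo's actual preprocessing code.

import numpy as np

def interpolate_missing_keypoints(poses, conf, min_conf=0.1):
    """Fill in missing joints by linear interpolation along the time axis.

    poses: (T, J, 2) array of (x, y) joint positions per frame.
    conf:  (T, J) OpenPose confidence scores; a joint with confidence
           below `min_conf` is treated as missing. Both the threshold
           and the array layout are illustrative assumptions.
    """
    poses = poses.copy()
    T, J, _ = poses.shape
    for j in range(J):
        valid = conf[:, j] >= min_conf
        if valid.all() or not valid.any():
            continue  # nothing missing, or nothing to interpolate from
        t = np.arange(T)
        for c in range(2):  # interpolate x and y separately
            poses[~valid, j, c] = np.interp(t[~valid], t[valid], poses[valid, j, c])
    return poses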

Generated Example Videos

  • Ballet style

  • Hiphop style

  • Japanese Pop style

  • Photo-Realistic Videos by vid2vid
    We map the generated skeleton dances to photo-realistic videos with vid2vid. Specifically, we recorded a random dance video of a team member to train the vid2vid model, then generated photo-realistic videos by feeding the generated skeleton dances to the trained vid2vid model. Note that our team member has authorized the use of her portrait in the following demos. A minimal sketch of the skeleton-to-vid2vid handoff appears after this list.
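
A minimal sketch of the handoff from generated skeletons to vid2vid, assuming its pose-conditioned mode consumes OpenPose-format keypoint JSON files (one per frame, in the layout OpenPose itself writes); the file naming scheme and the fixed confidence of 1.0 for generated joints are assumptions.

import json
import os

def save_openpose_json(pose_seq, out_dir):
    """Write one OpenPose-style JSON file per generated frame.

    pose_seq: (T, J, 2) skeleton sequence from the dance generator.
    Each frame is stored as a flat [x, y, confidence, ...] list, matching
    OpenPose's pose_keypoints_2d output format.
    """
    os.makedirs(out_dir, exist_ok=True)
    for t, frame in enumerate(pose_seq):
        keypoints = []
        for x, y in frame:
            # Generated joints carry no detector confidence; fix it to 1.0.
            keypoints.extend([float(x), float(y), 1.0])
        data = {"version": 1.2, "people": [{"pose_keypoints_2d": keypoints}]}
        with open(os.path.join(out_dir, f"frame_{t:06d}_keypoints.json"), "w") as f:
            json.dump(data, f)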

Citation

If you find this work useful for your research, please cite the following paper :-)

@article{huang2020dance,
  title={Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning},
  author={Huang, Ruozi and Hu, Huang and Wu, Wei and Sawada, Kei and Zhang, Mi and Jiang, Daxin},
  journal={arXiv preprint arXiv:2006.06119},
  year={2020}
}
