LADiff

The official PyTorch implementation of the paper "Length-Aware Motion Synthesis via Latent Diffusion", presented at the 18th European Conference on Computer Vision (ECCV 2024).


European Conference on Computer Vision 2024

Length-Aware Motion Synthesis via Latent Diffusion

Alessio Sampieri*, Alessio Palma*, Indro Spinelli, and Fabio Galasso

Sapienza University of Rome, Italy

[Paper]

Abstract

The target duration of a synthesized human motion is a critical attribute that requires modeling control over the motion dynamics and style. Speeding up an action performance is not merely fast-forwarding it. However, state-of-the-art techniques for human behavior synthesis have limited control over the target sequence length.

We introduce the problem of generating length-aware 3D human motion sequences from textual descriptors, and we propose a novel model to synthesize motions of variable target lengths, which we dub "Length-Aware Latent Diffusion" (LADiff). LADiff consists of two new modules: 1) a length-aware variational auto-encoder to learn motion representations with length-dependent latent codes; 2) a length-conforming latent diffusion model to generate motions with a richness of details that increases with the required target sequence length. LADiff significantly improves over the state-of-the-art across most of the existing motion synthesis metrics on the two established benchmarks of HumanML3D and KIT-ML.
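For intuition only, below is a minimal, self-contained sketch of the two-stage idea described in the abstract: a VAE whose latent size grows with the target length, and a denoiser that operates on those variable-size latents conditioned on text. Every name, dimension, and the length-to-token rule here is an illustrative assumption, not the authors' implementation (see the code in src/ and the configs for the real model).

# Conceptual sketch only; NOT the authors' code. Names, dimensions, and the
# length-to-token rule below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def tokens_for_length(num_frames: int, frames_per_token: int = 20) -> int:
    # Length-dependent latent size: longer target sequences get more latent tokens.
    return max(1, num_frames // frames_per_token)

class LengthAwareVAE(nn.Module):
    # Stage 1 (LA-VAE): encodes an L-frame motion into tokens_for_length(L) latent tokens.
    def __init__(self, feat_dim: int = 263, latent_dim: int = 256):
        super().__init__()
        self.enc = nn.Linear(feat_dim, latent_dim)
        self.dec = nn.Linear(latent_dim, feat_dim)

    def encode(self, motion: torch.Tensor) -> torch.Tensor:
        # motion: (L, feat_dim) -> latent: (T, latent_dim), where T depends on L.
        num_tokens = tokens_for_length(motion.shape[0])
        chunks = torch.chunk(self.enc(motion), num_tokens, dim=0)
        return torch.stack([c.mean(dim=0) for c in chunks])

    def decode(self, latent: torch.Tensor, num_frames: int) -> torch.Tensor:
        # latent: (T, latent_dim) -> motion: (num_frames, feat_dim) via temporal upsampling.
        up = F.interpolate(latent.t().unsqueeze(0), size=num_frames,
                           mode="linear", align_corners=False)
        return self.dec(up.squeeze(0).t())

class LatentDenoiser(nn.Module):
    # Stage 2 (LA-DDPM): denoises a variable number of latent tokens, conditioned on text.
    def __init__(self, latent_dim: int = 256, text_dim: int = 512):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, latent_dim)
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, noisy_latent: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # Prepend the projected text embedding as a conditioning token, then denoise.
        tokens = torch.cat([self.text_proj(text_emb).unsqueeze(0), noisy_latent], dim=0)
        return self.backbone(tokens.unsqueeze(0)).squeeze(0)[1:]

if __name__ == "__main__":
    vae, denoiser = LengthAwareVAE(), LatentDenoiser()
    motion = torch.randn(120, 263)    # a 120-frame motion with 263-dim HumanML3D-style features
    z = vae.encode(motion)            # 6 latent tokens for 120 frames
    recon = vae.decode(denoiser(z, torch.randn(512)), num_frames=120)
    print(z.shape, recon.shape)       # (6, 256) and (120, 263)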


Create the environment

conda create python=3.10 --name ladiff
conda activate ladiff

Install the packages listed in requirements.txt, then install PyTorch 1.12.1:

cd src
pip install -r requirements.txt
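
For PyTorch 1.12.1, the exact install command depends on your CUDA setup; as an example (not taken from this repository), the official conda command for CUDA 11.3 is:

conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch

Adjust the CUDA toolkit version (or use the CPU-only build) to match your system; see the PyTorch previous-versions page for the full list of commands.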

Run the scripts to download dependencies:

bash prepare/download_smpl_model.sh
bash prepare/prepare_clip.sh
bash prepare/download_t2m_evaluators.sh

Put the datasets in the datasets folder; please refer to HumanML3D for setup instructions.
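
As a rough reference only (this assumes the conventions of MLD, from which this code derives; the paths actually expected are defined in configs/assets.yaml), the processed HumanML3D data usually ends up laid out as:

datasets/humanml3d/
  new_joint_vecs/
  new_joints/
  texts/
  Mean.npy
  Std.npy
  train.txt
  val.txt
  test.txt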

We tested our code on Python 3.10.9 and PyTorch 1.12.1.


Pretrained model

Download the checkpoints trained on HumanML3D from Google Drive, and place them in the experiments/ladiff folder.
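
For example, assuming the download arrives as a zip archive (the archive name below is hypothetical), placement could look like:

mkdir -p experiments/ladiff
unzip ladiff_humanml3d_checkpoints.zip -d experiments/ladiff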


Train your own model

For stage 1 (LA-VAE), please first check the parameters in configs/config_vae_humanml3d.yaml, e.g. NAME and DEBUG.
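
For instance, the relevant lines of the YAML might look like the following (illustrative values, not the defaults shipped with the repository):

NAME: ladiff_vae_humanml3d   # experiment name, used to label the run
DEBUG: False                 # keep False for a full training run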

Then, run the following command:

python -m train --cfg configs/config_vae_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug

For stage 2 (LA-DDPM), please update the parameters in configs/config_ladiff_humanml3d.yaml, e.g. NAME, DEBUG, and PRETRAINED_VAE (set it to the path of your latest checkpoint from the previous step).
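
For instance (illustrative values; the checkpoint path is hypothetical and must point to your own stage-1 run):

NAME: ladiff_humanml3d
DEBUG: False
PRETRAINED_VAE: experiments/ladiff/<your_vae_run>/checkpoints/<latest>.ckpt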

Then, run the following command:

python -m train --cfg configs/config_ladiff_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 128 --nodebug

Evaluate the model

Please first set TEST.CHECKPOINT in configs/config_ladiff_humanml3d.yaml to the path of the trained model checkpoint.
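
Assuming TEST.CHECKPOINT is a nested YAML key, the entry might look like this (the path is hypothetical; point it to your own checkpoint):

TEST:
  CHECKPOINT: experiments/ladiff/<your_run>/checkpoints/<best>.ckpt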

Then, run the following command:

python -m test --cfg configs/config_ladiff_humanml3d.yaml --cfg_assets configs/assets.yaml

Citation

If you find our code or paper helpful, please consider citing us.


Acknowledgements

Our code borrows from MLD; many thanks to its authors. Please visit their repository for further instructions.


License

MIT License

