chaiyujin / sdfa-2019

PyTorch Implementation of our paper "Speech-Driven Facial Animation with Spectral Gathering and Temporal Attention" published in Springer FCS.

Home Page: https://chaiyujin.github.io/sdfa/

Speech-Driven Facial Animation with Spectral Gathering and Temporal Attention

Project Website

TODO

  • upload pca-pretrained
  • Dump mesh of each frame
  • Render with blender

Install dependencies

Necessary libraries:

# install python libs
$ python3 -m pip install -r requirements.txt
# install cmake and sndfile lib
$ sudo apt install libsndfile1 cmake
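After installing, a quick sanity check that the Python dependencies imported cleanly can save debugging later. A minimal sketch (the module names below are illustrative; check them against `requirements.txt`):

```python
import importlib.util

def missing_modules(names):
    """Return the subset of `names` that cannot be found by the import system."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    # Illustrative names -- substitute the entries from requirements.txt.
    missing = missing_modules(["torch", "numpy"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All modules found.")
```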

(Optional) If you want to prepare the dataset yourself, the Montreal Forced Aligner (montreal-forced-aligner) must be installed. (Some errors may occur during installation; watch the output carefully.)

$ bash scripts/install_mkl.sh
$ bash scripts/install_kaldi.sh
$ bash scripts/install_mfa.sh
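To confirm the aligner installed correctly, you can check that its CLI (assumed here to be named `mfa`) ended up on your `PATH`:

```python
import shutil

def tool_on_path(name):
    """Return the full path to executable `name` if found on PATH, else None."""
    return shutil.which(name)

if __name__ == "__main__":
    # `mfa` is assumed to be the Montreal Forced Aligner entry point.
    print("mfa found at:", tool_on_path("mfa") or "NOT FOUND")
```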

Prepare VOCASET

Download VOCASET from https://voca.is.tue.mpg.de/ and unzip the archives into the following directory structure:

| VOCASET
 -| unposedcleaneddata
 -| sentencestext
 -| templates
 -| audio
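
Before running the preload step, it can help to verify the unzipped layout matches what the script expects. A small sketch (directory names taken from the tree above):

```python
from pathlib import Path

# Expected top-level subdirectories of the unzipped VOCASET.
EXPECTED_DIRS = ["unposedcleaneddata", "sentencestext", "templates", "audio"]

def missing_dirs(root):
    """Return the expected VOCASET subdirectories missing under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]

if __name__ == "__main__":
    missing = missing_dirs("VOCASET")
    if missing:
        print("Missing directories:", ", ".join(missing))
```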

Run the preload Python script:

python3 -m saberspeech.datasets.voca.preload \
    --source_root <ROOT_VOCASET> \
    --output_root <ROOT_PROCESSED>

Pre-trained models

  • dgrad
  • offsets
  • PCA of dgrad, offsets
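The pre-trained assets include PCA bases for the deformation gradients (dgrad) and vertex offsets. As background, a minimal numpy sketch of how such a linear basis can be fit and applied; the shapes and function names here are illustrative, not the repo's actual on-disk format:

```python
import numpy as np

def fit_pca(X, k):
    """Fit a k-component PCA to rows of X, shape (n_samples, n_features).

    Returns (mean, components), where components has shape (k, n_features).
    """
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(X, mean, components):
    """Coefficients of X in the PCA basis, shape (n_samples, k)."""
    return (X - mean) @ components.T

def reconstruct(coeffs, mean, components):
    """Approximate samples back from their PCA coefficients."""
    return coeffs @ components + mean
```

For per-frame face data, each row of `X` would be one frame's offsets (or deformation gradients) flattened into a vector, so the learned basis captures the dominant modes of facial deformation.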

Citation

@article{chai2022speech,
  title={Speech-driven facial animation with spectral gathering and temporal attention},
  author={Chai, Yujin and Weng, Yanlin and Wang, Lvdi and Zhou, Kun},
  journal={Frontiers of Computer Science},
  volume={16},
  number={3},
  pages={1--10},
  year={2022},
  publisher={Springer}
}
