Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior

This is the official implementation of the paper

"Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior"

by Jaehoon Ko, Kyusun Cho, Joungbin Lee, Heeji Yoon, Sangmin Lee, Sangjun Ahn, Seungryong Kim†

† : Corresponding Author

Introduction

We introduce a novel framework (Talk3D) for 3D-aware talking head synthesis!

For more information, please check out our Paper and our Project page.

Installation

We implemented and tested Talk3D on NVIDIA RTX 3090 and A6000 GPUs.

Run the commands below to set up the environment (the full dependency list is in requirements.txt).

conda create -n talk3d python==3.8
conda activate talk3d
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install absl-py basicsr cachetools clean-fid click cryptography dlib einops facexlib ffmpeg-python fvcore gfpgan h5py imageio imageio-ffmpeg ipykernel joblib keras librosa lmdb lpips matplotlib moviepy mrcfile mtcnn opencv-python pandas Pillow ply pydantic pytorch-fid pytorch-msssim realesrgan requests scipy scikit-learn six soundfile tensorboard tensorflow tqdm wandb yacs trimesh transformers kornia positional-encodings[pytorch] face_alignment
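
After installation, it can help to confirm that the active environment picked up the intended PyTorch build. The following is a minimal sketch, not part of the repository: the helper names same_version and installed_torch are hypothetical, and the expected version string is taken from the pip command above.

```shell
# Expected PyTorch build, copied from the pip install command above.
EXPECTED_TORCH="1.12.1+cu116"

# Hypothetical helper: succeeds when the two version strings are identical.
same_version() { [ "$1" = "$2" ]; }

# Hypothetical helper: prints the installed PyTorch version,
# or nothing if torch cannot be imported in this environment.
installed_torch() { python -c "import torch; print(torch.__version__)" 2>/dev/null; }

if same_version "$(installed_torch)" "$EXPECTED_TORCH"; then
    echo "PyTorch $EXPECTED_TORCH found"
    # Report whether PyTorch can see a CUDA device.
    python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
else
    echo "PyTorch $EXPECTED_TORCH not found in this environment"
fi
```

If the second branch is printed, check that the talk3d environment is activated before rerunning the pip commands.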

Preparation

Please check docs/download_models and follow the steps for preparation.

Download Dataset

Please check docs/download_dataset and follow the steps for downloading the dataset.

Dataset Preprocessing

Please check docs/preprocessing and follow the steps for preprocessing.

Training

To train Talk3D, use the script do_train.sh after setting the following arguments in it:

--saveroot_path {path/to/save/directory} --data_root_dir {path/to/data/directory} --personal_id {video name}

For instance, you may edit the script as:

--saveroot_path ./savepoint --data_root_dir ./data --personal_id May

Then run the script:

sh do_train.sh
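
To avoid typos when editing the script, the three arguments can be kept in shell variables and assembled in one place. This is only a sketch under assumptions: the contents of do_train.sh are not shown here, so the variable names and the train_args helper are hypothetical, and the printed string is meant to be pasted into the script where the flags appear.

```shell
# Example values from the training instructions; replace with your own.
SAVEROOT_PATH=./savepoint
DATA_ROOT_DIR=./data
PERSONAL_ID=May

# Hypothetical helper: joins the three flags and their values into one line.
train_args() {
    printf '%s %s %s %s %s %s\n' \
        --saveroot_path "$1" \
        --data_root_dir "$2" \
        --personal_id "$3"
}

# Print the argument string for the values set above.
train_args "$SAVEROOT_PATH" "$DATA_ROOT_DIR" "$PERSONAL_ID"
```

Note that the values are written without curly braces; the braces in the argument template only mark placeholders.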

Acknowledgement

We would like to acknowledge the contributions of EG3D and VIVE3D for code implementation.

Citation

If you find our work helpful, please cite our work as:

@misc{ko2024talk3d,
      title={Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior}, 
      author={Jaehoon Ko and Kyusun Cho and Joungbin Lee and Heeji Yoon and Sangmin Lee and Sangjun Ahn and Seungryong Kim},
      year={2024},
      eprint={2403.20153},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
