snuvclab / pegasus

Official Repository for CVPR 2024 paper PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Seoul National University

TL;DR

PEGASUS builds a personalized generative 3D face avatar from monocular video sources.

News

  • [2024/04/23] Initial release.

Setup

NOTE: PEGASUS was tested in an Ubuntu 20.04, CUDA 11.8 environment. All experiments were conducted using eight RTX A6000 GPUs. Training can be significantly slower in environments that do not support multi-GPU setups.
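Before installing, it can help to see how far a machine is from the setup described in the note. The following is a minimal sketch, not part of the repository: `count_gpus` and `warn_if_few_gpus` are hypothetical helper names, and the default of 8 simply mirrors the eight GPUs mentioned above.

```shell
# Hypothetical helpers (not shipped with PEGASUS) to compare the local GPU
# count against the eight-GPU setup mentioned in the note.

# Print the number of GPUs the driver reports; prints 0 if nvidia-smi is missing.
count_gpus() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo 0
        return
    fi
    # `nvidia-smi -L` prints one line per GPU, each starting with "GPU <index>:".
    nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true
}

# Print a warning and return 1 when fewer GPUs are found than expected.
# $1: number of GPUs found, $2: expected count (assumed default: 8).
warn_if_few_gpus() {
    local found="$1" expected="${2:-8}"
    if [ "$found" -lt "$expected" ]; then
        echo "warning: found $found GPU(s); the reference setup used $expected, so training may be much slower"
        return 1
    fi
    echo "ok: found $found GPU(s)"
}
```

For example, `warn_if_few_gpus "$(count_gpus)"` prints a warning on a single-GPU workstation and an `ok:` line on a machine with eight or more GPUs.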

Environment

PEGASUS depends on several modified open-source modules, so please create a dedicated directory (e.g., $HOME/GitHub) first.

mkdir -p $HOME/GitHub/
cd $HOME/GitHub/
git clone https://github.com/snuvclab/pegasus.git
cd ./pegasus/scripts
sudo chmod a+x ./install_conda.sh
./install_conda.sh

Dockerfile will be provided soon.

Downloads

NOTE: The preprocessing pipeline and the files to download closely follow IMAvatar's preprocessing instructions.

  • Download the FLAME pkl and sam_vit_h_4b8939.pth. Please register on the FLAME website first.
cd ./scripts
./download_data.sh

Synthetic DB Generation

To prepare

Currently, we cannot release the pretrained DB avatar and monocular video database $V^{db}$ due to an issue with our download server. Instead, we recommend several datasets or videos that can be used for our synthetic DB generation.

  1. Weird: We highly recommend this YouTube channel. Most of our $V^{db}$ content is sourced from it.
  2. Syuka World: Highly recommended for a diverse range of hat datasets.
  3. CelebV-HQ: We did not use this dataset for our paper, but it contains high-quality monocular videos. We are concerned that it lacks variety in head poses, so please choose videos carefully.

The videos used for synthetic DB generation should meet the following conditions.

  1. We recommend using at least 100 processed monocular videos to generate a synthetic database.
  2. Exclude any frames with occlusions from the videos. We use frankmocap to detect hands and YOLOv5 to detect objects.
  3. All of the videos should be cropped to $512\times512$. We plan to release preprocessing code that automatically crops the videos and excludes noisy frames.
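The first and third conditions can be checked mechanically. The sketch below is an illustration under assumptions rather than the authors' preprocessing code: it assumes `ffprobe` (from FFmpeg) is installed, that the videos are `.mp4` files in a single directory, and the function names are hypothetical. It does not detect occlusions.

```shell
# Hypothetical checks (not shipped with PEGASUS) for the conditions above.

# Print "WIDTHxHEIGHT" for the first video stream of a file (requires ffprobe).
video_size() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height -of csv=s=x:p=0 "$1"
}

# Succeed only when a "WIDTHxHEIGHT" string is exactly 512x512.
is_512_square() {
    [ "$1" = "512x512" ]
}

# Count the .mp4 files directly inside a directory.
count_videos() {
    find "$1" -maxdepth 1 -type f -name '*.mp4' | wc -l | tr -d ' '
}

# Warn when fewer than $2 videos (default 100) are present in directory $1,
# and list every video whose frame size is not 512x512.
check_db() {
    local dir="$1" min="${2:-100}" n f
    n="$(count_videos "$dir")"
    [ "$n" -ge "$min" ] || echo "warning: only $n video(s); at least $min are recommended"
    for f in "$dir"/*.mp4; do
        [ -e "$f" ] || continue
        is_512_square "$(video_size "$f")" || echo "wrong size: $f"
    done
}
```

A possible invocation is `check_db ./data/datasets/original_db`, assuming that directory holds the cropped videos.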

Processing

NOTE: We largely follow IMAvatar's structure for datasets and training checkpoints.

mkdir -p ./data
mkdir -p ./data/datasets
mkdir -p ./data/experiments
  1. Name the video file filename.mp4.
  2. Save the video to ./data/datasets/original_db/filename.mp4
  3. Run the script that creates the monocular video database $V^{db}$. Be sure to edit the preferences at the top of the script.
sudo chmod a+x ./preprocess/*.sh
./preprocess/1_initial_original_db.sh
  4. Generate the DB avatar using $V^{db}$. This script also performs the rendering.
sudo chmod a+x ./run/db_avatar.sh
./run/db_avatar.sh
  5. Generate the synthetic database.
./preprocess/2_synthesis_eyebrows.sh
./preprocess/2_synthesis_eyes.sh
./preprocess/2_synthesis_hair.sh
./preprocess/2_synthesis_hat.sh
./preprocess/2_synthesis_mouth.sh
./preprocess/2_synthesis_nose.sh
./preprocess/3_source.sh
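Because each script depends on the output of the previous one, it can be convenient to run them through a small wrapper that stops at the first failure. This is only a sketch: `run_steps` and `pegasus_steps` are hypothetical names, the order is copied from the commands above, and the scripts are assumed to be executable and launched from the repository root after their preferences have been edited.

```shell
# Hypothetical wrapper (not shipped with PEGASUS) around the scripts above.

# Run each script given as an argument, in order; stop at the first failure.
run_steps() {
    local step
    for step in "$@"; do
        echo "==> $step"
        if ! "$step"; then
            echo "failed: $step" >&2
            return 1
        fi
    done
}

# Print the processing scripts, one per line, in the order listed above.
pegasus_steps() {
    cat <<'EOF'
./preprocess/1_initial_original_db.sh
./run/db_avatar.sh
./preprocess/2_synthesis_eyebrows.sh
./preprocess/2_synthesis_eyes.sh
./preprocess/2_synthesis_hair.sh
./preprocess/2_synthesis_hat.sh
./preprocess/2_synthesis_mouth.sh
./preprocess/2_synthesis_nose.sh
./preprocess/3_source.sh
EOF
}
```

With both functions defined, `run_steps $(pegasus_steps)` runs the whole sequence; the command substitution is left unquoted on purpose so that each path becomes a separate argument.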

PEGASUS training

from scratch

./run/train.sh

pretrained model

We plan to release the pretrained model soon.

PEGASUS test

./run/test.sh

License

The code is available only for non-commercial research purposes.

Acknowledgement

Our project is built based on PointAvatar. We sincerely thank the authors of

Citation

If you find our code useful, please cite our paper:

@InProceedings{Cha_2024_CVPR,
    author    = {Cha, Hyunsoo and Kim, Byungjun and Joo, Hanbyul},
    title     = {PEGASUS: Personalized Generative 3D Avatars with Composable Attributes},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {1072-1081}
}

License:Apache License 2.0

