PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Hyunsoo Cha Byungjun Kim Hanbyul Joo

Seoul National University

CVPR 2024

TL;DR

PEGASUS builds a personalized generative 3D face avatar from monocular video sources.

Paper | Project Page

News

[2024/04/23] Initial release.

Setup

NOTE: PEGASUS was tested on Ubuntu 20.04 with CUDA 11.8. All experiments were conducted using eight RTX A6000 GPUs. Training can be significantly slower without a multi-GPU setup.
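Since training speed depends on how many GPUs are visible, it can help to check this before starting. The snippet below is a minimal sketch, not part of the repository: it assumes the NVIDIA driver's `nvidia-smi` tool is on the `PATH`, and the helper names `parse_gpu_list` and `list_gpus` are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_list(text):
    """Return the lines of `nvidia-smi -L` output that describe a GPU."""
    return [line.strip() for line in text.splitlines()
            if line.strip().startswith("GPU ")]


def list_gpus():
    """List visible NVIDIA GPUs, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(["nvidia-smi", "-L"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    if len(gpus) < 2:
        print("No multi-GPU setup detected; training may be slow.")
```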

Environment

PEGASUS depends on several modified open-source modules, so please create a working directory for them first (e.g., $HOME/GitHub).

mkdir -p $HOME/GitHub/
cd $HOME/GitHub/
git clone https://github.com/snuvclab/pegasus.git
cd ./pegasus/scripts
sudo chmod a+x ./install_conda.sh
./install_conda.sh

A Dockerfile will be provided soon.

Downloads

NOTE: The preprocessing steps and the files that need to be downloaded closely follow the preprocessing instructions of IMAvatar.

Download the FLAME pkl and sam_vit_h_4b8939.pth. Please register on the FLAME website first.

cd ./scripts
./download_data.sh

Download deca_model.tar and put it in ./preprocess/DECA/data
Download modnet_webcam_portrait_matting.ckpt and put it in ./preprocess/MODNet/pretrained/
Download 79999_iter.pth and put it in ./preprocess/face-parsing.PyTorch/res/cp/
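Because these three checkpoints are placed by hand, a quick check that they ended up in the right place can save a failed preprocessing run. The following is a minimal sketch, assuming it is run from the repository root. The helper `missing_files` is hypothetical, and the files fetched by `download_data.sh` are not checked because their destinations are not stated here.

```python
from pathlib import Path

# Paths taken from the download instructions above, relative to the repo root.
REQUIRED_FILES = [
    "preprocess/DECA/data/deca_model.tar",
    "preprocess/MODNet/pretrained/modnet_webcam_portrait_matting.ckpt",
    "preprocess/face-parsing.PyTorch/res/cp/79999_iter.pth",
]


def missing_files(repo_root="."):
    """Return the required files that are not present under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All manually downloaded files are in place.")
```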

Synthetic DB Generation

To prepare

Currently, we cannot release the pretrained DB avatar and monocular video database $V^{db}$ due to an issue with our download server. Instead, we recommend several datasets or videos that can be used for our synthetic DB generation.

Weird: We highly recommend this YouTube channel. Most of our $V^{db}$ content is sourced from it.
Syuka World: Highly recommended for a diverse range of hat datasets.
CelebV-HQ: We did not use this dataset for our paper, but it contains high-quality monocular videos. We are concerned that it lacks a variety of head poses, so please choose cautiously.

The videos used to generate the synthetic database should satisfy the following conditions.

We recommend using at least 100 processed monocular videos to generate a synthetic database.
Exclude any frames with occlusions from the videos. We use FrankMocap to detect hands and YOLOv5 to detect objects.
All of the videos should be cropped to $512\times512$. We plan to release preprocessing code that automatically crops and excludes noisy frames.
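Until the preprocessing code is released, the $512\times512$ crop has to be done manually. The sketch below builds an ffmpeg filter string for a centered square crop followed by a resize. Centering the crop is purely an assumption made for illustration (the face may not sit in the middle of the frame), and `center_crop_filter` is a hypothetical helper, not the authors' preprocessing.

```python
def center_crop_filter(width, height, size=512):
    """Build an ffmpeg -vf filter string: centered square crop, then resize."""
    side = min(width, height)      # largest square that fits in the frame
    x = (width - side) // 2        # left edge of the centered square
    y = (height - side) // 2       # top edge of the centered square
    return f"crop={side}:{side}:{x}:{y},scale={size}:{size}"


print(center_crop_filter(1920, 1080))
# crop=1080:1080:420:0,scale=512:512
```

The resulting string can be passed to ffmpeg, for example `ffmpeg -i input.mp4 -vf "crop=1080:1080:420:0,scale=512:512" filename.mp4`. If the subject is off-center, adjust the `x` and `y` offsets by hand.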

Processing

NOTE: We largely follow IMAvatar's structure for datasets and training checkpoints.

mkdir -p ./data
mkdir -p ./data/datasets
mkdir -p ./data/experiments

Set the video file name as filename.mp4
Save the video to ./data/datasets/original_db/filename.mp4
Please run a script to create the monocular video database $V^{db}$. Be sure to edit the preferences at the top of the script.

sudo chmod a+x ./preprocess/*.sh
./preprocess/1_initial_original_db.sh
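With the recommended 100 or more videos, placing each file by hand before running the script gets tedious. The following is a minimal sketch of that staging step, assuming it is run from the repository root and that every video goes into the same `original_db` directory. `stage_video` and `count_videos` are hypothetical helpers.

```python
import shutil
from pathlib import Path

DB_DIR = Path("data/datasets/original_db")  # location given in the steps above


def stage_video(src, db_dir=DB_DIR):
    """Copy one .mp4 into the database directory, creating it if needed."""
    src = Path(src)
    if src.suffix.lower() != ".mp4":
        raise ValueError(f"expected an .mp4 file, got {src.name}")
    db_dir = Path(db_dir)
    db_dir.mkdir(parents=True, exist_ok=True)
    dst = db_dir / src.name
    shutil.copy2(src, dst)
    return dst


def count_videos(db_dir=DB_DIR):
    """Number of .mp4 files currently staged in the database directory."""
    return len(list(Path(db_dir).glob("*.mp4")))
```

Calling `stage_video("/path/to/clip.mp4")` for each source file and then checking `count_videos() >= 100` gives a quick confirmation that the recommended minimum is met.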

Generate DB Avatar using $V^{db}$. This script includes the rendering.

sudo chmod a+x ./run/db_avatar.sh
./run/db_avatar.sh

Generate synthetic database.

./preprocess/2_synthesis_eyebrows.sh
./preprocess/2_synthesis_eyes.sh
./preprocess/2_synthesis_hair.sh
./preprocess/2_synthesis_hat.sh
./preprocess/2_synthesis_mouth.sh
./preprocess/2_synthesis_nose.sh
./preprocess/3_source.sh
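Running seven scripts by hand makes it easy to miss a failure in the middle. The sketch below runs them in the order listed above and stops at the first one that exits with an error. That the listed order is required is an assumption, and `run_all` is a hypothetical helper that expects to be called from the repository root.

```python
import subprocess

# Script names copied from the command list above, in the listed order.
SYNTHESIS_SCRIPTS = [
    "./preprocess/2_synthesis_eyebrows.sh",
    "./preprocess/2_synthesis_eyes.sh",
    "./preprocess/2_synthesis_hair.sh",
    "./preprocess/2_synthesis_hat.sh",
    "./preprocess/2_synthesis_mouth.sh",
    "./preprocess/2_synthesis_nose.sh",
    "./preprocess/3_source.sh",
]


def run_all(scripts=SYNTHESIS_SCRIPTS, runner=subprocess.run):
    """Run each script with bash, in order; return the ones that completed.

    With check=True, subprocess.run raises CalledProcessError on a
    non-zero exit code, so later scripts are not started after a failure.
    """
    done = []
    for script in scripts:
        runner(["bash", script], check=True)
        done.append(script)
    return done
```

Calling `run_all()` from a Python session executes the whole sequence. The `runner` parameter exists only so the function can be exercised without launching the real scripts.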

PEGASUS training

from scratch

./run/train.sh

pretrained model

We plan to release the pretrained model soon.

PEGASUS test

./run/test.sh

License

The code is available only for non-commercial research purposes.

Acknowledgement

Our project is built on PointAvatar. We sincerely thank the authors of the following projects for their amazing work and code:

PointAvatar
I M Avatar
Face Parsing
DECA
FLAME

Citation

If you find our code useful, please cite our paper:

@InProceedings{Cha_2024_CVPR,
    author    = {Cha, Hyunsoo and Kim, Byungjun and Joo, Hanbyul},
    title     = {PEGASUS: Personalized Generative 3D Avatars with Composable Attributes},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {1072-1081}
}

About

Official Repository for CVPR 2024 paper PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Apache License 2.0
