RenYurui / PIRender

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Questions about .mat file when preparing training dataset

DaddyJin opened this issue · comments

Thank you for the exciting work.
When preparing the training dataset, I have questions about the .mat file.
The prepare_vox_lmdb.py file indicates that the .mat file should contain the keys 'landmark', 'coeff', and 'transform_params', as shown below.
[screenshot of prepare_vox_lmdb.py showing the expected keys: 'landmark', 'coeff', 'transform_params']
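
For concreteness, the expected structure can be checked like this (a minimal sketch; the file name is hypothetical, and the exact shapes are assumptions inferred from this thread):

from scipy.io import loadmat

file_mat = loadmat('example_clip.mat')     # hypothetical file name
print(file_mat['landmark'].shape)          # per-frame 2D landmarks, e.g. (N, 68, 2)
print(file_mat['coeff'].shape)             # per-frame 3DMM coefficients, (N, 257)
print(file_mat['transform_params'].shape)  # per-frame crop parameters, (N, 5)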

However, in Deep3DFaceRecon_pytorch, the keys in the .mat file are the predicted 257-dimensional coefficients and the 68 projected 2D facial landmarks.

I would like to know the correspondence between those two kinds of 'keys', or exactly how you extract the .mat file corresponding to an mp4 file.

Thanks in advance!

Hi,
We use the TensorFlow implementation of DeepFaceRecon to extract the coefficients of the videos.
The images are cropped before being sent to DeepFaceRecon. Therefore, we use both the crop parameters and the 3DMM parameters to describe the target motions.
The 257-dimensional coefficients (file_mat['coeff']) along with the 5-dimensional crop parameters (file_mat['transform_params']) are extracted.

# the format of the coeff
def split_3dmm_coeff(file_mat):
    coeff_3dmm = file_mat['coeff']        # N*257
    id_coeff = coeff_3dmm[:, :80]         # identity
    ex_coeff = coeff_3dmm[:, 80:144]      # expression
    tex_coeff = coeff_3dmm[:, 144:224]    # texture
    angles = coeff_3dmm[:, 224:227]       # euler angles for pose
    gamma = coeff_3dmm[:, 227:254]        # lighting
    translation = coeff_3dmm[:, 254:257]  # translation
    return id_coeff, ex_coeff, tex_coeff, angles, gamma, translation

# the format of the crop parameter
import numpy as np

def split_crop_parameter(file_mat):
    crop_param = file_mat['transform_params']  # N*5
    org_image_size_w, org_image_size_h, resize_ratio, trans_0, trans_1 = \
        np.hsplit(crop_param.astype(np.float32), 5)
    return org_image_size_w, org_image_size_h, resize_ratio, trans_0, trans_1
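
Putting the two helpers together, a per-frame motion descriptor could be assembled roughly as follows. This is a sketch rather than the repository's dataloader: the name build_motion_descriptor is hypothetical, and normalizing the crop translation by the original image size is an assumption.

import numpy as np

def build_motion_descriptor(file_mat):
    # Hypothetical helper combining the motion-related 3DMM coefficients
    # with the crop parameters, as described above.
    _, ex_coeff, _, angles, _, translation = split_3dmm_coeff(file_mat)
    w0, h0, ratio, t0, t1 = split_crop_parameter(file_mat)
    # Assumption: normalize the crop translation by the original image
    # size so the descriptor is independent of the video resolution.
    crop = np.concatenate([ratio, t0 / w0, t1 / h0], axis=1)  # N*3
    # expression (64) + pose angles (3) + translation (3) + crop (3)
    return np.concatenate([ex_coeff, angles, translation, crop], axis=1)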

The transform_params are obtained by this function.
Good Luck!
Yurui

Hi,
I face a similar situation to @DaddyJin; I also used the PyTorch version of Deep3DFaceRecon for dataset preparation. My question is: is there a significant performance difference if I do not include transform_params as inputs?

Thank you!

Yes, the transform_params are very important.
The network needs to calculate the absolute location of the faces from the transform_params.
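
As a toy illustration of why (all values and names here are hypothetical, not the repository's code): two frames can have identical cropped faces, and hence identical 3DMM coefficients, while their crop windows sit in different parts of the original frame; only the transform_params distinguish them.

import numpy as np

# Identical 3DMM coefficients for two frames whose cropped faces match.
coeff_a = np.zeros((1, 257), dtype=np.float32)
coeff_b = coeff_a.copy()

# Hypothetical crop parameters (w0, h0, resize_ratio, trans_0, trans_1):
# the crop windows are located at different positions in the full frame.
crop_a = np.array([[256., 256., 1.0, 100., 120.]], dtype=np.float32)
crop_b = np.array([[256., 256., 1.0, 160., 90.]], dtype=np.float32)

print(np.allclose(coeff_a, coeff_b))  # True: the 3DMM alone cannot tell the frames apart
print(np.allclose(crop_a, crop_b))    # False: transform_params encode the absolute location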

Hi all,
We provide the 3DMM extraction scripts using DeepFaceRecon_pytorch.
Please check DatasetHelper for more details.
Let me know if you have any problems.
Yurui

Hi,
"The images are cropped before sending to DeepFaceRecon." Which size is used to crop face image from original full-sized video? 224x224 or 256x256?

In addition, extract_kp_videos.py doesn't change the size of the input face image. However, the Deep3DFaceRecon_pytorch used by face_recon_videos.py resizes the input image to 224x224, while in prepare_vox_lmdb.py the image is resized to 256x256.

To summarize, if we follow the data processing in your DatasetHelper, is there any difference between input face videos at 224x224 and 256x256? Which size is used in your pipeline?