Questions about .mat file when preparing training dataset
DaddyJin opened this issue · comments
Thank you for the exciting work.
When preparing the training dataset, I have questions about the .mat file.
The prepare_vox_lmdb.py file indicates that the .mat file should contain the keys 'landmark', 'coeff', and 'transform_params', as shown below.
However, in Deep3DFaceRecon_pytorch, the keys in the .mat file are the predicted 257-dimensional coefficients and the 68 projected 2D facial landmarks.
I would like to know the correspondence between those two kinds of 'keys', or exactly how you extract the .mat file corresponding to an mp4 file.
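For anyone comparing the two formats, a quick way to see what a given .mat file actually contains is to load it and print its keys and array shapes. This is a minimal sketch using scipy.io; the file name 'demo.mat' and the dummy arrays are placeholders for illustration only:

```python
import numpy as np
from scipy.io import loadmat, savemat

def describe_mat(path):
    """Return {key: shape} for every data key in a .mat file."""
    data = loadmat(path)
    info = {k: np.asarray(v).shape for k, v in data.items()
            if not k.startswith('__')}  # skip MATLAB header entries
    for key, shape in sorted(info.items()):
        print(f"{key}: {shape}")
    return info

# Dummy file with the keys that prepare_vox_lmdb.py expects,
# using the shapes described in this thread (N = 10 frames)
savemat('demo.mat', {'coeff': np.zeros((10, 257)),
                     'transform_params': np.zeros((10, 5)),
                     'landmark': np.zeros((10, 68, 2))})
info = describe_mat('demo.mat')
```

Running this against a .mat produced by either implementation makes the key mismatch visible immediately.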
Thanks in advance!
Hi,
We use the tensorflow implementation of DeepFaceRecon to extract the coefficients of the videos.
The images are cropped before being fed to DeepFaceRecon. Therefore, we use both the crop parameters and the 3DMM parameters to describe the target motions.
The 257-dimensional coefficients (file_mat['coeff']) along with the 5-dimensional crop parameters (file_mat['transform_params']) are extracted.
# the format of the coeff
def split_3dmm_coeff(file_mat):
    coeff_3dmm = file_mat['coeff']          # N*257
    id_coeff = coeff_3dmm[:, :80]           # identity
    ex_coeff = coeff_3dmm[:, 80:144]        # expression
    tex_coeff = coeff_3dmm[:, 144:224]      # texture
    angles = coeff_3dmm[:, 224:227]         # euler angles for pose
    gamma = coeff_3dmm[:, 227:254]          # lighting
    translation = coeff_3dmm[:, 254:257]    # translation
    return id_coeff, ex_coeff, tex_coeff, angles, gamma, translation
# the format of the crop parameter
import numpy as np

def split_crop_parameter(file_mat):
    crop_param = file_mat['transform_params']   # N*5
    org_image_size_w, org_image_size_h, resize_ratio, trans_0, trans_1 = \
        np.hsplit(crop_param.astype(np.float32), 5)
    return org_image_size_w, org_image_size_h, resize_ratio, trans_0, trans_1
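Putting the two splits together, the per-frame motion descriptor is a concatenation of slices of these arrays. The following self-contained sketch uses dummy arrays with the shapes documented above; the exact choice and ordering of slices in the concatenation is my guess at the layout, not necessarily PIRenderer's actual ordering:

```python
import numpy as np

N = 8  # number of video frames

# Dummy .mat contents with the documented shapes
file_mat = {'coeff': np.random.randn(N, 257).astype(np.float32),
            'transform_params': np.random.randn(N, 5).astype(np.float32)}

coeff_3dmm = file_mat['coeff']
ex_coeff = coeff_3dmm[:, 80:144]        # expression (64 dims)
angles = coeff_3dmm[:, 224:227]         # euler angles for pose (3 dims)
translation = coeff_3dmm[:, 254:257]    # translation (3 dims)
crop = file_mat['transform_params'][:, -3:]  # resize_ratio, trans_0, trans_1

# One possible motion descriptor: expression + pose + translation + crop
motion = np.concatenate([ex_coeff, angles, translation, crop], axis=1)
print(motion.shape)  # → (8, 73)
```

The identity, texture, and lighting coefficients describe appearance rather than motion, which is why they are left out of the descriptor here.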
The transform_params are obtained by this function.
Good Luck!
Yurui
Hi,
I am facing a similar situation to @DaddyJin's: I also used the PyTorch version of Deep3DFaceRecon for dataset preparation. My question is: is there a significant performance difference if I do not use transform_params as inputs?
Thank you!
Yes, the transform_params are very important.
The network needs to calculate the absolute location of the faces according to the transform_params.
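To see why the crop parameters matter, here is a hypothetical sketch of mapping a point in the cropped image back to the full frame. It assumes the crop was a resize by resize_ratio followed by a shift by (trans_0, trans_1); the real preprocessing in DeepFaceRecon may order or sign these operations differently, so treat the function and its name as illustrative only:

```python
import numpy as np

def crop_to_full(points, resize_ratio, trans_0, trans_1):
    """Hypothetical inverse crop: undo the translation, then undo the resize.

    Assumes the crop pipeline was: scale the full frame by resize_ratio,
    then shift by (trans_0, trans_1). The actual DeepFaceRecon
    preprocessing may differ.
    """
    pts = np.asarray(points, dtype=np.float32)
    pts = pts + np.array([trans_0, trans_1], dtype=np.float32)  # undo shift
    return pts / resize_ratio                                   # undo scale

# A point at (112, 112) in the crop, with a 0.5x resize and a (40, 60) shift
print(crop_to_full([[112.0, 112.0]], 0.5, 40.0, 60.0))  # → [[304. 344.]]
```

Without transform_params, this mapping is lost and the 3DMM pose alone cannot tell the network where the face sits in the full frame.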
Hi all,
We provide the 3DMM extraction scripts using Deep3DFaceRecon_pytorch.
Please check DatasetHelper for more details.
Let me know if you have any problems.
Yurui
Hi,
"The images are cropped before sending to DeepFaceRecon." Which size is used to crop the face image from the original full-sized video: 224x224 or 256x256?
In addition, extract_kp_videos.py doesn't change the size of the input face image. However, the Deep3DFaceRecon_pytorch model used by face_recon_videos.py resizes the input image to 224x224, while in prepare_vox_lmdb.py the image is resized to 256x256.
To summarize, if we follow the data processing in your DatasetHelper, is there any difference between input face videos at 224x224 and 256x256? Which size is used in your pipeline?