SizheAn / PanoHead

Code Repository for CVPR 2023 Paper "PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360 degree"

Is Mask Actually Used in Inversion?

ChikaYan opened this issue

Hi, thank you for the brilliant work!

May I just ask a quick question: are the masks actually used during the PTI inversion? The code in projector_withseg.py seems to only read the image directly from the given path, without reading or using the provided masks at all. Is this expected?

If so, may I ask if it is actually possible to utilize a mask during the inversion? At the moment, it seems that inversion would fail pretty badly if the image contains a large area of hair.

Thank you in advance!

Yeah, I think if you just comment out the line

dataset_kwargs = dnnlib.EasyDict(class_name='training.dataset.MaskLabeledDataset', img_path=target_img, seg_path=target_seg, use_labels=True, max_size=None, xflip=False)

uncomment the line

# dataset_kwargs = dnnlib.EasyDict(class_name='training.dataset.ImageFolderDataset', path=target_fname, use_labels=True, max_size=None, xflip=False)

and change it to

dataset_kwargs = dnnlib.EasyDict(class_name='training.dataset.ImageFolderDataset', path=target_img, use_labels=True, max_size=None, xflip=False)

then PTI should still work.
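
For anyone who wants to toggle between the two dataset classes without editing comments, the swap could be wrapped in a small helper along these lines. This is only a sketch: `make_dataset_kwargs` and the `use_mask` flag are hypothetical names that do not exist in the PanoHead code, and a plain `dict` stands in for `dnnlib.EasyDict` so the snippet runs on its own. The class names and keyword arguments are copied from the lines quoted in this reply.

```python
def make_dataset_kwargs(target_img, target_seg=None, use_mask=True):
    """Build dataset kwargs for the masked or the image-only dataset.

    Hypothetical helper for illustration; a plain dict stands in for
    dnnlib.EasyDict, which the real script would use instead.
    """
    # Options shared by both dataset classes in the quoted lines.
    common = dict(use_labels=True, max_size=None, xflip=False)

    if use_mask:
        if target_seg is None:
            raise ValueError("use_mask=True requires target_seg")
        # Masked variant: image path plus segmentation path.
        return dict(
            class_name='training.dataset.MaskLabeledDataset',
            img_path=target_img,
            seg_path=target_seg,
            **common,
        )

    # Image-only variant: note the keyword is `path`, not `img_path`.
    return dict(
        class_name='training.dataset.ImageFolderDataset',
        path=target_img,
        **common,
    )
```

In the actual script the returned mapping would presumably be passed through `dnnlib.EasyDict(**kwargs)` in place of the hard-coded `dataset_kwargs` line.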

Using the mask won't ultimately solve this problem, IMO. Inversion itself does not fail to find the closest latent: as you can see, almost all the reconstructed images for frontal faces are still high quality. The problem occurs when we change the camera pose to the side or back, which means the pretrained model's learned 3D prior is not good or generalizable enough. This is just my opinion; feel free to try something with masks and let us know! :)
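
As one possible starting point for experimenting with masks, a reconstruction loss can be restricted to the masked region. The sketch below is a generic masked pixel-wise MSE written with NumPy; it does not reproduce the loss used in projector_withseg.py, and the function name `masked_mse` and the assumed array shapes are illustrative only. As noted above, this would not be expected to fix artifacts that only appear at side or back camera poses.

```python
import numpy as np


def masked_mse(synth, target, mask, eps=1e-8):
    """Mean squared error over the pixels where mask is nonzero.

    synth, target: float arrays of shape (H, W, C)
    mask: array of shape (H, W), 1 inside the region of interest, 0 outside
    """
    # Add a channel axis so the mask broadcasts over C.
    m = mask.astype(np.float64)[..., None]
    sq_err = (synth.astype(np.float64) - target.astype(np.float64)) ** 2
    # Normalize by the number of masked values (pixels times channels);
    # eps avoids division by zero for an empty mask.
    denom = m.sum() * synth.shape[-1] + eps
    return float((sq_err * m).sum() / denom)
```

Pixels outside the mask contribute nothing to the loss, so errors in the background would not pull the optimized latent away from the head region.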

Thx a lot! Yeah it makes sense