face enhancement questions

tsing90 opened this issue 5 years ago

tsing90 commented 5 years ago

Hi, glad to see your great work; it's really useful. I have two questions about the face enhancement part:

1. When doing the face crop, you set crop_size = 48 for 512-frame videos. But that may not be accurate, as the person (or their head) can appear large or small depending on the distance to the camera. Is there a better way to do the crop? Thanks
2. Is there an official paper about face-gan? I couldn't find any related reference in the 'Everybody Dance Now' paper. And is there any other implementation of face-gan on GitHub?

Many thanks

Lingbo Yang · Answer 1 · Sat Jan 19 2019 21:19:41 GMT+0800 (China Standard Time)

1. For my training video a fixed crop_size is good enough. But of course you can adjust the crop_size, since the face GAN is fully convolutional and therefore supports varying input sizes. You can roughly estimate the head size for each frame from the distance between the head and neck keypoints.
2. I just borrowed a simple image restoration network for face enhancement. It's quite straightforward. You can pick any other image enhancement network that suits your needs. C'mon, it's just a 48-by-48 patch, how hard can this be?
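
The head-to-neck estimate suggested above could be sketched roughly as follows. This is not code from the repository: the function names, the `scale` factor of 1.5, the size limits, and the rounding to a multiple of 8 are all assumptions for illustration, and the keypoints are assumed to be (x, y) pixel coordinates from a pose estimator.

```python
import math


def estimate_crop_size(head, neck, scale=1.5, min_size=16, max_size=128, multiple=8):
    """Estimate the side length of a square face crop from two keypoints.

    head, neck: (x, y) pixel coordinates (assumed to come from a pose estimator).
    scale, min_size, max_size, multiple: hypothetical tuning values, not taken
    from the original project.
    """
    dist = math.hypot(head[0] - neck[0], head[1] - neck[1])
    # Round to a multiple so a network with strided layers sees a friendly size.
    size = int(round(scale * dist / multiple)) * multiple
    return max(min_size, min(max_size, size))


def crop_box(head, size, frame_w, frame_h):
    """Return (left, top, right, bottom) centred on the head keypoint.

    The box is shifted, not shrunk, when it would leave the frame.
    """
    half = size // 2
    left = int(round(head[0])) - half
    top = int(round(head[1])) - half
    left = max(0, min(left, frame_w - size))
    top = max(0, min(top, frame_h - size))
    return left, top, left + size, top + size


if __name__ == "__main__":
    head, neck = (256, 100), (256, 132)
    size = estimate_crop_size(head, neck)
    print(size, crop_box(head, size, 512, 512))
```

With these assumed defaults, a head-to-neck distance of 32 pixels reproduces the fixed 48-pixel crop, and the crop grows or shrinks as the person moves closer to or farther from the camera.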

tsing90 · Answer 2 · Mon Jan 21 2019 01:48:43 GMT+0800 (China Standard Time)

@Lotayou Thanks for your reply, that's really reasonable.
For hand enhancement, do you think the same method will work? Because of fast movement, the hands are sometimes blurred, which affects the training result. Do you have any ideas for solving this problem? Thanks

Lingbo Yang · Answer 3 · Mon Jan 21 2019 10:26:20 GMT+0800 (China Standard Time)

@tsing90 I haven't had time for hand enhancement yet, but I think it's gonna be harder than face enhancement. The reasons are threefold:

1. In most cases, people pay more attention to facial details than to hands.
2. Hands have more flexible movements and non-rigid deformations than heads, as human fingers can cross and entwine in various ways.
3. In most cases hands are placed in front of the body, so enhancing a patch containing a hand could very likely alter the background body texture as well, which could lead to blocking artifacts.

I guess it would be possible to enhance the hands of a certain person but I never tested it. Can you show me some of your results? Thanks

tsing90 · Answer 4 · Mon Jan 28 2019 02:16:18 GMT+0800 (China Standard Time)

@Lotayou Thanks for your reply. I trained on a video of myself, so I would prefer not to share it here. For hand enhancement, I actually found that the key problem is different from the face, which just needs fine-tuning. Even though I have the keypoints of the hands (20 for each hand), they are not accurate (and are sometimes missing!), so the model is not able to learn them properly during training.
I'm thinking about using some tricks to make the keypoints more meaningful.
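
One possible trick of this kind, sketched here purely as an illustration, is to fill in missing detections by interpolating each keypoint over time. This is an assumption about what might help, not something tested in this thread; the function name is hypothetical, and a missing detection is assumed to be represented as `None`.

```python
def fill_missing(track):
    """Fill None entries in one keypoint's trajectory across frames.

    track: list with one entry per frame, each either an (x, y) tuple or
    None when the detector missed the keypoint (assumed representation).
    Gaps between known frames are linearly interpolated; gaps at the start
    or end are filled with the nearest known position.
    """
    filled = list(track)
    known = [i for i, p in enumerate(track) if p is not None]
    if not known:
        return filled  # nothing to interpolate from

    # Hold the nearest known value at both ends of the sequence.
    for i in range(known[0]):
        filled[i] = track[known[0]]
    for i in range(known[-1] + 1, len(track)):
        filled[i] = track[known[-1]]

    # Linear interpolation inside each gap between two known frames.
    for a, b in zip(known, known[1:]):
        (xa, ya), (xb, yb) = track[a], track[b]
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = (xa + t * (xb - xa), ya + t * (yb - ya))
    return filled


if __name__ == "__main__":
    print(fill_missing([None, (0.0, 0.0), None, None, (3.0, 6.0), None]))
```

Interpolation only hides short dropouts; it does not make inaccurate detections any more accurate, so long gaps or fast hand motion would still need a different approach.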

Lingbo Yang · Answer 5 · Mon Jan 28 2019 09:24:37 GMT+0800 (China Standard Time)

I agree. Hands are very small objects, so hand pose estimation cannot be very robust or accurate. By the way, can I just sneak a peek at the hand enhancement results real quick? You don't need to expose your face:) Thx

tsing90 · Answer 6 · Tue Jan 29 2019 04:53:15 GMT+0800 (China Standard Time)

I've attached a photo of my result here; feel free to comment or ask if you would like to know more. [PS: I will delete this photo once you have looked at it :) ]

Lingbo Yang · Answer 7 · Tue Jan 29 2019 09:54:23 GMT+0800 (China Standard Time)

Thanks for your photo! Now I see where the real problem lies: for face enhancement, when you get a blurry result, it's obviously a fake. For hands, however, it's kinda hard to make the same judgement, since hands in the original video can be pretty messed up too. This is especially the case when training GANs, since the authenticity criterion does not depend on per-frame quality anymore. I think it may be better to focus on enforcing temporal consistency, maybe by introducing some RNN or C3D modules. It's also possible to use longer temporal segments, since hand regions are much smaller than the whole frame.

BTW, feel free to delete the picture anytime you want:)
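
To make the temporal consistency idea concrete, here is a deliberately simple sketch. It is not the RNN or C3D approach mentioned above; it swaps in a plain frame-difference penalty that compares how generated hand patches change between consecutive frames with how the real patches change. The function name and the flat-list patch representation are assumptions for illustration only.

```python
def temporal_consistency_penalty(generated, real):
    """Mean squared mismatch between frame-to-frame changes.

    generated, real: lists of frames of equal length; each frame is a flat
    list of pixel values for the same hand patch (assumed representation).
    Returns 0.0 when the generated motion matches the real motion exactly.
    """
    if len(generated) != len(real):
        raise ValueError("sequences must have the same number of frames")

    total, count = 0.0, 0
    for t in range(1, len(generated)):
        frames = zip(generated[t - 1], generated[t], real[t - 1], real[t])
        for g_prev, g_cur, r_prev, r_cur in frames:
            # Compare the temporal change, not the per-frame pixel values.
            diff = (g_cur - g_prev) - (r_cur - r_prev)
            total += diff * diff
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    real = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
    flicker = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
    print(temporal_consistency_penalty(real, real))
    print(temporal_consistency_penalty(flicker, real))
```

Because the penalty looks at changes between frames, a blurry but steady hand is penalized less than a sharp hand that flickers, which matches the point that per-frame quality alone is not a reliable criterion here. In an actual training setup this would be written with tensors in the training framework and added to the GAN loss.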

tsing90 · Answer 8 · Tue Jan 29 2019 20:26:17 GMT+0800 (China Standard Time)

Thanks for your comments. I am going to try 3D poses instead of 2D for this task soon, since I believe more information can be learned from them.

Lingbo Yang · Answer 9 · Wed Jan 30 2019 14:26:42 GMT+0800 (China Standard Time)

Good luck! Keep me posted if you find anything interesting then.