tencent-ailab / V-Express

V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.

Question about stage2 training. Audio_projection layer doesn't seem to be trained successfully.

arceus-jia opened this issue

I'm doing some similar training, but I'm running into problems and would like to ask for your help.
I see that you train audio_projection, motion_module, and attn2 in stage 2.
In my training, it seems that mostly the motion_module is learning, while the audio-related layers don't seem to be trained successfully.
As a result, the video is smooth, but different audio inputs produce the same mouth movements.
Do you have any tips for handling this? I've increased the weight of the mouth loss, but that doesn't seem to help either.

Thanks.
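
One way to check whether the audio-related layers are being updated at all is to compare gradient norms per module group after a backward pass. The following is only a minimal sketch, not the V-Express training code: the helper name `grad_norms_by_group` is hypothetical, and it assumes the parameter names contain the substrings audio_projection, motion_module, and attn2 mentioned above. It takes plain (name, gradient) pairs so it runs without any framework.

```python
import math
from collections import defaultdict

# Substrings taken from the module names mentioned in the question; how they
# appear inside real parameter names is an assumption.
GROUPS = ("audio_projection", "motion_module", "attn2")


def grad_norms_by_group(named_grads, groups=GROUPS):
    """Return {group: L2 norm over all gradients whose name contains group}.

    named_grads: iterable of (name, grad) pairs, where grad is a flat
    sequence of floats, or None if the parameter received no gradient.
    A name is counted only under the first group it matches.
    """
    sq_sums = defaultdict(float)
    for name, grad in named_grads:
        if grad is None:
            continue  # parameter was frozen or unused in this step
        for group in groups:
            if group in name:
                sq_sums[group] += sum(g * g for g in grad)
                break
    return {group: math.sqrt(sq_sums[group]) for group in groups}


# Toy input with made-up parameter names and gradient values.
fake_grads = [
    ("audio_projection.proj.weight", [0.0, 0.0]),
    ("unet.down_blocks.0.motion_module.weight", [3.0, 4.0]),
    ("unet.down_blocks.0.attn2.to_k.weight", None),
]
norms = grad_norms_by_group(fake_grads)
```

With PyTorch, the pairs could be built after `loss.backward()` from `model.named_parameters()`, for example `(n, None if p.grad is None else p.grad.flatten().tolist())`. A norm for audio_projection that stays near zero while motion_module is large would be consistent with the symptom described here, although it does not by itself explain the cause.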

Could you please share your training code? I have run into some trouble with training.