X-LANCE / AniTalker

[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"

Home Page: https://x-lance.github.io/AniTalker/


Test result with custom AI-generated image

nitinmukesh opened this issue

Source image
source_image

Cropped image (crop_image2.py). This is used as input
crop_image

input audio (existing audio)
AniTalker\test_demos\audios\english_female.wav

Prompt
python ./code/demo.py \
    --infer_type mfcc_pose_only \
    --stage1_checkpoint_path "./ckpts/stage1.ckpt" \
    --stage2_checkpoint_path "./ckpts/stage2_pose_only_mfcc.ckpt" \
    --test_image_path "./test_demos/portraits/crop_image.png" \
    --test_audio_path "./test_demos/audios/english_female.wav" \
    --result_path "./outputs/crop_mfcc/" \
    --control_flag \
    --seed 0 \
    --pose_yaw 0 --pose_pitch 0 --pose_roll 0

Output without --face_sr

crop_image-english_female.mp4

Output with --face_sr

crop_image-english_female_SR.mp4

Output with custom audio and --face_sr switch

crop_image-intro_audio_SR.mp4

Thank you for your test.

We suggest you try the Hubert model. We have updated the usage instructions and uploaded the Hubert model, so you can now test your own audio with Hubert features directly. The usage is as follows:

Re-download the checkpoints (including the Hubert model) into the ckpts directory and additionally run pip install transformers==4.19.2. When the code does not detect a Hubert feature path, it will extract the features automatically and print extra instructions on how to resolve any errors encountered.
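For reference, HuBERT speech models consume 16 kHz mono waveforms, so custom audio recorded at another sample rate has to be downmixed and resampled before feature extraction. Below is a minimal numpy sketch of that preprocessing step; the repo's own extractor likely uses librosa or torchaudio instead, and the function name here is purely illustrative:

```python
import numpy as np

def to_mono_16k(wav: np.ndarray, sr: int, target_sr: int = 16000) -> np.ndarray:
    """Downmix (samples, channels) audio to mono and linearly resample to target_sr."""
    if wav.ndim == 2:               # stereo or multi-channel -> mono
        wav = wav.mean(axis=1)
    if sr == target_sr:
        return wav
    duration = len(wav) / sr
    n_out = int(round(duration * target_sr))
    t_in = np.linspace(0.0, duration, num=len(wav), endpoint=False)
    t_out = np.linspace(0.0, duration, num=n_out, endpoint=False)
    return np.interp(t_out, t_in, wav)  # piecewise-linear resampling

# one second of 44.1 kHz audio becomes 16000 samples
x = np.zeros(44100)
y = to_mono_16k(x, 44100)
```

Linear interpolation is crude (no anti-aliasing filter), but it illustrates the shape of the transform the feature extractor expects.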

We are also looking for better crop algorithms, as we have found that the face's initial position in the frame has a significant impact on the result. We will keep monitoring your results and crop algorithms. Thanks.

Hi, we ran the Hubert model for your reference:

aiface2-AniTalker_intro_audio.mp4
python ./code/demo.py \
    --infer_type 'hubert_audio_only' \
    --stage1_checkpoint_path 'ckpts/stage1.ckpt' \
    --stage2_checkpoint_path 'ckpts/stage2_audio_only_hubert.ckpt' \
    --test_image_path 'test_demos/portraits/aiface2.png' \
    --test_audio_path 'test_demos/audios/AniTalker_intro_audio.wav' \
    --test_hubert_path 'no_path_extracting_online' \
    --result_path 'outputs/aiface2_hubert/'

Here is the 512×512 output with face super-resolution:

aiface2-AniTalker_intro_audio_SR.mp4
python ./code/demo.py \
    --infer_type 'hubert_audio_only' \
    --stage1_checkpoint_path 'ckpts/stage1.ckpt' \
    --stage2_checkpoint_path 'ckpts/stage2_audio_only_hubert.ckpt' \
    --test_image_path 'test_demos/portraits/aiface2.png' \
    --test_audio_path 'test_demos/audios/AniTalker_intro_audio.wav' \
    --test_hubert_path 'extract_online' \
    --result_path 'outputs/aiface2_hubert/' \
    --face_sr

@liutaocode
Thank you for sharing the output; it looks good. Can't wait to try the updates.

You have mentioned

Re-download the checkpoint with the Hubert model into the ckpts directory

https://huggingface.co/taocode/anitalker_ckpts/tree/main
I checked the above and there are no updates to the models. Did you upload the new Hubert models, or am I missing something? I only see that chinese-hubert-large was updated.

image

I am also recording part 2 of the video, on how to crop the image, and will hopefully post it tonight. Thank you for your support and for your amazing work on the release of this tool.

Oh, sorry for the confusion. I updated the code to make Hubert audio feature extraction convenient; the model itself hasn't been updated. I see that you've already run the video, so you probably don't need to download anything again. The current code update just makes it easier to test custom audio with the Hubert model.

Thank you for the clarification.

So it's just a matter of

git pull
and
pip install transformers==4.19.2

@nitinmukesh thank you for your PR, but it is not cropping the image. I have applied your PR and it runs successfully, but it does not crop the image, and because of that, when given a full image (not just a face) AniTalker cannot generate a talking face. Could you please recheck the cropping code? It should automatically crop the image and use the cropped image as input, correct? But it is not working like that.

@Arvrairobo

I'm very sorry for the trouble the crop issue has caused everyone. My crop algorithm is available at https://github.com/liutaocode/talking_face_preprocessing/blob/master/extract_cropped_faces.py. I previously attempted to integrate it but postponed the plan, because the crop algorithm depends on many environment factors. Since I've been busy with other matters recently, if there are no PRs in the meantime, I will update it at the end of the month.
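The core of such a crop step is usually: detect a face box, expand it by a margin, clamp it to the frame, and cut out a square region. A minimal numpy sketch of the expand-and-clamp part (the margin value and function name are illustrative assumptions, not the repo's actual code; face detection itself is left to a detector such as the one in extract_cropped_faces.py):

```python
import numpy as np

def square_crop(img: np.ndarray, box, margin: float = 0.4) -> np.ndarray:
    """Expand a detected face box (x, y, w, h) into a margined square
    clamped to the frame, then crop it. margin is a fraction of box size."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2            # face-box center
    side = int(max(w, h) * (1 + margin))     # square side with margin
    H, W = img.shape[:2]
    side = min(side, H, W)                   # square must fit in the frame
    x0 = int(np.clip(cx - side / 2, 0, W - side))
    y0 = int(np.clip(cy - side / 2, 0, H - side))
    return img[y0:y0 + side, x0:x0 + side]

frame = np.zeros((480, 640, 3), dtype=np.uint8)
face = square_crop(frame, (300, 100, 120, 150))  # -> 210x210 square crop
```

Keeping the crop square and centered on the face is what gives the model the "face fills most of the frame" input it was trained on, which is why full, uncropped images fail.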

@liutaocode I have seen that extract_cropped_faces.py code, but it does not extract or crop a face from a single image; it seems to crop from video frames. If you give me any lead or head start on this, I can code it up and send you a PR.

@Arvrairobo
Are you on Windows or Linux?
If Windows, try the new update:
https://github.com/X-LANCE/AniTalker/blob/main/md_docs/run_on_windows.md

It comes with auto-crop (just select the checkbox in the WebUI).

If Linux, let me know and I will give you additional steps to make it work.

Also, please log a new issue, as this one is closed.