Bark audio model and talking head additions
sarutobiumon opened this issue
It would be amazing if you could:
- Turn the "talking head" images into animated GIFs lip-synced to the WAV audio generated by TTS using Bark (Bark is currently the best and most realistic, emotion-driven audio model that is free to use, even better than the best commercial closed-source model, Eleven Labs).
- Then generate an MP4 from the combination of the animated GIF and the WAV audio on the fly, replacing the starting-point animated GIF on the screen.
This can be done by integrating code from one of the following choices:
- from this one-click install GUI: https://www.youtube.com/watch?v=f_NUZDBiaZg
- or using SadTalker: https://www.youtube.com/watch?v=aJIq_UoZv24
- or this Google Colab Python code (supports 30+ languages): https://spltech.co.uk/using-wav2lip-and-google-cloud-wavenet-to-create-voice-overs-in-more-than-30-languages/
- or using VideoReTalking (Audio-based Lip Synchronization for Talking Head Video): https://colab.research.google.com/github/vinthony/video-retalking/blob/main/quick_demo.ipynb
  Demo: https://www.youtube.com/watch?v=CgZVKSkdtRo

Bark oobabooga TTS extension:
https://github.com/wsippel/bark_tts
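
For the GIF + WAV to MP4 step, here is a rough sketch of how it might be done by shelling out to ffmpeg from Python. This is only an illustration, not code from any of the projects linked above: the function names `build_mux_command` and `mux_gif_and_wav` and the file names are hypothetical, and it assumes an `ffmpeg` binary with libx264 and AAC support is on the PATH.

```python
import shutil
import subprocess
from typing import List


def build_mux_command(gif_path: str, wav_path: str, out_path: str) -> List[str]:
    """Build an ffmpeg command that plays a GIF under a WAV track and writes an MP4."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-ignore_loop", "0",  # GIF demuxer option: honor the loop count stored in the GIF
        "-i", gif_path,       # input 0: animated GIF (video)
        "-i", wav_path,       # input 1: WAV audio (e.g. produced by a TTS model)
        "-shortest",          # stop when the shorter stream ends (the audio, if the GIF loops forever)
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",  # round width/height down to even values
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",  # pixel format that most players accept
        "-c:a", "aac",
        out_path,
    ]


def mux_gif_and_wav(gif_path: str, wav_path: str, out_path: str) -> None:
    """Run the command above; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_mux_command(gif_path, wav_path, out_path), check=True)
```

Something like `mux_gif_and_wav("head.gif", "speech.wav", "head.mp4")` would then produce the file to show in place of the starting GIF. Note that the lip-sync itself still has to come from one of the tools listed above; this only combines an already animated GIF with the audio.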
Hi, thanks for your suggestions. We will try to add these models to AudioGPT as soon as possible.