Imtiyaz Momin's repositories
bark-TTS
🔊 Text-Prompted Generative Audio Model
audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
ChatGPTFromKB
Intelligent customer support bot
cosmic-media-extension
Search millions of high-quality royalty-free stock photos, images, and videos from popular online media services.
DPE
[CVPR 2023] DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
faster-whisper
Faster Whisper transcription with CTranslate2
GeneFace
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
llm-answer-engine
Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Mixtral, Langchain, OpenAI, Brave & Serper
OpenVoice
Instant voice cloning by MyShell
pywinassistant
The first open-source Large Action Model generalist Artificial Narrow Intelligence that fully controls human user interfaces using only natural language. PyWinAssistant utilizes the technique from "Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models."
Real-Time-Accent-Conversion
Real Time Foreign Accent Conversion
roomGPT
Upload a photo of your room to generate your dream room with AI.
roop
one-click deepfake (face swap)
SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
SadTalker-Video-Lip-Sync
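This project builds on SadTalker to implement Wav2Lip-style lip synthesis for video. It generates speech-driven lip shapes from a video file and applies a configurable enhancement to the facial region, sharpening the synthesized lip (face) area and improving the clarity of the generated lips. It uses DAIN, a deep learning frame-interpolation algorithm, to fill in frames of the generated video and smooth the transitions between synthesized lip movements, making the result more fluid, realistic, and natural.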
Scrapegraph-ai
Python scraper based on AI
ShortGPT
AI framework for automating video and short content creation
stable-diffusion-webui
Stable Diffusion web UI
StableVideo
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Storyblocks
✨ Experience the enchantment of Story Block: an open-source project merging AI text generation and image synthesis to create captivating video narratives. 📚🎥 Watch as your text prompts come to life with stunning visuals, exploring new frontiers in storytelling!
storyteller
Multimodal AI Story Teller, built with Stable Diffusion, GPT, and neural text-to-speech
stripe-sync-engine
Sync your Stripe account to your Postgres database.
text2cinemagraph
Official Pytorch implementation of Text2Cinemagraph: Synthesizing Artistic Cinemagraphs from Text
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
whisper.cpp
Port of OpenAI's Whisper model in C/C++