AmorJNYH's repositories
audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
deepface
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
facefusion
Next generation face swapper and enhancer
FastSAG
FastSAG: A Diffusion Probabilistic Model for Singing Accompaniment Generation
FloatingX
Permission-free floating window for Android. Supports global (in-app) and local (View-level) floating, edge snapping, rebound, custom animations, position saving, and position repair after windowing and split-screen.
ForwardTacotron
⏩ Generating speech in a single forward pass without any attention!
GeneFacePlusPlus
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
jitsi-meet
Jitsi Meet - Secure, Simple and Scalable Video Conferences that you use as a standalone app or embed in your web application.
LangSegment
A powerful multilingual (97 languages) tool that automatically recognizes and segments mixed-language text for TTS.
languagecodec_tmp
Temporary anonymous version
leedl-tutorial
Hung-yi Lee's Deep Learning Tutorial (recommended by Professor Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
MediaCrawler
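Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments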
megatts2
Unofficial implementation of Megatts2
MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
metahuman-stream
Real-time streaming digital human based on NeRF
metavoice-src
Foundational model for human-like, expressive TTS
NBSS
The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
parler-tts
Inference and training library for high-quality TTS models.
pflow-encodec
Implementation of a TTS model based on NVIDIA's P-Flow TTS paper
ppgs
High-Fidelity Neural Phonetic Posteriorgrams
pretty-midi
Utility functions for handling MIDI data in a nice/intuitive way.
supervoice-gpt
GPT-style network for text phonemization with durations
ttts
Train the next generation of TTS systems.
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
WhisperLive
A nearly-live implementation of OpenAI's Whisper.
xsrp
DOA