Yuhang's repositories
SpeechAlgorithms
Speech Algorithms
APNet2
Source code of APNet2, a vocoder
awesome
😎 Awesome lists about all kinds of interesting topics
Awesome-GPT-Store
A collection of major publicly available GPTs
ChatWaifu-marai
Combines ChatGPT with MoeGoe TTS to create a chatting waifu for Marai
CyberWaifu
GPT + Tacotron2/VITS + Live2D = CyberWaifu
distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Free-Certifications
A curated list of free courses & certifications.
g2p-zh-en
Chinese and English bilingual G2P
GPT-vup
GPT-vup Bilibili | Douyin | AI | Virtual streamer
hackingtool
ALL IN ONE Hacking Tool For Hackers
IP_LAP
CVPR 2023 talking-face implementation of Identity-Preserving Talking Face Generation With Landmark and Appearance Priors
LiveWhisper
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.
megatts2
Unofficial implementation of Megatts2
mustango
Mustango: Toward Controllable Text-to-Music Generation
OpenPhonemizer
Permissively licensed, open-source, local IPA phonemizer (G2P) powered by deep learning.
RefAudioEmoTagger
A script for automatic batch emotion labeling of audio, based on Emotion2Vec
roop
one-click face swap
speech-synthesis-paper
List of speech synthesis papers.
speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
SpEx_Plus
SpEx+(tied) source code
stable-audio-tools
Generative models for conditional audio generation
stable-speech
Reproduction of Stability AI's Text-to-Speech model.
StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
w2v2-how-to
How to use our public wav2vec2 dimensional emotion model
wukong-robot
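🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports multi-turn ChatGPT conversations and may be the first open-source smart speaker project to support brain-computer interaction.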