CaoYuhang

Yuhang's repositories

SpeechAlgorithms

Speech Algorithms

Language: C | Apache-2.0 | 1 | 0 | 0

APNet2

Source code of APNet2, a vocoder

Language: Python | MIT | 0 | 0 | 0

awesome

😎 Awesome lists about all kinds of interesting topics

CC0-1.0 | 0 | 0 | 0

Awesome-GPT-Store

A collection of major GPTs available to the public

MIT | 0 | 0 | 0

ChatWaifu-marai

Combines ChatGPT with Moegoe TTS to create a chatting waifu for Marai

MIT | 0 | 0 | 0

CyberWaifu

GPT + Tacotron2/VITS + Live2D = CyberWaifu

MIT | 0 | 0 | 0

distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

MIT | 0 | 0 | 0

Free-Certifications

A curated list of free courses & certifications.

MIT | 0 | 0 | 0

g2p-zh-en

Chinese and English bilingual G2P

NOASSERTION | 0 | 0 | 0

g2p_mix

MIT | 0 | 0 | 0

GPT-vup

GPT-vup: Bilibili | Douyin | AI | virtual streamer

0 | 0 | 0

hackingtool

ALL IN ONE Hacking Tool For Hackers

Language: Python | MIT | 0 | 0 | 0

IP_LAP

CVPR 2023 talking-face implementation of Identity-Preserving Talking Face Generation With Landmark and Appearance Priors

Language: Python | Apache-2.0 | 0 | 0 | 0

LiveWhisper

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

Language: Python | MIT | 0 | 0 | 0

M2MeT2.0

0 | 0 | 0

megatts2

Unofficial implementation of Megatts2

MIT | 0 | 0 | 0

mustango

Mustango: Toward Controllable Text-to-Music Generation

MIT | 0 | 0 | 0

OpenPhonemizer

Permissively licensed, open-source, local IPA phonemizer (G2P) powered by deep learning.

BSD-3-Clause-Clear | 0 | 0 | 0

RefAudioEmoTagger

A script for automatic batch emotion labeling of audio, based on Emotion2Vec

GPL-3.0 | 0 | 0 | 0

roop

one-click face swap

GPL-3.0 | 0 | 0 | 0

SpatialCodec

0 | 0 | 0

speech-synthesis-paper

List of speech synthesis papers.

MIT | 0 | 0 | 0

speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

BSD-3-Clause | 0 | 0 | 0

SpEx_Plus

SpEx+(tied) source code

MIT | 0 | 0 | 0

stable-audio-tools

Generative models for conditional audio generation

MIT | 0 | 0 | 0

stable-speech

Reproduction of Stability AI's Text-to-Speech model.

Apache-2.0 | 0 | 0 | 0

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

MIT | 0 | 0 | 0

voicefilter

Unofficial PyTorch implementation of Google AI's VoiceFilter system

Language: Python | 0 | 0 | 0

w2v2-how-to

How to use our public wav2vec2 dimensional emotion model

MIT | 0 | 0 | 0

wukong-robot

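🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot and smart speaker project. It supports multi-turn conversations with ChatGPT and may be the first open-source smart speaker project to support brain-computer interaction.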

MIT | 0 | 0 | 0