CaoYuhang

Yuhang's starred repositories

e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

Language: Python MIT 16100

chatllama

ChatLLaMA 📢 Open source implementation for LLaMA-based ChatGPT runnable on a single GPU. 15x faster training process than ChatGPT

Language: Python 120200

detail_tts

All generative models in one for a better TTS model

Language: Python 3600

stable-audio-tools

Generative models for conditional audio generation

Language: Python MIT 233300

OpenPhonemizer

An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.

Language: Python BSD-3-Clause-Clear 7200

SpeechAlgorithms

Speech Algorithms

Apache-2.0 100

mustango

Mustango: Toward Controllable Text-to-Music Generation

Language: Python MIT 30500

MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

Language: Python MIT 410500

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python Apache-2.0 2084300

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python Apache-2.0 135400

metavoice-src

Foundational model for human-like, expressive TTS

Language: Python Apache-2.0 355000

gpt-neo

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Language: Python MIT 818000

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python MIT 2960400

DeepFilterNet

Noise suppression using deep filtering

Language: Python NOASSERTION 220600

megatts2

Unofficial implementation of Megatts2

Language: Python MIT 24500

awesome

😎 Awesome lists about all kinds of interesting topics

CC0-1.0 31226500

hello-algo

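"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version ongoing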

Language: Java NOASSERTION 8910100

chatgpt_system_prompt

A collection of GPT system prompts and various prompt injection/leaking knowledge.

Language: HTML MIT 775700

SpeechAlgorithms

Speech Algorithms

Language: C Apache-2.0 72900

streamlit-audio-recorder

Record Audio from the User's Microphone in Apps that are Deployed to the Web. (via Browser Media-API, REACT-based, Streamlit Custom Component)

Language: TypeScript MIT 39400

speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Language: Python BSD-3-Clause 820000

snowboy

Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

Language: C++ NOASSERTION 304200

Mixly_Arduino

A visual programming editor based on Blockly for Arduino, Microbit, MicroPython, and Python

Language: C Apache-2.0 24100

Free-Certifications

A curated list of free courses & certifications.

MIT 2484300

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python Apache-2.0 615100

RemoveAdblockThing

The intrusive "Ad blockers are not allowed on YouTube" message is annoying. This open-source project aims to address this issue by providing a solution to bypass YouTube's ad blocker detection.

Language: JavaScript MIT 595500

MoeGoe

Executable file for VITS inference

Language: Python MIT 231500

magvit2-pytorch

Implementation of MagViT2 Tokenizer in PyTorch

Language: Python MIT 49800

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python MIT 151200

wukong-robot

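🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot and smart speaker project. It supports ChatGPT multi-turn conversation and may also be the first open-source smart speaker project to support brain-computer interaction.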

Language: Python MIT 607300