barryhunt's starred repositories

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 32071 | Issues: 275 | Issues: 1063

Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a RAG and Agent application built with Langchain on local-knowledge-based LLMs such as ChatGLM, Qwen, and Llama.

Language: TypeScript | License: Apache-2.0 | Stargazers: 29968 | Issues: 281 | Issues: 3568

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 29348 | Issues: 188 | Issues: 966

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: NOASSERTION | Stargazers: 27902 | Issues: 166 | Issues: 384

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 27338 | Issues: 206 | Issues: 206

Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

Language: Python | License: MIT | Stargazers: 21176 | Issues: 157 | Issues: 1518

Clash-for-Windows_Chinese

Chinese-localized version of Clash for Windows. Provides the Chinese-localized build of Clash for Windows, a localization patch, and a localized installer.

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

Language: Python | License: NOASSERTION | Stargazers: 15634 | Issues: 135 | Issues: 615

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10555 | Issues: 139 | Issues: 338

so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Language: Python | License: NOASSERTION | Stargazers: 8552 | Issues: 67 | Issues: 357

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python | License: AGPL-3.0 | Stargazers: 7519 | Issues: 46 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 7251 | Issues: 88 | Issues: 111

video-subtitle-extractor

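A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. The deep-learning-based extraction framework covers subtitle region detection and subtitle content extraction.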

Language: Python | License: Apache-2.0 | Stargazers: 5323 | Issues: 44 | Issues: 254

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 4780 | Issues: 53 | Issues: 937

VITS-fast-fine-tuning

A pipeline for fine-tuning VITS for fast speaker-adaptation TTS and many-to-many voice conversion.

Language: Python | License: Apache-2.0 | Stargazers: 4651 | Issues: 39 | Issues: 563

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4280 | Issues: 59 | Issues: 136

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 2885 | Issues: 48 | Issues: 60

ADeus

An open source AI wearable device that captures what you say and hear in the real world and then transcribes and stores it on your own server. You can then chat with Adeus using the app, and it will have all the right context about what you want to talk about - a truly personalized, personal AI.

Language: TypeScript | License: NOASSERTION | Stargazers: 2847 | Issues: 54 | Issues: 34

lyrebird

🦜 Simple and powerful voice changer for Linux, written with Python & GTK

Language: Python | License: MIT | Stargazers: 1825 | Issues: 22 | Issues: 110

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python | License: MIT | Stargazers: 1239 | Issues: 55 | Issues: 30

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | Stargazers: 947 | Issues: 18 | Issues: 84

VoiceprintRecognition-Pytorch

This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++; more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

Language: Python | License: Apache-2.0 | Stargazers: 699 | Issues: 10 | Issues: 65

book-text-to-speech

A book about Text-to-Speech (TTS) in Chinese.

Language: TeX | License: Apache-2.0 | Stargazers: 560 | Issues: 7 | Issues: 4

vits2_pytorch

unofficial vits2-TTS implementation in pytorch

Language: Python | License: MIT | Stargazers: 466 | Issues: 25 | Issues: 54

Language: Python | License: MIT | Stargazers: 355 | Issues: 9 | Issues: 15

AudioClassification-Pytorch

A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.

Language: Python | License: Apache-2.0 | Stargazers: 346 | Issues: 6 | Issues: 27

RapidVideOCR

Extract video hard subtitles and automatically generate corresponding srt files.

Language: Python | License: Apache-2.0 | Stargazers: 298 | Issues: 3 | Issues: 31

GPT-SoVITS-Inference

Inference Specialization

Language: Python | License: MIT | Stargazers: 219 | Issues: 4 | Issues: 0

audio-preprocess

Preprocess Audio for training

Language: Python | License: Apache-2.0 | Stargazers: 193 | Issues: 8 | Issues: 6

RVC-Studio

The best looking and most functional webui for RVC related tasks. See website for UI demo:

Language: Python | License: MIT | Stargazers: 164 | Issues: 6 | Issues: 24