Beast code in Giters

markyouyuren's repositories

Avocodo

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

Language: Python | License: MIT | 0 | 1 | 0

BeatNet

Implementation of BeatNet, an AI-based joint beat, downbeat, tempo, and meter tracking system using a CRNN and particle filtering. A state-of-the-art online model as of 2021 (ISMIR 2021).

Language: Python | License: CC-BY-4.0 | 0 | 1 | 0

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python | License: AGPL-3.0 | 0 | 0 | 0

DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Language: Python | License: MIT | 0 | 1 | 0

DiffSinger-1

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Language: Python | License: MIT | 0 | 1 | 0

GenerSpeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Language: Python | License: MIT | 0 | 1 | 0

GPT-SoVITS

Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | 0 | 0 | 0

Linly-Talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬

Language: Python | License: MIT | 0 | 0 | 0

MixGAN-TTS

MixGAN-TTS: End-to-End Speech Synthesis Based on Diffusion Model

Language: Python | License: MIT | 0 | 1 | 0

MSMC-TTS

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Language: Python | License: MIT | 0 | 1 | 0

NATSpeech

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

Language: Python | License: MIT | 0 | 1 | 0

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python | License: MIT | 0 | 0 | 0

NeuralSVB

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Language: Python | 0 | 1 | 0

PaddleBoBo

A virtual streamer built on PaddlePaddle

0 | 1 | 0

Realtime-Voice-Clone-Chinese

Clone your voice and generate arbitrary speech content. Clone a voice in 5 seconds to generate arbitrary speech in real time.

Language: Python | License: NOASSERTION | 0 | 1 | 0

SadTalker-Video-Lip-Sync

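This project implements Wav2lip-style video lip synthesis based on SadTalkers. It generates speech-driven lip shapes from a video file and applies a configurable enhancement to the facial region, sharpening the synthesized lip (face) area and improving the clarity of the generated lips. It uses the DAIN deep-learning frame-interpolation algorithm to add frames to the generated video, filling in the motion transitions of the synthesized lips between frames so that they look smoother, more realistic, and more natural.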

Language: Python | 0 | 0 | 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | 0 | 0 | 0

speechgpt

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

Language: TypeScript | License: MIT | 0 | 0 | 0

TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

Language: Python | License: NOASSERTION | 0 | 1 | 0

ttsmms

TTS with The Massively Multilingual Speech (MMS) project

Language: Python | License: MIT | 0 | 0 | 0

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python | License: MIT | 0 | 0 | 0

VAEJETS

Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech

Language: Jupyter Notebook | License: MIT | 0 | 1 | 0

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python | License: Apache-2.0 | 0 | 1 | 0

VI-SVS

Singing voice synthesis built with VITS and Opencpop; different from VISinger.

Language: Python | License: Apache-2.0 | 0 | 1 | 0

vispeech

A TTS model based on VITS, FastSpeech2, and VISinger

Language: Python | License: MIT | 0 | 0 | 0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python | License: MIT | 0 | 1 | 0

VITS-fast-fine-tuning

A VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion

Language: Python | License: Apache-2.0 | 0 | 0 | 0

vits-with-pith

vits

Language: Python | License: MIT | 0 | 0 | 0

VITSinger

Singing Voice Speech modeling test

Language: Python | License: MIT | 0 | 1 | 0

whisperer

Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.

Language: Jupyter Notebook | 0 | 1 | 0