markyouyuren

markyouyuren's repositories

Avocodo

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

BeatNet

Implementation of BeatNet, an AI-based joint beat, downbeat, tempo, and meter tracking system using a CRNN and particle filtering. State-of-the-art online model of 2021 (ISMIR 2021).

Language: Python, License: CC-BY-4.0, Stargazers: 0, Issues: 1, Issues: 0

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python, License: AGPL-3.0, Stargazers: 0, Issues: 0, Issues: 0

DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

DiffSinger-1

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

GenerSpeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Linly-Talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

MixGAN-TTS

MixGAN-TTS: End-to-End Speech Synthesis Based on Diffusion Model

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

MSMC-TTS

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

NATSpeech

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

NeuralSVB

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Language: Python, Stargazers: 0, Issues: 1, Issues: 0

PaddleBoBo

A virtual streamer built on PaddlePaddle

Stargazers: 0, Issues: 1, Issues: 0

Realtime-Voice-Clone-Chinese

Clone your voice and generate arbitrary speech content. Clone a voice in 5 seconds to generate arbitrary speech in real time.

Language: Python, License: NOASSERTION, Stargazers: 0, Issues: 1, Issues: 0

SadTalker-Video-Lip-Sync

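This project implements Wav2lip-style video lip synthesis based on SadTalkers. Lip shapes are generated by driving a video file with speech. A configurable enhancement is applied to the face region to enhance the synthesized lip (face) area and improve the clarity of the generated lips. The DAIN deep-learning frame-interpolation algorithm is then used to fill in frames of the generated video, smoothing the transitions between synthesized lip movements so that the lips look more fluid, realistic, and natural.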

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

speechgpt

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

Language: TypeScript, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

Language: Python, License: NOASSERTION, Stargazers: 0, Issues: 1, Issues: 0

ttsmms

TTS with The Massively Multilingual Speech (MMS) project

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

VAEJETS

Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 1, Issues: 0

VI-SVS

Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger.

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 1, Issues: 0

vispeech

A TTS model based on VITS, FastSpeech2, and VISinger

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

VITSinger

Singing Voice Speech modeling test

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

whisperer

Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.

Language: Jupyter Notebook, Stargazers: 0, Issues: 1, Issues: 0