KonstantineGoudz

Konstantine Goudz's starred repositories

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python, License: MIT, 29937 426 4173

Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

Language: Python, License: MIT, 21803 162 1536

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python, License: MIT, 16907 153 1206

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++, License: MIT, 7723 76 152

VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io

Language: Python, License: MIT, 7476 82 151

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python, License: MIT, 4339 58 141

open_flamingo

An open-source framework for training large multimodal models.

Language: Python, License: MIT, 3602 47 173

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python, License: MIT, 3338 58 70

audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Language: Python, License: MIT, 2338 60 167

releases

License: GPL-3.0, 2007 35 293

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python, License: Apache-2.0, 1960 50 126

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python, License: MIT, 1245 54 31

AVeryComfyNerd

ComfyUI related stuff and things

HierSpeechpp

The official implementation of HierSpeech++

Language: Python, License: MIT, 1143 57 50

papermage

library supporting NLP and CV research on scientific papers

Language: Python, License: Apache-2.0, 654 9 31

bark-voice-cloning-HuBERT-quantizer

The code for the bark-voicecloning model. Training and inference.

Language: Python, License: MIT, 621 17 42

voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Language: Python, License: MIT, 569 50 25

HD-Painter

HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

Language: Python, License: MIT, 261 21 19

spear-tts-pytorch

Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch

Language: Python, License: MIT, 249 28 6

EfficientWord-Net

OneShot Learning-based hotword detection.

Language: Jupyter Notebook, License: Apache-2.0, 215 12 36

CoMoSpeech

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

Language: Python, License: MIT, 168 11 11

Bridge-TTS

Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).

UniCATS-CTX-vec2wav

[AAAI 2024] Code for CTX-vec2wav in UniCATS

Language: Python, 111 10 9

VALL-E-X-Trainer-by-CustomData

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io

Language: Python, License: MIT, 62 50

unicats

UniCATS-CTX-txt2vec

[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS

Language: Python, 57 7 11

bark-data-gen

Create training data for training a voice cloner for bark text to speech.

Language: Jupyter Notebook, License: MIT, 44 3 4

audio-diffusion-pytorch-fork

Audio generation using diffusion models, in PyTorch.

Language: Python, License: MIT, 44 20

EDMSound

Codebase and project page for EDMSound

Language: Python, License: MIT, 2800

bark-with-voice-clone

🔊 Text-prompted Generative Audio Model - With the ability to clone voices

Language: Jupyter Notebook, License: NOASSERTION, 20 10