Beast code in Giters

sunxh16's repositories

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python | 0 0 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | 0 0 0

async_cosyvoice

Accelerate CosyVoice2 inference with vLLM

Language: Jupyter Notebook | License: Apache-2.0 | 0 0 0

book-text-to-speech

A book about Text-to-Speech (TTS) in Chinese.

Language: TeX | License: Apache-2.0 | 0 0 0

ClariNet

A PyTorch implementation of ClariNet

Language: Python | License: MIT | 0 1 0

Concatenate_wav

Concatenate WAV files (for unit selection)

Language: C++ | 0 0 0

CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Language: Python | License: Apache-2.0 | 0 0 0

F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Language: Python | License: MIT | 0 0 0

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Language: Python | License: MIT | 0 1 0

FloWaveNet

A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"

Language: Python | License: MIT | 0 1 0

GPT-SoVITS

Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | 0 0 0

NeuralVoicePuppetry

This GitHub repository contains the network architectures of NeuralVoicePuppetry.

Language: Python | License: NOASSERTION | 0 0 0

NNPACK

Acceleration package for neural networks on multi-core CPUs

Language: C | License: BSD-2-Clause | 0 1 0

nonparaSeq2seqVC_code

Implementation of non-parallel sequence-to-sequence voice conversion (VC)

Language: Python | License: MIT | 0 1 0

onnxruntime

ONNX Runtime

Language: C++ | License: MIT | 0 2 0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch

Language: Jupyter Notebook | License: MIT | 0 0 0

Python-Wrapper-for-World-Vocoder

A Python wrapper for the high-quality vocoder "World"

Language: Python | License: MIT | 0 2 0

rigl

End-to-end training of sparse deep neural networks with little-to-no performance loss.

Language: Python | License: Apache-2.0 | 0 0 0

seed-vc

Zero-shot voice conversion and singing voice conversion, with real-time support

License: GPL-3.0 | 0 0 0

SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.

Language: Python | License: MIT | 0 0 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python | License: BSD-3-Clause | 0 0 0

sp2si-code

Contains code for our work on speech-to-singing conversion (ICASSP 2020)

Language: Python | 0 0 0

SqueezeWave

Language: Python | 0 0 0

tacotron2_v1

DeepMind's Tacotron-2 TensorFlow implementation

Language: Python | License: MIT | 0 1 0

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | 0 0 0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python | License: MIT | 0 0 0

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Language: Python | License: MIT | 0 0 0

wav2letter

Facebook AI Research Automatic Speech Recognition Toolkit

Language: C++ | License: NOASSERTION | 0 2 0

waveglow

A Flow-based Generative Network for Speech Synthesis

Language: Python | License: BSD-3-Clause | 0 1 0

World

A high-quality speech analysis, manipulation and synthesis system

Language: C++ | License: NOASSERTION | 0 1 0