powei-C's starred repositories

Languagecodec

Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

Language: Python | License: MIT | 18600

MQTTS

Language: Python | License: MIT | 24200

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.

Language: Python | License: MIT | 32300

SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Language: Python | License: Apache-2.0 | 38500

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | License: MIT | 226000

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | 105700

BigVGAN

BigVGAN with Neural Source-Filter

Language: Python | License: MIT | 5000

singaligner

A compact audio-to-phoneme aligner for singing voice

Language: Python | 900

golf

A DDSP-based neural voice synthesiser.

Language: Jupyter Notebook | License: MIT | 9000

BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Language: Python | License: MIT | 81000

python-MCD

Language: Python | 4300

so-vits-svc-4.0-v2

SoftVC VITS Singing Voice Conversion

Language: Python | License: MIT | 54800

so-vits-svc-fork

so-vits-svc fork with real-time support, an improved interface, and more features.

Language: Python | License: NOASSERTION | 859300

Towards-Training-Explainable-Singing-Quality-Assessment-Network-with-Augmented-Data

Code for the paper "Towards Training Explainable Singing Quality Assessment Network with Augmented Data"

Language: Python | 1300

SingingVoice-Auto-Alignment-Revised

Revised version of the auto-annotation workflow

Language: Jupyter Notebook | 400

NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Language: Python | License: MIT | 63300

PHONEix

PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation with Phoneme Distribution Predictor

500

phonemizer

Simple text to phones converter for multiple languages

Language: Python | License: GPL-3.0 | 116500

USVG

A unified model for zero-shot singing voice conversion and synthesis

Language: Python | 2100

GMVAE

Implementation of Gaussian Mixture Variational Autoencoder (GMVAE) for Unsupervised Clustering

Language: Python | License: MIT | 29800

imitation-learning

Imitation learning algorithms

Language: Python | License: MIT | 40900

tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Language: Jupyter Notebook | License: BSD-3-Clause | 500600

RL-pytorch

Implementation of reinforcement learning in PyTorch

Language: Python | License: MIT | 900

DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Language: Python | License: Apache-2.0 | 285300

lets-do-irl

Inverse RL algorithms (APP, MaxEnt, GAIL, VAIL)

Language: Python | License: MIT | 67700

diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Language: Python | License: Apache-2.0 | 74200

diffwave-sashimi

Implementation of DiffWave and SaShiMi audio generation models

Language: Python | License: MIT | 11200

DiffWave-Vocoder

PyTorch reimplementation of the DiffWave vocoder: a high-quality, fast, and small neural vocoder.

Language: Python | License: MIT | 8500

DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability, and flexibility, based on "DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism"

Language: Python | License: Apache-2.0 | 264000

iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Language: Python | License: Apache-2.0 | 21400