crlbajsoso's starred repositories

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | MIT | 30051 428 4180

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | MIT | 20479 200 371

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | MIT | 19398 299 1344

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | Apache-2.0 | 8444 130 1060

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | MIT | 4398 58 149

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | MIT | 2331 31 112

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | Apache-2.0 | 2190 45 393

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python | Apache-2.0 | 1972 50 126

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | MIT | 1546 64 21

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | NOASSERTION | 1362 25 63

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Language: Python | MIT | 1250 54 31

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python | MIT | 887 23 33

visqol

Perceptual Quality Estimator for speech and audio

Language: C++ | Apache-2.0 | 665 28 67

WavAugment

A library for speech data augmentation in time-domain

Language: Python | MIT | 634 26 17

audio-dataset

Audio Dataset for training CLAP and other models

Language: Python | 610 21 57

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python | 552 31 40

sof

Sound Open Firmware

Language: C | NOASSERTION | 524 72 2009

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python | MIT | 448 13 49

voicefixer_main

General Speech Restoration

Language: Python | MIT | 273 11 18

SoundStorm

The reproduced code for Google's SoundStorm

Language: Python | 236 20 27

beamformers

Easy to use Beamformers for multi-channel speech separation/enhancement

Language: Python | MIT | 175 4 4

LFM

Official PyTorch implementation of the paper: Flow Matching in Latent Space

Language: Python | AGPL-3.0 | 173 9 15

gtcrn

The official implementation of GTCRN, an ultra-lite speech enhancement model.

Language: Python | MIT | 165 5 35

mfa-models

Collection of pretrained models for the Montreal Forced Aligner

Language: Python | CC-BY-4.0 | 105 7 20

dnn_wpe

Language: Python | NOASSERTION | 97 5 5

torchiva

Blind source separation with the independent vector analysis family of algorithms in torch

Language: Python | MIT | 85 5 3

INF-Generator

Generating sensor signals in isotropic noise fields

Language: MATLAB | GPL-3.0 | 43 1 2

PHASEN-PyTorch

Language: Python | 39 2 3

DR-DiffuSE

Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.

Language: Python | 3500

DiffuSE

Language: Python | Apache-2.0 | 33 20