crlbajsoso's starred repositories

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 30051 | Issues: 428 | Issues: 4180

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20479 | Issues: 200 | Issues: 371

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19398 | Issues: 299 | Issues: 1344

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8444 | Issues: 130 | Issues: 1060

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4398 | Issues: 58 | Issues: 149

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | License: MIT | Stargazers: 2331 | Issues: 31 | Issues: 112

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 2190 | Issues: 45 | Issues: 393

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python | License: Apache-2.0 | Stargazers: 1972 | Issues: 50 | Issues: 126

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | License: MIT | Stargazers: 1546 | Issues: 64 | Issues: 21

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 1362 | Issues: 25 | Issues: 63

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Language: Python | License: MIT | Stargazers: 1250 | Issues: 54 | Issues: 31

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python | License: MIT | Stargazers: 887 | Issues: 23 | Issues: 33

visqol

Perceptual Quality Estimator for speech and audio

Language: C++ | License: Apache-2.0 | Stargazers: 665 | Issues: 28 | Issues: 67

WavAugment

A library for speech data augmentation in the time domain

Language: Python | License: MIT | Stargazers: 634 | Issues: 26 | Issues: 17

audio-dataset

Audio Dataset for training CLAP and other models

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

sof

Sound Open Firmware

Language: C | License: NOASSERTION | Stargazers: 524 | Issues: 72 | Issues: 2009

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python | License: MIT | Stargazers: 448 | Issues: 13 | Issues: 49

voicefixer_main

General Speech Restoration

Language: Python | License: MIT | Stargazers: 273 | Issues: 11 | Issues: 18

SoundStorm

The reproduced code for Google's SoundStorm

beamformers

Easy to use Beamformers for multi-channel speech separation/enhancement

Language: Python | License: MIT | Stargazers: 175 | Issues: 4 | Issues: 4

LFM

Official PyTorch implementation of the paper: Flow Matching in Latent Space

Language: Python | License: AGPL-3.0 | Stargazers: 173 | Issues: 9 | Issues: 15

gtcrn

The official implementation of GTCRN, an ultra-lite speech enhancement model.

Language: Python | License: MIT | Stargazers: 165 | Issues: 5 | Issues: 35

mfa-models

Collection of pretrained models for the Montreal Forced Aligner

Language: Python | License: CC-BY-4.0 | Stargazers: 105 | Issues: 7 | Issues: 20
Language: Python | License: NOASSERTION | Stargazers: 97 | Issues: 5 | Issues: 5

torchiva

Blind source separation with the independent vector analysis family of algorithms in torch

Language: Python | License: MIT | Stargazers: 85 | Issues: 5 | Issues: 3

INF-Generator

Generating sensor signals in isotropic noise fields

Language: MATLAB | License: GPL-3.0 | Stargazers: 43 | Issues: 1 | Issues: 2

DR-DiffuSE

Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.

Language: Python | Stargazers: 35 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 33 | Issues: 2 | Issues: 0