Andong-Li-speech

AndongLi's starred repositories

langchain

🦜🔗 Build context-aware reasoning applications

Language: Python · MIT · 86818 662 6916

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python · Apache-2.0 · 11950 96 1018

flash-attention

Fast and memory-efficient exact attention

Language: Python · BSD-3-Clause · 11487 106 823

VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io

Language: Python · MIT · 7346 82 148

text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Language: Python · Apache-2.0 · 5963 104 405

guided-diffusion

Language: Python · MIT · 5759 143 133

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python · NOASSERTION · 2294 41 98

CLAP

Contrastive Language-Audio Pretraining

Language: Python · CC0-1.0 · 1213 28 78

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python · MIT · 963 26 56

RectifiedFlow

Official Implementation of Rectified Flow (ICLR2023 Spotlight)

Language: Python · 604 9 19

Meta-voicebox

Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.

MIT · 542 86 4

UniAudio

The Open Source Code of UniAudio

Language: Python · 460 39 27

Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Language: Jupyter Notebook · MIT · 419 13 41

pytorch_ema

Tiny PyTorch library for maintaining a moving average of a collection of parameters.

Language: Python · MIT · 391 4 8

Text-to-sound-Synthesis

The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"

Language: Python · 336 17 27

DiffiT

Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation

330 51 2

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Language: Python · MIT · 292 16 42

SoundStorm

The reproduced code for Google's SoundStorm

Language: Python · 228 20 26

VoiceLDM

VoiceLDM: Text-to-Speech with Environmental Context

Language: Python · Apache-2.0 · 132 7 3

EDiffSR

[IEEE TGRS 2024] EDiffSR: An Efficient Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution

Language: Python · 95 7 8

HGRN

[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Sequence Modeling

Language: Python · 58 2 2

Robust-E2E-ASR

This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.

Language: Python · MIT · 44 3 6

DOSE

DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023

Language: Python · 3201

Reti-Diff

29 5 4

DTLN-aec

DTLN net for acoustic echo cancellation

Language: Python · 29 2 1

DPMTSE

A Diffusion Probabilistic Model for Target Sound Extraction

Language: Python · 2500

pymcd

Package pymcd

Language: Python · MIT · 20 20

DCINN

Language: Python · 700

RubikCube

Language: Python · 5 20

Neural-Gradient-Regularizer

This repository contains the official implementation of Neural Gradient Regularizer (NGR).

Language: Python · 400