hbwu-ntu

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python | License: Apache-2.0 | 7418 97 1483

RepDistiller

[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods

Language: Python | License: BSD-2-Clause | 2119 17 56

awesome_lists

Awesome Lists for Tenure-Track Assistant Professors and PhD students. (Survival guide for assistant professors and PhD students)

Language: Python | License: MIT | 1393 33 1

sherpa-ncnn

Real-time speech recognition using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Raspberry Pi, VisionFive2, LicheePi4A, etc.

Language: C++ | License: Apache-2.0 | 915 36 136

torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Language: Python | License: MIT | 908 11 105

PromptCLUE

PromptCLUE, a zero-shot learning model supporting all Chinese-language tasks

Language: Jupyter Notebook | License: NOASSERTION | 645 9 19

pytorch-domain-adaptation

A collection of implementations of adversarial domain adaptation algorithms

Language: Python | License: MIT | 597 12 11

gpuRIR

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

Language: Cuda | License: AGPL-3.0 | 466 10 51

ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

Language: Python | License: BSD-3-Clause | 358 7 34

AudioClassification-Pytorch

A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.

Language: Python | License: Apache-2.0 | 352 6 27

PaSST

Efficient Training of Audio Transformers with Patchout

Language: Python | License: Apache-2.0 | 287 4 46

panns_inference

Language: Python | License: MIT | 186 4 15

beamformers

Easy-to-use beamformers for multi-channel speech separation/enhancement

Language: Python | License: MIT | 171 4 4

GradTTS

PyTorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Language: Python | License: MIT | 169 5 3

interspeech2022

160 40

MSMC-TTS

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Language: Python | License: MIT | 158 15 9

audioset-processing

Toolkit for downloading and processing Google's AudioSet dataset.

Language: Jupyter Notebook | License: MIT | 153 3 6

psla

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

Language: Python | License: BSD-3-Clause | 131 1 12

DCCRN-with-various-loss-functions

DCCRN with various loss functions

Language: Python | License: MIT | 89 1 8

Mockingjay-Speech-Representation

Official implementation of Mockingjay in PyTorch

Language: Python | License: MIT | 52 5 2

SpeakerGuard

A PyTorch library for security research on speaker recognition, released with the paper "Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition" (accepted by TDSC)

Language: Python | 33 2 6

SpecAugment-plus

A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"

Language: Python | 30 2 3

AVCleanse

ICASSP 2023: "Speaker recognition with two-step multi-modal deep cleansing"

Language: Python | 27 1 4

s3prl-private

Language: Python | License: Apache-2.0 | 3 4 6

neiwen.github.io

Neiwen's homepage

Language: JavaScript | License: MIT | 200