qiuqiangkong's starred repositories
stable-diffusion
A latent text-to-image diffusion model
speechbrain
A PyTorch-based Speech Toolkit
denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
habitat-sim
A flexible, high-performance 3D simulator for Embodied AI research.
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
minDiffusion
Self-contained, minimalistic implementation of diffusion models with PyTorch.
audio-dataset
Audio Dataset for training CLAP and other models
VAE-CVAE-MNIST
Variational Autoencoder and Conditional Variational Autoencoder on MNIST in PyTorch
sound-spaces
A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.
gqn-datasets
Datasets used to train Generative Query Networks (GQNs) in the ‘Neural Scene Representation and Rendering’ paper.
EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
Spherical-Array-Processing
A collection of MATLAB routines for acoustical array processing on spherical harmonic signals, commonly captured with a spherical microphone array.
AudioLoader
PyTorch Dataset for Speech and Music audio
AudioTaggingDoneRight
Experiments about AudioSet
Neural-Scene-Representation-and-Rendering
Generative Query Network for rendering 3D scenes from 2D images
DCASE2022-data-generator
Data generator for creating synthetic audio mixtures suitable for DCASE Challenge 2022 Task 3
DCASE_2022_Task_5
System that ranked 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection
rlr-audio-propagation
Audio propagation engine - Meta Reality Labs Research.