qiuqiangkong

qiuqiangkong's starred repositories

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook · License: NOASSERTION · 66869 555 706

pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Language: Python · License: NOASSERTION · 8537 150 1544

speechbrain

A PyTorch-based Speech Toolkit

Language: Python · License: Apache-2.0 · 8304 130 1054

denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Language: Python · License: MIT · 7638 32 284

riffusion

Stable diffusion for real-time music generation

Language: Python · License: MIT · 3311 38 93

habitat-sim

A flexible, high-performance 3D simulator for Embodied AI research.

Language: C++ · License: MIT · 2488 82 757

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python · License: NOASSERTION · 1288 24 143

minDiffusion

Self-contained, minimalistic implementation of diffusion models with Pytorch.

Language: Python · 808 9 7

musika

Fast Infinite Waveform Music Generation

Language: Python · License: MIT · 663 24 39

ontology

The Audio Set Ontology aims to provide a comprehensive set of categories to describe sound events.

audio-dataset

Audio Dataset for training CLAP and other models

Language: Python · 606 21 57

VAE-CVAE-MNIST

Variational Autoencoder and Conditional Variational Autoencoder on MNIST in PyTorch

Language: Python · 569 8 5

AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Language: Python · License: NOASSERTION · 510 34 27

sound-spaces

A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.

Language: Python · License: CC-BY-4.0 · 328 16 138

gqn-datasets

Datasets used to train Generative Query Networks (GQNs) in the ‘Neural Scene Representation and Rendering’ paper.

Language: Python · License: Apache-2.0 · 271 18 16

diffq

DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Language: Python · License: NOASSERTION · 230 11 8

EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

Language: Python · License: MIT · 205 5 27

Spherical-Array-Processing

A collection of MATLAB routines for acoustical array processing on spherical harmonic signals, commonly captured with a spherical microphone array.

Language: MATLAB · License: BSD-3-Clause · 162 12 2

SPTS

Official implementation of SPTS: Single-Point Text Spotting (ACM MM 2022 Oral)

Language: Python · 135 7 15

AudioLoader

PyTorch Dataset for Speech and Music audio

Language: Python · 74 4 4

audio-visual

Language: C · License: MIT · 57 11 10

AudioTaggingDoneRight

experiments about AudioSet

Language: Jupyter Notebook · License: NOASSERTION · 43 60

Neural-Scene-Representation-and-Rendering

Generative Query Network for rendering 3D scenes from 2D images

Language: Python · License: MIT · 43 30

DCASE2022-data-generator

Data generator for creating synthetic audio mixtures suitable for DCASE Challenge 2022 Task 3

Language: Python · License: NOASSERTION · 28 3 7

DCASE_2022_Task_5

System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection

Language: Python · 27 3 1

gqn-dataset-renderer

Language: Python · License: MIT · 27 3 3

DCASE2022-TASK3

Language: Python · 25 10

Hybrid-system-of-frame-wise-model-and-SEDT

Language: Python · License: NOASSERTION · 2200

rlr-audio-propagation

Audio propagation engine - Meta Reality Labs Research.

Language: C++ · License: NOASSERTION · 17 5 4

ERGL

8 10