gaetan-sony

gaetan-sony's starred repositories

reapy

A pythonic wrapper for REAPER's ReaScript Python API

Language: Python | License: MIT | 10700

LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Language: Python | License: MIT | 68200

Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Language: Python | License: Apache-2.0 | 74500

Pengi

An Audio Language model for Audio Tasks

Language: Python | License: MIT | 28100

audio-flamingo

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Language: Python | License: MIT | 16800

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | 2672200

edm2

Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)

Language: Python | License: NOASSERTION | 47900

faiss

A library for efficient similarity search and clustering of dense vectors.

Language: C++ | License: MIT | 3048100

MetaMIDIDataset

Language: Python | 12000

genmusic_demo_list

a list of demo websites for automatic music generation research

AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Language: Python | License: NOASSERTION | 40200

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | 444900

ipv6-wsl

Language: C# | License: MIT | 3500

NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Language: Cuda | License: NOASSERTION | 34100

pyloudnorm

Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm

Language: Python | License: MIT | 62100

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python | License: MIT | 428300

AITemplate

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Language: Python | License: Apache-2.0 | 453200

VQ-Diffusion

Official implementation of VQ-Diffusion

Language: Python | License: MIT | 87700

music-inpainting-ts

A collection of web interfaces for AI-assisted interactive music creation

Language: TypeScript | License: GPL-3.0 | 11000

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 6751100

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook | License: MIT | 570300

scdl

Soundcloud Music Downloader

Language: Python | License: GPL-2.0 | 329400

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | 965300

PerceptualSimilarity

LPIPS metric. pip install lpips

Language: Python | License: BSD-2-Clause | 359700