gaetan-sony's starred repositories

reapy

A pythonic wrapper for REAPER's ReaScript Python API

Language: Python · License: MIT · Stargazers: 107 · Issues: 0

LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Language: Python · License: MIT · Stargazers: 682 · Issues: 0

Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Language: Python · License: Apache-2.0 · Stargazers: 745 · Issues: 0

Pengi

An Audio Language model for Audio Tasks

Language: Python · License: MIT · Stargazers: 281 · Issues: 0

audio-flamingo

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Language: Python · License: MIT · Stargazers: 168 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 26722 · Issues: 0

edm2

Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)

Language: Python · License: NOASSERTION · Stargazers: 479 · Issues: 0

faiss

A library for efficient similarity search and clustering of dense vectors.

Language: C++ · License: MIT · Stargazers: 30481 · Issues: 0
Language: Python · Stargazers: 120 · Issues: 0

genmusic_demo_list

A list of demo websites for automatic music generation research

Stargazers: 602 · Issues: 0

AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Language: Python · License: NOASSERTION · Stargazers: 402 · Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python · License: MIT · Stargazers: 4449 · Issues: 0
Language: C# · License: MIT · Stargazers: 35 · Issues: 0

NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Language: Cuda · License: NOASSERTION · Stargazers: 341 · Issues: 0

pyloudnorm

Flexible audio loudness meter in Python with an implementation of the ITU-R BS.1770-4 loudness algorithm

Language: Python · License: MIT · Stargazers: 621 · Issues: 0

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python · License: MIT · Stargazers: 4283 · Issues: 0

AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Language: Python · License: Apache-2.0 · Stargazers: 4532 · Issues: 0

VQ-Diffusion

Official implementation of VQ-Diffusion

Language: Python · License: MIT · Stargazers: 877 · Issues: 0

music-inpainting-ts

A collection of web interfaces for AI-assisted interactive music creation

Language: TypeScript · License: GPL-3.0 · Stargazers: 110 · Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python · License: MIT · Stargazers: 67511 · Issues: 0

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook · License: MIT · Stargazers: 5703 · Issues: 0

scdl

SoundCloud Music Downloader

Language: Python · License: GPL-2.0 · Stargazers: 3294 · Issues: 0

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 9653 · Issues: 0

PerceptualSimilarity

LPIPS metric. `pip install lpips`

Language: Python · License: BSD-2-Clause · Stargazers: 3597 · Issues: 0