yearnyeen ho's starred repositories

unsupervised_compositional_representations

ISMIR 24 Supplementary Material

Stargazers: 9 | Issues: 0

music2latent

Encode and decode audio samples to/from compressed latent representations!

Language: Python | License: NOASSERTION | Stargazers: 95 | Issues: 0

thegluenote

TheGlueNote is a representation model for note-wise music alignment.

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 0

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook | License: MIT | Stargazers: 11354 | Issues: 0

muchomusic

MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.

Language: Jupyter Notebook | License: MIT | Stargazers: 14 | Issues: 0

beat_this

Accurate and general beat tracker

Language: Python | License: MIT | Stargazers: 36 | Issues: 0

ChordSync

Code for ChordSync, a conformer-based audio-to-chord synchroniser

Language: Jupyter Notebook | License: MIT | Stargazers: 5 | Issues: 0

mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Language: Python | License: MIT | Stargazers: 601 | Issues: 0

jamendolyrics

Jamendo music dataset with time-aligned lyrics for lyrics alignment evaluation

Language: Python | License: NOASSERTION | Stargazers: 72 | Issues: 0

LyricWhiz

[ISMIR 2023] LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Language: Python | License: MIT | Stargazers: 37 | Issues: 0

mira

The MiRA (Music Replication Assessment) tool is a model-independent, open evaluation method that uses four diverse audio music similarity metrics to assess exact replication of training-set data.

Language: Python | License: AGPL-3.0 | Stargazers: 20 | Issues: 0

contriever

Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning

Language: Python | License: NOASSERTION | Stargazers: 649 | Issues: 0

rectified-flow-pytorch

Implementation of rectified flow and some of its follow-up research and improvements in PyTorch

Language: Python | License: MIT | Stargazers: 119 | Issues: 0

ttt-lm-pytorch

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Language: Python | License: MIT | Stargazers: 935 | Issues: 0

Language: Python | Stargazers: 22 | Issues: 0

GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Language: Python | License: Apache-2.0 | Stargazers: 59 | Issues: 0

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

Language: Python | License: NOASSERTION | Stargazers: 115 | Issues: 0

notebooks

Notebooks using the Hugging Face libraries 🤗

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3524 | Issues: 0

1d-tokenizer

Code for the paper "An Image is Worth 32 Tokens for Reconstruction and Generation"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 358 | Issues: 0

PianoMotion10M

Code release for PianoMotion10M

Language: Python | License: Apache-2.0 | Stargazers: 48 | Issues: 0

LLM101n

LLM101n: Let's build a Storyteller

Stargazers: 27172 | Issues: 0

CompA

Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models

Language: Python | Stargazers: 11 | Issues: 0

FreeV

[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

Language: Python | License: MIT | Stargazers: 67 | Issues: 0

LLM-Codec

The open source code for LLM-Codec

Language: Python | Stargazers: 104 | Issues: 0

soundctm

PyTorch implementation of SoundCTM

Language: Python | License: MIT | Stargazers: 69 | Issues: 0

m2d

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 61 | Issues: 0

SparsePrimingRepresentations

Public repo to document some SPR stuff

License: MIT | Stargazers: 713 | Issues: 0

big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2139 | Issues: 0

CV-VAE

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Language: Jupyter Notebook | Stargazers: 196 | Issues: 0

Synchformer

Efficient synchronization from sparse cues

Language: Python | License: MIT | Stargazers: 22 | Issues: 0