Beast code in Giters

SpeechTokenizer

This is the code for the SpeechTokenizer presented in the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on

Language: Python | Apache-2.0 | 346 17 7

images-that-sound

Official repo for "Images that Sound": special spectrograms, generated by diffusion models, that can be viewed as images and played as sound

Language: Python | MIT | 190 0 0

CV-VAE

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Language: Jupyter Notebook | 153 0 0

zimtohrli

Language: Jupyter Notebook | Apache-2.0 | 130 3 1

soundctm

PyTorch implementation of SoundCTM

Language: Python | MIT | 66 2 0

MIDI-LLM-tokenizer

Tools for converting .mid files into text for training large language models

Language: Python | MIT | 64 3 2

When-in-Rome

Meta-corpus of, and code library for, the functional harmonic analysis of music

Language: Python | 53 3 54

m2d

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Language: Jupyter Notebook | NOASSERTION | 53 3 6

LLM-RecSys

47 0 0

micro-musicgen

A new family of very small music generation models focused on experimental music and latent-space exploration

Language: Python | MIT | 27 0 0

A-LLMRec

Language: Python | 25 0 0

musical-word-embedding

Musical Word Embedding for Music Tagging and Retrieval [IEEE TASLP]

Language: Jupyter Notebook | 20 0 0

ARCH

ARCH: Audio Representations benCHmark

Language: Python | NOASSERTION | 19 2 0

Synchformer

Efficient synchronization from sparse cues

Language: Python | MIT | 17 2 0

ss-mpe

Code for the paper "Toward Fully Self-Supervised Multi-Pitch Estimation".

Language: Python | MIT | 11 0 0

efficient-speech-codec

A lightweight, efficient audio codec in 30 MB with a 30~170x compression ratio. Supports 16 kHz mono speech audio.

Language: Python | MIT | 7 5 2

TenseMusic

Language: Jupyter Notebook | 5 0 0