yearnyeen ho's starred repositories

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 23089 | Issues: 251 | Issues: 273

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 22356 | Issues: 182 | Issues: 176

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | Stargazers: 10596 | Issues: 74 | Issues: 12

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python | License: BSD-3-Clause | Stargazers: 3469 | Issues: 40 | Issues: 336

big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1942 | Issues: 35 | Issues: 46

benchmark_VAE

Unifying Variational Autoencoder (VAE) implementations in PyTorch (NeurIPS 2022)

Language: Python | License: Apache-2.0 | Stargazers: 1709 | Issues: 18 | Issues: 59

score_sde_pytorch

PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1567 | Issues: 17 | Issues: 55

visualization-curriculum

A data visualization curriculum of interactive notebooks.

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 1269 | Issues: 54 | Issues: 13

InstaFlow

:zap: InstaFlow! One-Step Stable Diffusion with Rectified Flow (ICLR 2024)

Language: Python | License: MIT | Stargazers: 1042 | Issues: 43 | Issues: 26

SparsePrimingRepresentations

Public repo to document some SPR stuff

License: MIT | Stargazers: 693 | Issues: 27 | Issues: 0

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | License: MIT | Stargazers: 380 | Issues: 19 | Issues: 16

SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Language: Python | License: Apache-2.0 | Stargazers: 346 | Issues: 17 | Issues: 7

images-that-sound

Official repo for Images that sound: a special spectrogram that can be seen as images and played as sound generated by diffusions

Language: Python | License: MIT | Stargazers: 190 | Issues: 0 | Issues: 0

CV-VAE

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Language: Jupyter Notebook | Stargazers: 153 | Issues: 0 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 130 | Issues: 3 | Issues: 1

soundctm

PyTorch implementation of SoundCTM

Language: Python | License: MIT | Stargazers: 66 | Issues: 2 | Issues: 0

MIDI-LLM-tokenizer

Tools for converting .mid files into text for training large language models

Language: Python | License: MIT | Stargazers: 64 | Issues: 3 | Issues: 2

When-in-Rome

meta-corpus of and code library for the functional harmonic analysis of music

m2d

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 53 | Issues: 3 | Issues: 6

Stargazers: 47 | Issues: 0 | Issues: 0

micro-musicgen

a new family of super small music generation models focusing on experimental music and latent space exploration capabilities

Language: Python | License: MIT | Stargazers: 27 | Issues: 0 | Issues: 0

Language: Python | Stargazers: 25 | Issues: 0 | Issues: 0

musical-word-embedding

Musical Word Embedding for Music Tagging and Retrieval [IEEE TASLP]

Language: Jupyter Notebook | Stargazers: 20 | Issues: 0 | Issues: 0

ARCH

ARCH: Audio Representations benCHmark

Language: Python | License: NOASSERTION | Stargazers: 19 | Issues: 2 | Issues: 0

Synchformer

Efficient synchronization from sparse cues

Language: Python | License: MIT | Stargazers: 17 | Issues: 2 | Issues: 0

ss-mpe

Code for the paper "Toward Fully Self-Supervised Multi-Pitch Estimation".

Language: Python | License: MIT | Stargazers: 11 | Issues: 0 | Issues: 0

efficient-speech-codec

A lightweight efficient audio codec in 30MB with 30~170x compression ratio. Supports 16kHz mono speech audio.

Language: Python | License: MIT | Stargazers: 7 | Issues: 5 | Issues: 2

Language: Jupyter Notebook | Stargazers: 5 | Issues: 0 | Issues: 0