yearnyeen ho's starred repositories

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

open-interpreter

A natural language interface for computers

Language: Python · License: AGPL-3.0 · Stargazers: 51662 · Issues: 384 · Issues: 911

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda · License: MIT · Stargazers: 22764 · Issues: 223 · Issues: 129

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python · License: Apache-2.0 · Stargazers: 7530 · Issues: 109 · Issues: 152

umap

Uniform Manifold Approximation and Projection

Language: Python · License: BSD-3-Clause · Stargazers: 7311 · Issues: 127 · Issues: 785

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python · License: MIT · Stargazers: 4400 · Issues: 58 · Issues: 149

torchinfo

View model summaries in PyTorch!

Language: Python · License: MIT · Stargazers: 2461 · Issues: 18 · Issues: 155

visualization-curriculum

A data visualization curriculum of interactive notebooks.

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 1275 · Issues: 54 · Issues: 13

Hybrid-Net

Real-time audio to chords, lyrics, beat, and melody.

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

StreamMultiDiffusion

Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."

Language: Jupyter Notebook · License: MIT · Stargazers: 516 · Issues: 10 · Issues: 14

edm2

Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)

Language: Python · License: NOASSERTION · Stargazers: 461 · Issues: 12 · Issues: 5

awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

ML-from-scratch-seminar

This repository is part of a "Machine Learning from Scratch" seminar at Harvard Medical School.

Language: Jupyter Notebook · License: MIT · Stargazers: 257 · Issues: 22 · Issues: 0

ect

Consistency Models Made Easy

gflownet

Generative Flow Networks - GFlowNet

Language: Python · License: Apache-2.0 · Stargazers: 152 · Issues: 7 · Issues: 47

DiffusionRet

[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

Language: Python · License: Apache-2.0 · Stargazers: 107 · Issues: 3 · Issues: 9

EnCLAP

Official Implementation of EnCLAP (ICASSP 2024)

Language: Python · License: MIT · Stargazers: 89 · Issues: 6 · Issues: 9

Rank-N-Contrast

[NeurIPS 2023, Spotlight] Rank-N-Contrast: Learning Continuous Representations for Regression

pflow-encodec

Implementation of TTS model based on NVIDIA P-Flow TTS Paper

mini_edm

Minimal implementation of EDM (Elucidating the Design Space of Diffusion-Based Generative Models) on CIFAR-10 and MNIST

timbre-trap

Code for the paper "Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription"

Language: Python · License: MIT · Stargazers: 32 · Issues: 2 · Issues: 0

MWAFM

Multi-Scale Attention for Audio Question Answering

Cacophony

Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986

Language: Python · License: MIT · Stargazers: 24 · Issues: 4 · Issues: 2

music-text-representation-pp

Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR++) [ICASSP24]

real-time-lyrics-alignment

Codebase for 'A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features For Classical Vocal Performance', ICASSP 2024

Language: Python · License: NOASSERTION · Stargazers: 11 · Issues: 1 · Issues: 0

audio-representations

JEPAs for audio representation learning

ICASSP-2024-BEAFX-using-DDSP

GitHub repository for the paper accepted at ICASSP 2024: "Blind estimation of audio effects using an auto-encoder approach and differentiable signal processing"

Language: Jupyter Notebook · Stargazers: 10 · Issues: 1 · Issues: 0

Call-Response

Responding to the Call: Exploring Automatic Music Composition Using a Knowledge-Enhanced Model

Language: Python · Stargazers: 5 · Issues: 1 · Issues: 0