yearnyeen ho's starred repositories
visualization-curriculum
A data visualization curriculum of interactive notebooks.
StreamMultiDiffusion
Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."
awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation
real-time-lyrics-alignment
Codebase for 'A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features For Classical Vocal Performance', ICASSP 2024
timbre-trap
Code for the paper "Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription"
pflow-encodec
Implementation of a TTS model based on the NVIDIA P-Flow TTS paper
audio-representations
JEPAs for audio representation learning
VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
Call-Response
Responding to the Call: Exploring Automatic Music Composition Using a Knowledge-Enhanced Model
music-text-representation-pp
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR++) [ICASSP24]
open-interpreter
A natural language interface for computers
ML-from-scratch-seminar
This repository is part of a "Machine Learning from Scratch" seminar at Harvard Medical School.
ICASSP-2024-BEAFX-using-DDSP
GitHub repository for the paper accepted at ICASSP 2024: Blind estimation of audio effects using an auto-encoder approach and differentiable signal processing
Rank-N-Contrast
[NeurIPS 2023, Spotlight] Rank-N-Contrast: Learning Continuous Representations for Regression
DiffusionRet
[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Hybrid-Net
Real-time audio source separation, plus generation of lyrics, chords, and beat.