Mark Ding's starred repositories
segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
PySceneDetect
Python and OpenCV-based scene cut/transition detection program & library.
mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
Make-An-Audio-3
Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers
Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation
ICLR2024-FTIC
[ICLR2024] FTIC: Frequency-aware Transformer for Learned Image Compression
ECCV2024-AdpatICMH
[ECCV2024] Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation
harmonixset
The Harmonix Set: Beats, Downbeats, and Structural Annotations for Pop Music
all-in-one
All-In-One Music Structure Analyzer
hierarchical-structure-analysis
Algorithm and data for the paper "Automatic Detection of Hierarchical Structure and Influence of Structure on Melody, Harmony and Rhythm in Popular Music"
ACE_phonemes
A guide to grapheme-to-phoneme conversion and the phoneme list for the ACE singing voice synthesis engine