Mark Ding (Mark12Ding)

Company: The Chinese University of Hong Kong

Home Page: https://mark12ding.github.io/

Twitter: @ShuangruiDing

Mark Ding's starred repositories

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7910 | Issues: 0

PointLLM

[ECCV 2024] PointLLM: Empowering Large Language Models to Understand Point Clouds

Language: Python | Stargazers: 461 | Issues: 0

EDGE

Official PyTorch Implementation of EDGE (CVPR 2023)

Language: Python | License: MIT | Stargazers: 422 | Issues: 0

Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)

Language: Python | License: Apache-2.0 | Stargazers: 3002 | Issues: 0

PySceneDetect

Python and OpenCV-based scene cut/transition detection program and library.

Language: Python | License: BSD-3-Clause | Stargazers: 3037 | Issues: 0

mt-bench-101

[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues

License: Apache-2.0 | Stargazers: 29 | Issues: 0

POPDG

[CVPR 2024] POPDG: Popular 3D Dance Generation with PopDanceSet

Language: Python | License: MIT | Stargazers: 26 | Issues: 0

MotionLCM

[ECCV 2024] MotionLCM: Official implementation of "MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model"

Language: Python | License: NOASSERTION | Stargazers: 194 | Issues: 0

Make-An-Audio-3

Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers

Language: Python | Stargazers: 58 | Issues: 0

Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

Language: Python | License: MIT | Stargazers: 727 | Issues: 0

awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

License: MIT | Stargazers: 266 | Issues: 0

Melodist

Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment

Stargazers: 1 | Issues: 0

ICLR2024-FTIC

[ICLR2024] FTIC: Frequency-aware Transformer for Learned Image Compression

Language: Python | Stargazers: 28 | Issues: 0

ECCV2024-AdpatICMH

[ECCV2024] Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation

Stargazers: 16 | Issues: 0

GTA-Seg

Code for GTA-Seg (NeurIPS2022)

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 0

d3fields

[arXiv] D^3Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Robotic Manipulation

Language: Python | License: MIT | Stargazers: 103 | Issues: 0

nxtp

Object Recognition as Next Token Prediction (CVPR 2024)

Language: Python | License: NOASSERTION | Stargazers: 146 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 483 | Issues: 0

Stargazers: 22 | Issues: 0

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python | License: Apache-2.0 | Stargazers: 1909 | Issues: 0

FedScale

FedScale is a scalable and extensible open-source federated learning (FL) platform.

Language: Python | License: Apache-2.0 | Stargazers: 383 | Issues: 0

SOFT

[NeurIPS 2021 Spotlight] & [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity

Language: Python | License: MIT | Stargazers: 300 | Issues: 0

madmom

Python audio and music signal processing library

Language: Python | License: NOASSERTION | Stargazers: 1287 | Issues: 0

harmonixset

The Harmonix Set: Beats, Downbeats, and Structural Annotations for Pop Music

Language: Jupyter Notebook | License: MIT | Stargazers: 143 | Issues: 0

all-in-one

All-In-One Music Structure Analyzer

Language: Python | License: MIT | Stargazers: 386 | Issues: 0

Long-CLIP

[ECCV 2024] Official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Language: Python | License: Apache-2.0 | Stargazers: 509 | Issues: 0

hierarchical-structure-analysis

Algorithm and data for the paper "Automatic Detection of Hierarchical Structure and Influence of Structure on Melody, Harmony and Rhythm in Popular Music"

Language: Python | License: MIT | Stargazers: 86 | Issues: 0

GeoWizard

[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Language: Python | Stargazers: 658 | Issues: 0

ACE_phonemes

A guide to grapheme-to-phoneme conversion and the phoneme list for the ACE singing voice synthesis engine

Language: Python | License: MIT | Stargazers: 30 | Issues: 0

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | Stargazers: 1276 | Issues: 0