Ma Jiajian's starred repositories

awesome-ai-residency

List of AI Residency Programs

motion-latent-diffusion

[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model

Language: Python | License: MIT | 55500

Latte

Latte: Latent Diffusion Transformer for Video Generation.

Language: Python | License: Apache-2.0 | 155800

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | 186400

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | 580900

Magic-Me

Code for ID-Specific Video Customized Diffusion

Language: Python | License: Apache-2.0 | 44400

SEINE

[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Language: Python | License: Apache-2.0 | 87100

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | 657300

edm

Elucidating the Design Space of Diffusion-Based Generative Models (EDM)

Language: Python | License: NOASSERTION | 123200

FreeInit

[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models

Language: Python | License: MIT | 46200

text2cinemagraph

Text2Cinemagraph: Text-Guided Synthesis of Eulerian Cinemagraphs [SIGGRAPH ASIA 2023]

Language: Python | License: MIT | 35900

Video-Motion-Customization

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models (CVPR 2024)

Language: Python | License: Apache-2.0 | 15400

VGen

Official repo for VGen: a holistic video generation ecosystem built on diffusion models

Language: Python | 281800

PIA

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combining with Dreambooth, achieving stunning videos.

Language: Python | License: Apache-2.0 | 84200

LivePhoto

Official implementations for paper: LivePhoto: Real Image Animation with Text-guided Motion Control

License: MIT | 17000

PhotoMaker

PhotoMaker [CVPR 2024]

Language: Jupyter Notebook | License: NOASSERTION | 905800

AnyDoor

Official implementations for paper: AnyDoor: Zero-shot Object-level Image Customization

Language: Python | License: MIT | 386300

insightface

State-of-the-art 2D and 3D Face Analysis Project

Language: Python | 2223900

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | License: Apache-2.0 | 1065000

facechain

FaceChain is a deep-learning toolchain for generating your Digital-Twin.

Language: Jupyter Notebook | License: Apache-2.0 | 879100

I2V-Adapter-repo

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models

18300

LAMP

Official implementation of LAMP: Learn a Motion Pattern by Few-Shot Tuning a Text-to-Image Diffusion Model (few-shot text-to-video diffusion)

Language: Python | License: NOASSERTION | 24500

MIS-FM

Language: Python | License: Apache-2.0 | 22200

MedSAM

Segment Anything in Medical Images

Language: Jupyter Notebook | License: Apache-2.0 | 252900

WSI-HGNN

[CVPR'23] Histopathology Whole Slide Image Analysis with Heterogeneous Graph Representation Learning

Language: Python | 6300

EndoGS

EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting

Language: Python | 9400

SAM-Med3D

SAM-Med3D: An Efficient General-purpose Promptable Segmentation Model for 3D Volumetric Medical Image

Language: Python | License: Apache-2.0 | 43300

dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image

Language: Python | License: MIT | 34000

prompt-to-prompt

Language: Jupyter Notebook | License: Apache-2.0 | 299800

XMem

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Language: Python | License: MIT | 168400