Hay Kim's repositories
flux
Official inference repo for FLUX.1 models
MimicMotion
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
AnyV2V
A Plug-and-Play Framework For Any Video-to-Video Editing Tasks
champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
ControlNeXt
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
GOT-OCR2.0-
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Cinemo
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
FollowYourEmoji
[SIGGRAPH Asia 2024] Official implementation of "Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation"
MOFA-Video
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Monkey
[CVPR 2024] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
manga-image-translator
Translate manga/images (one-click translation of text in all kinds of images) https://cotrans.touhou.ai/
EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
VEnhancer
Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation
Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
UniAnimate
Code for the paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation"
mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
UniPortrait
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalizations
Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image (uncensored)
LivePortrait
Make one portrait alive!
ComfyUI-RefUNet
A set of nodes to use Reference UNets
SimpleTuner
A general fine-tuning kit geared toward Stable Diffusion 2.1, Stable Diffusion 3, DeepFloyd, and SDXL.
MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level MLLM on Your Phone
OpenDiT
OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference
segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
MagicClothing
Official implementation of Magic Clothing: Controllable Garment-Driven Image Synthesis
HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
MotionBooth
The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"