Hay Kim's repositories
MiniGemini
Official implementation for Mini-Gemini
facefusion
Next generation face swapper and enhancer
Dough
Dough is an open-source tool for steering AI animations with precision.
MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community will contribute to this project.
VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction"
InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
ConsistI2V
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
BrushNet
The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
img2img-turbo
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
Video-Motion-Customization
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models (CVPR 2024)
MoneyPrinterTurbo
Generate short videos with one click using large models
AnyV2V
A Plug-and-Play Framework For Any Video-to-Video Editing Tasks
DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
interactdiffusion
[CVPR 2024] Official repo for "InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model".
AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Alibaba DAMO Academy.
TableStructureRec
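Collects currently available open-source table recognition models, improves their pre- and post-processing, and converts the models to ONNX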
AnimateDiff
Official implementation of AnimateDiff.
DragNUWA
Image editing
MiniCPM
MiniCPM-2B: An end-side LLM that outperforms Llama2-13B.
OCR-SAM
Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize, and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting
PhotoMaker
PhotoMaker
Uformer
[CVPR 2022] Official implementation of the paper "Uformer: A General U-Shaped Transformer for Image Restoration".
Personalize-SAM
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds
git_test
Git command tests