Daniel Y.T. Kim's repositories
Adv-Diffusion
[AAAI-2024] Official code for work "Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model"
AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
cog-sdxl-clip-interrogator
An attempt at a Cog wrapper for an SDXL CLIP Interrogator
CoMat
Official code for CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
DGInStyle
DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control
DragAPart
Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects.
ELLA
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Face2Diffusion
[CVPR 2024] Face2Diffusion for Fast and Editable Face Personalization https://arxiv.org/abs/2403.05094
graph-of-thoughts
Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
HeadStudio
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting.
imagetoprompt-ai
Turn your images into detailed and descriptive text prompts with AI
langchain-kr
A Korean-language tutorial based on the official LangChain Document, Cookbook, and other practical examples. Through this tutorial, you can learn how to use LangChain more easily and effectively.
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
mm-cot
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
NLLB-200-Distilled-350M-en-ko
NLLB-200 distilled 350M for English-to-Korean translation
OneTrainer
OneTrainer is a one-stop solution for all your Stable Diffusion training needs.
photoswap
Official implementation of the NeurIPS 2023 paper "Photoswap: Personalized Subject Swapping in Images"
PVA-CelebAHQ-IDI
Parallel Visual Attention (WACV 2024) and CelebAHQ Identity-Preserving Inpainting dataset repository.
StableDiffusion-CheatSheet
A list of StableDiffusion styles and some notes for offline use. Pure HTML, CSS and a bit of JS.
T-GATE
T-GATE: Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
VAST
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset