Daniel Y.T. Kim's repositories

Adv-Diffusion

[AAAI-2024] Official code for work "Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model"

Language: Python | Stargazers: 0 | Issues: 0

AlignProp

AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.

License: MIT | Stargazers: 0 | Issues: 0

cog-sdxl-clip-interrogator

Attempt at a cog wrapper for an SDXL CLIP Interrogator

Stargazers: 0 | Issues: 0

CoMat

Official code for 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Stargazers: 0 | Issues: 0

DGInStyle

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

License: Apache-2.0 | Stargazers: 0 | Issues: 0

DragAPart

Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects.

Stargazers: 0 | Issues: 0

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Face2Diffusion

[CVPR 2024] Face2Diffusion for Fast and Editable Face Personalization https://arxiv.org/abs/2403.05094

License: NOASSERTION | Stargazers: 0 | Issues: 0

graph-of-thoughts

Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"

License: NOASSERTION | Stargazers: 0 | Issues: 0

HeadStudio

HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting.

License: MIT | Stargazers: 0 | Issues: 0

imagetoprompt-ai

Turn your images into detailed and descriptive text prompts with AI

License: MIT | Stargazers: 0 | Issues: 0

langchain-kr

A Korean-language tutorial based on the official LangChain Document, Cookbook, and other practical examples. Through this tutorial, you can learn how to use LangChain more easily and effectively.

Stargazers: 0 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

mm-cot

Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

License: Apache-2.0 | Stargazers: 0 | Issues: 0

NLLB-200-Distilled-350M-en-ko

nllb-200 distilled 350M for English to Korean translation

License: NOASSERTION | Stargazers: 0 | Issues: 0

OneTrainer

OneTrainer is a one-stop solution for all your stable diffusion training needs.

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

photoswap

Official implementation of the NeurIPS 2023 paper "Photoswap: Personalized Subject Swapping in Images"

License: MIT | Stargazers: 0 | Issues: 0

PVA-CelebAHQ-IDI

Parallel Visual Attention (WACV 2024) and CelebAHQ Identity-Preserving Inpainting dataset repository.

Stargazers: 0 | Issues: 0

StableDiffusion-CheatSheet

A list of StableDiffusion styles and some notes for offline use. Pure HTML, CSS and a bit of JS.

License: MIT | Stargazers: 0 | Issues: 0

T-GATE

T-GATE: Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

License: MIT | Stargazers: 0 | Issues: 0

VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

License: MIT | Stargazers: 0 | Issues: 0