Beast code in Giters

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

Language: Python | BSD-3-Clause | 166200

SceneGraphGenZeroShotWithGSAM

Zero-shot scene graph generation

Language: Jupyter Notebook | 1700

torch-LLM4SGG

Official PyTorch implementation of LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation, accepted at CVPR 2024

Language: Python | 7700

docker-prompt-generator

Using a model to generate prompts for model applications. A shortcut tool that uses a model to generate image-generation prompts, with support for MidJourney, Stable Diffusion, and others.

Language: Python | MIT | 116000

docker-stable-diffusion-xl-turbo

Stable Diffusion XL Turbo: real-time text-to-image and image-to-image generation

Language: Python | Apache-2.0 | 1200

instruct-pix2pix

Language: Python | NOASSERTION | 626800

MVDream

Multi-view Diffusion for 3D Generation

Language: Python | MIT | 77200

Monkey

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Language: Python | MIT | 177700

llmblueprint

[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"

Language: Jupyter Notebook | 6500

T2I-Adapter

Language: Python | 340900

GraphDreamer

[CVPR'24] GraphDreamer: a novel framework for generating compositional 3D scenes from scene graphs.

Language: Python | MIT | 16000

PIA

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with a text prompt, combined with Dreambooth, to produce stunning videos.

Language: Python | Apache-2.0 | 88200

Mega

Code for ACM MM 2021 Paper "Multimodal Relation Extraction with Efficient Graph Alignment".

Language: Python | MIT | 8800

VISTA

VISTA: VIsual-Textual Knowledge Graph Representation Learning (Findings of EMNLP 2023)

Language: Python | 1900

MNRE

Resource and Code for ICME 2021 paper "MNRE: A Challenge Multimodal Dataset for Neural Relation Extraction with Visual Evidence in Social Media Posts"

4800

DSG

Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)

Language: Jupyter Notebook | 7400

math

13300