Jiapeng Wang's starred repositories
LLM-in-Vision
Recent LLM-based computer vision (CV) and related works. Comments and contributions are welcome!
groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
stable-diffusion-webui
Stable Diffusion web UI
diffusiondb
A large-scale text-to-image prompt gallery dataset based on Stable Diffusion
MagicBrush
[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
recognize-anything
Open-source and strong foundation image recognition models.
FindTheChatGPTer
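ChatGPT has exploded in popularity and marks a key step toward AGI. This project aims to collect open-source alternatives to ChatGPT, including text-only large models and multimodal large models, for everyone's convenience.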