WANG XIN's starred repositories
Multimodal-AND-Large-Language-Models
A paper list on multimodal and large language models, maintained only to record papers I read from the daily arXiv feed for personal reference.
awesome-text-to-image-studies
A collection of awesome text-to-image generation studies.
awesome-video-generation
A collection of awesome video generation studies.
Awesome_Multimodel_LLM
Awesome_Multimodel is a curated GitHub repository providing a comprehensive collection of resources for Multimodal Large Language Models (MLLMs). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundation models, and more. Stay updated with the latest advancements.
Efficient-Multimodal-LLMs-Survey
Efficient Multimodal Large Language Models: A Survey
brain-inspired-replay
A brain-inspired version of generative replay for continual learning with deep neural networks (e.g., class-incremental learning on CIFAR-100; PyTorch code).
Efficient_Foundation_Model_Survey
Survey Paper List - Efficient LLM and Foundation Models
Awesome-Object-Pose-Estimation
Project Page for Paper "Deep Learning-Based Object Pose Estimation: A Comprehensive Survey"
Count-Anything
This method uses Segment Anything and CLIP to ground and count any object matching a custom text prompt, without requiring point or box annotations.
ShareGPT4V
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
KingMV.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes