2132660698's repositories
autogen
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
Chinese-LLaVA
An open-source, commercially usable multimodal model supporting bilingual (Chinese-English) vision-text dialogue.
CLIP-VG
CLIP for Visual Grounding
CogVLM
A state-of-the-art open visual language model.
COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
EasyMocap
Make human motion capture easier.
Eureka
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models"
freemocap
Free Motion Capture for Everyone 💀✨
GPT-4V-Act
AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
Hallucination-Correction-for-MLLMs
✨✨The first work to correct hallucination in multimodal large language models.
human-motion-capture
collect papers about human motion capture
Informer2020
The GitHub repository for the paper "Informer" accepted by AAAI 2021.
Lion
Lion: Kindling Vision Intelligence within Large Language Models
LRV-Instruction
Aligning Large Multi-Modal Model with Robust Instruction Tuning
MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
MoE_demo
A Mixture-of-Experts (MoE) demo implemented in PyTorch.
Pink
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
PVIT
Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
SEEChat
A multimodal chatbot with integrated computer vision capabilities.
t2motion
Official implementation of Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions. (ICCV2023)