DirtyHarryLYL / LLM-in-Vision

Recent LLM-based CV and related works. Welcome to comment/contribute!

Add these papers please

Johnx69 opened this issue a month ago

Anh Dao commented a month ago

Lumos: Empowering Multimodal LLMs with Scene Text Recognition
Déjà Vu Memorization in Vision-Language Models
Red Teaming Visual Language Models
VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation