DirtyHarryLYL / LLM-in-Vision

Recent LLM-based CV and related works. Welcome to comment/contribute!

Please add these papers

Johnx69 opened this issue

  1. AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
  2. VITRON: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
  3. Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
  4. GLaMM: Pixel Grounding Large Multimodal Model
  5. Planting a SEED of Vision in Large Language Model

Thanks. Done.