FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Home Page: https://groma-mllm.github.io/

Repository from GitHub: https://github.com/FoundationVision/Groma
