dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Comparison with GroundedDINO

Yebulabula opened this issue

Dear author,

Thank you for your significant contribution to the community; the performance of your model is truly impressive. Have you considered replacing the text-guided SAM with GroundedSAM (i.e., GroundedDINO + SAM)? Specifically, I mean fine-tuning GroundedDINO's image-text feature fusion module instead of SAM's mask decoder. I understand that GroundedDINO may increase the computational cost, but I am interested in whether this two-stage promptable segmentation pipeline could improve the segmentation results.

Best wishes,
Ye