[CVPR 23] Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!