uta-smile / TCL

Code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022

How about the intra-modal retrieval performance?

xiaoxin83121 opened this issue · comments

Hi, thanks for sharing your great work, TCL.
I have read the paper and have a question: what about the performance of intra-modal retrieval, e.g. img2img or text2text? It should be better than that reported in ALIGN. However, I could not find any experiments on this. Am I missing something?
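For context, intra-modal retrieval (img2img or text2text) can be evaluated by ranking items within a single modality by the cosine similarity of their encoder embeddings. Below is a minimal sketch of that evaluation step; the function name and toy features are illustrative assumptions, not part of the TCL codebase:

```python
import numpy as np

def intra_modal_retrieval(embeddings, k=5):
    """Rank items within one modality by cosine similarity.

    embeddings: (N, D) array of features from one encoder
                (e.g. all image features, or all text features).
    Returns: (N, k) indices of the top-k nearest neighbors per item,
             with the query itself excluded.
    """
    # L2-normalize so the dot product equals cosine similarity.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    return np.argsort(-sim, axis=1)[:, :k]

# Toy demo: 4 items in a 3-D feature space; items 0/1 and 2/3
# form two tight clusters, so each should retrieve its neighbor.
feats = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.9, 0.1]])
print(intra_modal_retrieval(feats, k=1))  # → [[1] [0] [3] [2]]
```

Reporting Recall@K from such rankings (given ground-truth duplicate or paraphrase pairs) would make the intra-modal comparison with ALIGN concrete.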