zero-shot or fine-tune?
jingzhengli opened this issue
Jingzheng Li commented
- To my knowledge, CLIP can be applied directly to zero-shot learning (i.e., unseen/novel classes). CoOp and CoCoOp don't appear to be zero-shot methods; they seem to require fine-tuning. However, I don't see the details of how the fine-tuning is done in the paper. Am I misunderstanding it? I would also like to know how CLIP is fine-tuned.
- I cannot understand Figure 1 in the paper: why can the performance of CoOp and CoCoOp be compared to zero-shot learning?
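To make the zero-shot setting in the first point concrete, here is a minimal sketch of how I understand CLIP-style zero-shot classification: an image embedding is compared by cosine similarity against one text embedding per class name, and a softmax over the scaled similarities gives class probabilities, with no training on the target classes. The toy vectors, the class names, the helper names `normalize` and `zero_shot_probs`, and the temperature value are all assumptions for illustration; in real CLIP the embeddings would come from the image and text encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def zero_shot_probs(image_emb, class_text_embs, temperature=0.01):
    """Softmax over cosine similarities between one image and each class text embedding.

    `temperature` is an assumed value for illustration, not taken from the thread.
    """
    img = normalize(image_emb)
    logits = []
    for text_emb in class_text_embs:
        txt = normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(cosine / temperature)
    # Subtract the max logit for numerical stability before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy stand-ins for encoder outputs (made up, not real CLIP embeddings).
class_names = ["cat", "dog", "car"]
text_embs = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [0.2, 0.9, 0.1]

probs = zero_shot_probs(image_emb, text_embs)
predicted = class_names[probs.index(max(probs))]
print(predicted, probs)
```

Nothing in this sketch is trained: adding a new class only means adding one more text embedding to the list, which is why I think of it as zero-shot.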
Jingzheng Li commented
Thanks for the great work.
I understand now.