KaiyangZhou / CoOp

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)


zero-shot or fine-tune?

jingzhengli opened this issue

  1. To my knowledge, CLIP can be directly applied to zero-shot learning (i.e., unseen/novel classes). CoOp and CoCoOp don't appear to be zero-shot learning, but instead require fine-tuning. However, I don't see the details of how the fine-tuning is done in the paper. Am I misunderstanding it? In the meantime, I would also like to know how CLIP itself is fine-tuned (see the sketch after this list).
  2. I cannot understand Figure 1 in the paper: why can the performance of CoOp and CoCoOp be compared with that of zero-shot learning?
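To make question 1 concrete, here is a minimal sketch of my understanding, not the repository's exact code. It assumes OpenAI's `clip` package; the class names, `n_ctx`, and the learning rate are placeholder values.

```python
import torch
import clip  # OpenAI's CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# --- Zero-shot CLIP: hand-crafted text prompts, no training step at all ---
classnames = ["cat", "dog"]  # placeholder class names
texts = clip.tokenize([f"a photo of a {c}." for c in classnames]).to(device)
with torch.no_grad():
    text_features = model.encode_text(texts)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
# Classify an image by cosine similarity: logits = image_features @ text_features.T

# --- CoOp-style prompt tuning: the hand-crafted words ("a photo of a") are
# replaced by learnable context vectors; only these vectors receive gradients,
# while the entire CLIP backbone stays frozen ---
n_ctx = 16                                # number of context tokens
ctx_dim = model.ln_final.weight.shape[0]  # text-embedding width (512 for ViT-B/32)
ctx = torch.nn.Parameter(
    torch.empty(n_ctx, ctx_dim, dtype=model.dtype, device=device)
)
torch.nn.init.normal_(ctx, std=0.02)

# Embed "X X ... X <classname>." and splice the learnable ctx into the middle
prompt_prefix = " ".join(["X"] * n_ctx)
tokenized = torch.cat(
    [clip.tokenize(f"{prompt_prefix} {c}.") for c in classnames]
).to(device)
with torch.no_grad():
    embedding = model.token_embedding(tokenized)  # (n_cls, 77, ctx_dim)
prefix = embedding[:, :1, :]             # SOS token
suffix = embedding[:, 1 + n_ctx:, :]     # class token(s) + EOS
prompts = torch.cat(
    [prefix, ctx.unsqueeze(0).expand(len(classnames), -1, -1), suffix], dim=1
)

# "Fine-tuning" is then a standard few-shot loop: run `prompts` through the
# frozen text encoder (this needs a small custom forward over model.transformer,
# as in the repo), compute cross-entropy against image features, and update ctx:
optimizer = torch.optim.SGD([ctx], lr=0.002)
```

As I read it, this is also why the Figure 1 comparison makes sense: only `ctx` is learned while CLIP stays frozen, so the tuned prompts can still be evaluated on novel classes alongside zero-shot CLIP.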

Thanks for the great work.
I understood.