sail-sg / volo

VOLO: Vision Outlooker for Visual Recognition

What's the difference between `x_cls` and `x_aux`?

inkzk opened this issue

Congratulations on the SOTA!

`x_cls` and `x_aux` seem like NLP concepts.
How should I understand them when using VOLO as a face recognition network?
Which one is the face representation feature vector?

Hi, `x_cls` is the class token, and `x_aux` contains the output tokens of the other patches (the feature tokens).
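To make the distinction concrete, here is a minimal sketch of how a class token and patch tokens can be separated from a token sequence. It does not call VOLO's code. It assumes the sequence has shape `(batch, 1 + num_patches, dim)` with the class token at index 0, and the sizes, the `split_tokens` helper, and the `l2_normalize` helper are all hypothetical. Taking the L2-normalized class token as the face embedding is one common choice, not something confirmed in this thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
batch, num_patches, dim = 2, 196, 384

# Stand-in for a transformer's output token sequence.
# Assumption: index 0 along axis 1 is the class token, the rest are patch tokens.
tokens = rng.standard_normal((batch, 1 + num_patches, dim))


def split_tokens(tokens):
    """Split a token sequence into the class token and the patch tokens."""
    cls_token = tokens[:, 0]      # shape: (batch, dim)
    patch_tokens = tokens[:, 1:]  # shape: (batch, num_patches, dim)
    return cls_token, patch_tokens


def l2_normalize(x, eps=1e-12):
    """Scale each vector along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


cls_token, patch_tokens = split_tokens(tokens)

# Assumption: one global vector per image, taken from the class token,
# serves as the face embedding for cosine-similarity comparison.
face_embedding = l2_normalize(cls_token)

print(cls_token.shape, patch_tokens.shape, face_embedding.shape)
```

In this picture, `x_cls` corresponds to the per-image output derived from the class token, and `x_aux` corresponds to the per-patch outputs. If you need a single vector per face, the class-token side is the natural candidate, while the patch-token side keeps spatial detail.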