How to obtain cls_emb?

Question

haozhi1817 opened this issue 6 months ago · comments

cls_emb is a learnable parameter, and Q, K, and V are computed by nn.Linear in the Transformer block. This means part of Q (or K or V) is derived from two learnable parameters: cls_emb itself and W, the weight of nn.Linear. I think it is not easy to learn a reasonable cls_emb and a reasonable nn.Linear.weight through gradient descent at the same time.

rstrudel · Answer 1 · Fri Dec 01 2023 00:15:03 GMT+0800 (China Standard Time)

The use of a CLS token is very standard. It's a way to aggregate and pool global information into a single token. You can check BERT and the vision transformer paper for instance :)
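To make this concrete, here is a minimal sketch of a learnable CLS token that is prepended to the input and then projected by nn.Linear. It is not the code from this repository: the class name TinyClsPooler, the dimensions, and the random data are all made up for illustration. It only shows that a single backward pass gives gradients to both cls_emb and the projection weight, so the two are trained jointly like any other stacked parameters.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyClsPooler(nn.Module):
    """Hypothetical toy model: one self-attention layer plus a learnable CLS token."""

    def __init__(self, dim=16, n_classes=2):
        super().__init__()
        # Learnable token, shared across the batch.
        self.cls_emb = nn.Parameter(torch.randn(1, 1, dim) * 0.02)
        # One linear layer produces Q, K and V for every token, CLS included.
        self.qkv = nn.Linear(dim, dim * 3)
        self.head = nn.Linear(dim, n_classes)
        self.scale = dim ** -0.5

    def forward(self, x):  # x: (batch, tokens, dim)
        cls = self.cls_emb.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)  # prepend the CLS token
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        x = attn @ v
        return self.head(x[:, 0])  # read the pooled output at the CLS position


model = TinyClsPooler()
patches = torch.randn(4, 10, 16)      # random stand-in for patch embeddings
labels = torch.tensor([0, 1, 0, 1])   # random stand-in for targets

logits = model(patches)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()

# Both parameters receive a gradient from the same loss.
print(model.cls_emb.grad.shape)     # gradient for the token itself
print(model.qkv.weight.grad.shape)  # gradient for the projection weight
```

The product of cls_emb and W is no different from two consecutive layers in any network: the optimizer updates both from the same loss.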

hqhz1817 · Answer 2 · Fri Dec 01 2023 09:34:23 GMT+0800 (China Standard Time)

Thank you for your response. When I was reading the code of DETR and MaskFormer, I noticed that they set tgt (which is the same as cls_emb) as the query. I thought that Q was equal to tgt, rather than Q being equal to a learnable weight multiplied by tgt. However, I now realize that I was mistaken. Once again, thank you.
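For anyone with the same confusion, a small sketch may help. It assumes the decoder uses PyTorch's torch.nn.MultiheadAttention, and the tensor names tgt and memory and all sizes are made up for illustration. It checks that the module applies a learned input projection to the query before attention, so Q is a projection of tgt and not tgt itself.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

E = 8  # embedding size, chosen arbitrarily for this sketch
mha = nn.MultiheadAttention(embed_dim=E, num_heads=1, batch_first=True)

tgt = torch.randn(1, 3, E)     # stand-in for learnable queries
memory = torch.randn(1, 5, E)  # stand-in for encoder output

out, _ = mha(tgt, memory, memory)

# Reproduce the computation by hand using the module's own weights.
w_q, w_k, w_v = mha.in_proj_weight.chunk(3, dim=0)
b_q, b_k, b_v = mha.in_proj_bias.chunk(3, dim=0)

q = tgt @ w_q.T + b_q      # Q is a learned projection of tgt
k = memory @ w_k.T + b_k
v = memory @ w_v.T + b_v

attn = (q @ k.transpose(-2, -1) / E ** 0.5).softmax(dim=-1)
manual = mha.out_proj(attn @ v)

# The hand-written version matches the module output.
print(torch.allclose(out, manual, atol=1e-5))
```

With a single head the manual computation stays short; with several heads the same projections apply, followed by a split of Q, K and V across heads.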