What exactly is the CONdition of Condenser?
hmthanh opened this issue · comments
As the title says: I have already read the paper, but I am still confused about the conditioning on the CLS token.
What exactly is the CONdition in Condenser?
If you take a closer look at the design of the head Transformer layers, you will see that the MLM prediction is explicitly conditioned on (that is, it depends on) the CLS vector from the backbone LM. That dependency is the "CONdition".
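
To make the dependency concrete, here is a toy sketch in plain Python, not the actual Condenser code. It assumes, based on my reading of the paper, that the head receives the CLS vector from the late backbone layers together with the token vectors from the early backbone layers. The names `build_head_input` and `toy_head` are hypothetical, and `toy_head` is only a stand-in for the real head Transformer layers (it uses uniform attention instead of learned attention).

```python
def build_head_input(early_hidden, late_hidden):
    """Assemble the sequence fed to the head.

    Position 0 is assumed to be [CLS]. The CLS vector comes from the late
    layers; every other position comes from the early layers.
    """
    return [late_hidden[0]] + early_hidden[1:]


def toy_head(head_input):
    """Stand-in for the head layers: uniform attention over all positions."""
    n = len(head_input)
    dim = len(head_input[0])
    # Shared context vector: the mean over all positions, including CLS.
    context = [sum(vec[d] for vec in head_input) / n for d in range(dim)]
    # Each position's output mixes its own state with the shared context.
    return [[v + c for v, c in zip(vec, context)] for vec in head_input]


# Toy hidden states: 3 positions ([CLS], tok1, tok2), 2 dimensions each.
early = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
late_a = [[1.0, 1.0], [9.0, 9.0], [9.0, 9.0]]
late_b = [[5.0, 5.0], [9.0, 9.0], [9.0, 9.0]]  # only the late CLS differs

out_a = toy_head(build_head_input(early, late_a))
out_b = toy_head(build_head_input(early, late_b))

# The token-level outputs (what an MLM loss would be computed on) change
# when only the late CLS vector changes.
print(out_a[1] != out_b[1])
```

In this sketch the late token vectors never reach the head, so the only route by which late-layer information can help the token-level prediction is through the CLS vector.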