In the feature-extraction task, which token serves as the overall representation of the sentence: the first token or the last token?
junzai0215 opened this issue
The implementation of DeBERTa
junzai0215 opened this issue