What hidden states should I use?
zhangzhenyu13 opened this issue
The output of the encoder is N*d hidden states for each text input:
- SimCSE uses the first one as the output.
- sentence-transformers uses mean pooling (only over the states where the attention mask is 1).
Is your default the same as mean pooling?
Yes, the default is mean pooling (only over the states where the attention mask is 1).
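
For readers comparing the two options discussed here, the following is a minimal sketch of first-state pooling versus masked mean pooling. It is not this repository's code: the function names `first_state_pooling` and `mean_pooling` and the toy values are made up for illustration, and plain Python lists stand in for the N*d hidden states. In practice the same arithmetic would probably be done with tensor operations.

```python
from typing import List


def first_state_pooling(hidden_states: List[List[float]]) -> List[float]:
    """Return the hidden state of the first position (hypothetical helper)."""
    return list(hidden_states[0])


def mean_pooling(hidden_states: List[List[float]],
                 attention_mask: List[int]) -> List[float]:
    """Average the hidden states whose attention mask is 1 (hypothetical helper)."""
    if len(hidden_states) != len(attention_mask):
        raise ValueError("hidden_states and attention_mask must have the same length")
    # Keep only the positions that are real tokens, not padding.
    kept = [state for state, m in zip(hidden_states, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask has no position set to 1")
    dim = len(hidden_states[0])
    return [sum(state[j] for state in kept) / len(kept) for j in range(dim)]


# Toy input: N = 4 positions, d = 2; the last position is padding (mask 0).
hidden_states = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [100.0, 100.0]]
attention_mask = [1, 1, 1, 0]

first_vec = first_state_pooling(hidden_states)
mean_vec = mean_pooling(hidden_states, attention_mask)
print(first_vec)
print(mean_vec)
```

The padded position is excluded from both the sum and the count, so it does not affect the average.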