princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

How to train your model to better fit SentEval

ZBWpro opened this issue · comments

commented

Hi~

SentEval requires users to implement a function called "batcher(params, batch)".

For STS tasks, the `batch` argument of the batcher function contains only one sentence of each sentence pair.

This may cause a conflict, since your model typically takes the entire sentence pair as input.

If you convert the sentences of a pair to embeddings one by one, you need to call forward twice, which is poorly supported by DDP.
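For context, a minimal sketch of the SentEval batcher interface described above; `make_batcher`, `encode_fn`, and `toy_encode` are illustrative names (assumptions), standing in for a real sentence encoder:

```python
import numpy as np

def make_batcher(encode_fn):
    # SentEval calls batcher(params, batch), where `batch` is a list of
    # tokenized sentences (lists of words). For STS tasks, the two sides
    # of a sentence pair arrive in *separate* calls, which is the issue
    # raised here for pair-input models.
    def batcher(params, batch):
        sentences = [" ".join(tokens) for tokens in batch]
        embeddings = encode_fn(sentences)  # expected shape: (len(batch), dim)
        return np.asarray(embeddings)
    return batcher

# Toy stand-in encoder (assumption, not a real model): maps each
# sentence to [char count, space count].
def toy_encode(sentences):
    return np.array([[len(s), s.count(" ")] for s in sentences], dtype=float)

batcher = make_batcher(toy_encode)
embs = batcher(params=None, batch=[["hello", "world"], ["hi"]])
# embs has one embedding row per sentence in the batch.
```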


Hi,

Sorry about the late reply. I'm not sure I totally follow. SentEval is only used for evaluation, so it shouldn't affect DDP.

commented

Thanks for your reply. I found that I could solve this problem by introducing an extra encoding function into the model class and calling it several times in forward.
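The refactor described above might look like the following sketch, assuming a PyTorch module; the class name `SimCSEModel` and method name `encode` are illustrative, not taken from the repo:

```python
import torch
import torch.nn as nn

class SimCSEModel(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder

    def encode(self, x):
        # Shared encoding helper: reusable inside forward during training,
        # and callable directly from a SentEval batcher during evaluation.
        return self.encoder(x)

    def forward(self, sent_a, sent_b):
        # Both sides of the pair are encoded within a single forward call,
        # so DDP's gradient hooks fire once per step as expected.
        z_a = self.encode(sent_a)
        z_b = self.encode(sent_b)
        return z_a, z_b
```

Because `forward` is invoked only once per training step, this avoids the double-forward pattern that DDP handles poorly.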