princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

 Why add two sentences in prepare_features?

FinalFlowers opened this issue

https://github.com/princeton-nlp/SimCSE/blob/main/train.py#L419C13-L419C13
If sent0 and sent1 are added together, won't max_length=32 cause more truncation of the combined sentences?

Hi, here examples[sent0_cname] and examples[sent1_cname] are both lists, so the + concatenates two lists, not two strings. Each sentence stays a separate entry and is truncated to max_length on its own.
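To make the difference concrete, here is a minimal sketch rather than the actual train.py code. It assumes a batch in the column-oriented layout that a batched map function receives, and it uses a hypothetical toy_tokenize helper (whitespace split plus per-sentence truncation) as a stand-in for the real tokenizer. The column names and the max_length of 4 are made up for illustration.

```python
# Hypothetical batch: each column is a list of strings, one entry per example.
examples = {
    "sent0": ["a b c d e", "f g"],
    "sent1": ["h i j", "k l m n o p"],
}
sent0_cname, sent1_cname = "sent0", "sent1"
max_length = 4  # illustrative value, not the script's default


def toy_tokenize(sentences, max_length):
    """Stand-in tokenizer: split on whitespace, truncate each sentence separately."""
    return [s.split()[:max_length] for s in sentences]


total = len(examples[sent0_cname])

# List + list gives a longer list of sentences, not longer sentences.
sentences = examples[sent0_cname] + examples[sent1_cname]
assert len(sentences) == 2 * total

# Every entry is truncated independently to max_length tokens.
tokens = toy_tokenize(sentences, max_length)

# Regroup so that example i holds its own sent0 and sent1 again.
pairs = [[tokens[i], tokens[i + total]] for i in range(total)]

for pair in pairs:
    print(pair)
```

Because the two columns are stacked end to end, entry i and entry i + total belong to the same example, which is why the regrouping step indexes with the total offset. The truncation budget applies per sentence, so pairing does not shorten either side.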

Got it, thank you!