princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Dropout / unsupervised training implementation?

Natooz opened this issue

Hello, 👋

First, thank you for the high quality of your code and for the support you provide on issues and PRs.
For a research project, I am fine-tuning models with a contrastive objective, and I intend to use your unsupervised method.

Looking at the SimCSE code, I noticed that in the forward pass all sequences $x$ and $x^+$ are passed through the encoder in the same batch. But it seems to me that doing so would apply the same dropout mask to all of them.
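
For concreteness, here is a minimal sketch of the single-batch setup I mean (the `bert-base-uncased` checkpoint and the [CLS] pooling are just examples I picked for illustration, not necessarily what SimCSE uses):

```python
from transformers import AutoTokenizer, AutoModel

# Example checkpoint; the actual SimCSE training setup may differ.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.train()  # keep dropout active, as during training

sentences = ["A first sentence.", "A second sentence."]
# Each sentence appears twice, so x and x+ share a single batch
# and a single forward pass.
batch = tokenizer(sentences + sentences, padding=True, return_tensors="pt")

out = model(**batch)
z = out.last_hidden_state[:, 0]  # [CLS] embeddings (example pooling)
z1, z2 = z[:len(sentences)], z[len(sentences):]  # embeddings of x and x+
```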
My current workaround is to perform two separate forward passes through the BERT encoder, but I still wonder: did you do the same, or am I missing something?
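
And this is roughly what my two-pass workaround looks like (same example checkpoint and pooling as above). Since the model is in train mode, PyTorch dropout draws a fresh mask on every call, so the two passes use independent masks:

```python
from transformers import AutoTokenizer, AutoModel

# Same example checkpoint as above.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.train()  # dropout active; a new mask is sampled on each forward call

sentences = ["A first sentence.", "A second sentence."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

# Two independent forward passes over the same inputs:
z1 = model(**batch).last_hidden_state[:, 0]  # first dropout draw
z2 = model(**batch).last_hidden_state[:, 0]  # second, independent draw
```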