luyug / Condenser

EMNLP 2021 - Pre-training architectures for dense retrieval

Have you tried Condenser pretraining on RoBERTa?

1024er opened this issue

I pretrained a condenser-roberta-base on the same data and hyperparameters, but the results on downstream tasks were not strong.

Have you ever tried Condenser pretraining on RoBERTa-base?

Thank you
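For reference, here is a minimal sketch of what "Condenser pretraining on RoBERTa" means architecturally: a small extra Transformer head that re-predicts masked tokens from the late `<s>` vector plus early-layer token states. The head depth, skip layer, and class wiring below are illustrative assumptions, not the repo's actual pretraining code.

```python
import torch
import torch.nn as nn
from transformers import RobertaForMaskedLM

class CondenserStyleRoberta(nn.Module):
    """Sketch of a Condenser-style head on roberta-base (illustrative only)."""

    def __init__(self, model_name="roberta-base", n_head_layers=2, skip_from=6):
        super().__init__()
        self.lm = RobertaForMaskedLM.from_pretrained(model_name)
        cfg = self.lm.config
        # Short head: extra Transformer layers that redo MLM from the
        # late <s> vector plus early-layer token states.
        layer_cls = type(self.lm.roberta.encoder.layer[0])
        self.head = nn.ModuleList([layer_cls(cfg) for _ in range(n_head_layers)])
        self.skip_from = skip_from  # which early layer feeds the head (assumption)

    def forward(self, input_ids, attention_mask, labels):
        out = self.lm.roberta(
            input_ids, attention_mask=attention_mask, output_hidden_states=True
        )
        hidden = out.hidden_states                    # embeddings + every layer
        early_tokens = hidden[self.skip_from][:, 1:]  # early states, minus <s>
        late_cls = hidden[-1][:, :1]                  # late <s> vector
        x = torch.cat([late_cls, early_tokens], dim=1)

        # Standard additive attention mask for the extra layers.
        ext_mask = (1.0 - attention_mask[:, None, None, :].float()) * -1e4
        for layer in self.head:
            x = layer(x, attention_mask=ext_mask)[0]

        logits = self.lm.lm_head(x)                   # reuse RoBERTa's MLM head
        loss = nn.CrossEntropyLoss()(                 # ignores -100 labels by default
            logits.view(-1, logits.size(-1)), labels.view(-1)
        )
        return loss
```

Training this module would then just be standard masked-language-model pretraining, e.g. with a regular MLM data collator feeding `input_ids`, `attention_mask`, and `labels`.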

Not with the same data, no. I have trained with OpenWebText (an open version of WebText, part of RoBERTa's training data) on a RoBERTa-base architecture. Compared with the BERT Condenser, it does better on sentence similarity tasks but not on retrieval tasks. As a side note, we previously observed that vanilla RoBERTa-base is typically inferior to vanilla BERT-base on retrieval tasks.

We have just started test runs with condenser-roberta-large, so there is not much to say there yet.