luyug / Condenser

EMNLP 2021 - Pre-training architectures for dense retrieval

CLS of which layers to use in Condenser? last layer CLS? sum of last four layers CLS?

mahdiabdollahpour opened this issue · comments

Hi,
Thanks for the nice repo. After pre-training, Condenser has the same architecture as BERT (the Condenser heads are removed). Which layer's CLS works best for neural IR: the last layer's CLS, the sum of the last four layers' CLS, ...?

We fine-tune the last backbone layer's CLS, which is the one passed to the head during pre-training.
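A minimal sketch of the two pooling choices discussed here, using a toy stack of per-layer hidden states in place of a real BERT encoder (shapes and names are illustrative, not from the Condenser code):

```python
import torch

# Stand-in for a 12-layer encoder's outputs: one (batch, seq_len, hidden)
# tensor per layer, plus the embedding layer at index 0.
batch, seq_len, hidden, n_layers = 2, 8, 16, 12
hidden_states = tuple(torch.randn(batch, seq_len, hidden) for _ in range(n_layers + 1))

# What Condenser fine-tunes: the last backbone layer's [CLS] token
# (sequence position 0), the same vector fed to the head in pre-training.
cls_last = hidden_states[-1][:, 0]  # shape: (batch, hidden)

# For contrast, the "sum of the last four layers' CLS" pooling the
# question asks about (not what Condenser uses):
cls_sum4 = sum(hs[:, 0] for hs in hidden_states[-4:])  # shape: (batch, hidden)
```

With a HuggingFace-style model, `hidden_states[-1][:, 0]` corresponds to taking position 0 of `last_hidden_state`.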

Closing for now. Feel free to re-open if you have new questions.