princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

About model format conversion

fzxxg opened this issue · comments

I didn't reproduce the training with your code; can I still use your evaluation script? I noticed that your model format needs to be converted. If I don't convert it, the evaluation still runs successfully, but what impact does skipping the conversion have? This is the command I ran for the evaluation script:
python evaluation.py \
    --model_name_or_path bert-base-uncased-sts \
    --pooler cls_before_pooler \
    --task_set sts/transfer \
    --mode test

I only saved the BERT parameters in the model.
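For context, here is a minimal sketch of what the two pooling choices behind --pooler mean at inference time; the model name and example sentence are placeholders, not taken from this issue. "cls" reads BERT's pooler_output (the [CLS] state passed through the trained Linear+Tanh pooler head), while "cls_before_pooler" takes the raw [CLS] hidden state directly.

# Minimal pooling sketch; "bert-base-uncased" and the sentence are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

batch = tokenizer(["A man is playing guitar."], return_tensors="pt")
with torch.no_grad():
    outputs = model(**batch)

emb_cls_before_pooler = outputs.last_hidden_state[:, 0]  # raw [CLS] embedding
emb_cls = outputs.pooler_output                          # [CLS] after the pooler head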

I believe the output format is safetensors. There shouldn't be any actual impact.

Hi @fzxxg, if you use our training code and don't convert the checkpoint, it won't perform as well as the paper reports; however, if you are evaluating other models, it should be fine (as long as the inference for those models is strictly the same as how they are supposed to be used).
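For readers hitting the same question, the following is an illustrative sketch of the kind of state-dict conversion involved; it is not the repository's conversion script, and the "bert." prefix and file paths are assumptions for illustration. A checkpoint saved by a training wrapper typically stores the encoder weights under a prefix, and loading it with a plain AutoModel.from_pretrained() can leave mismatched keys randomly initialized (with only a warning), which is one way an unconverted checkpoint can end up scoring below the paper's numbers.

# Illustrative sketch only (not the repository's converter): strip an assumed
# "bert." wrapper prefix so the weights match a plain BertModel's parameter names.
# Paths and the prefix are hypothetical placeholders; check your checkpoint's keys.
import torch

state_dict = torch.load("result/my-run/pytorch_model.bin", map_location="cpu")

prefix = "bert."  # assumed wrapper prefix
converted = {}
for key, value in state_dict.items():
    if key.startswith(prefix):
        converted[key[len(prefix):]] = value  # keep encoder weights, drop prefix
    # other heads (e.g. a contrastive MLP) are intentionally dropped in this sketch

torch.save(converted, "result/my-run/pytorch_model_converted.bin")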

Thank you for your reply. I will close this issue.