wangyuxinwhy / uniem

unified embedding model

Does M3E support multilingual text?

chaochaoSZ opened this issue · comments

🚀 The feature

Does M3E support multilingual text?

Sorry, M3E supports only Chinese and English.

Thanks for your response. If we want to continue fine-tuning M3E-base for bilingual Chinese-English use, how should we proceed? Can you please give some advice?

You can fine-tune M3E like this, or fine-tune it just like a normal sentence-transformers model.

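from datasets import load_dataset
from uniem.finetuner import FineTuner

# Load the STS-B subset of the nli_zh sentence-pair dataset
dataset = load_dataset('shibing624/nli_zh', 'STS-B')
# Wrap the pretrained M3E base model together with the training data
finetuner = FineTuner.from_pretrained('moka-ai/m3e-base', dataset=dataset)
# Fine-tune for three epochs
finetuner.run(epochs=3)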

You can also refer to this tutorial.
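
For the second route mentioned above (treating M3E as a normal sentence-transformers model), a minimal sketch might look like the following. It is not taken from the thread: the helper names `to_scored_pairs` and `finetune_with_sentence_transformers` are hypothetical, the batch size and warmup steps are arbitrary, and it assumes each row has `sentence1`, `sentence2`, and a `label` on a 0 to 5 similarity scale, which is rescaled to the 0 to 1 range before being used with a cosine similarity loss.

```python
from typing import Iterable


def to_scored_pairs(rows: Iterable[dict], max_score: float = 5.0) -> list[tuple[str, str, float]]:
    """Convert rows into (text_a, text_b, score) tuples with score clipped to [0, 1].

    Assumption: rows carry 'sentence1', 'sentence2', and a numeric 'label'
    on a 0..max_score scale.
    """
    pairs = []
    for row in rows:
        score = float(row["label"]) / max_score
        pairs.append((row["sentence1"], row["sentence2"], min(max(score, 0.0), 1.0)))
    return pairs


def finetune_with_sentence_transformers(rows, model_name="moka-ai/m3e-base", epochs=3):
    """Hypothetical helper: fine-tune with the classic sentence-transformers fit() loop."""
    # Imported lazily so to_scored_pairs can be used without the heavy dependencies.
    from sentence_transformers import InputExample, SentenceTransformer, losses
    from torch.utils.data import DataLoader

    model = SentenceTransformer(model_name)
    examples = [InputExample(texts=[a, b], label=s) for a, b, s in to_scored_pairs(rows)]
    loader = DataLoader(examples, shuffle=True, batch_size=32)  # batch size is arbitrary
    loss = losses.CosineSimilarityLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=epochs, warmup_steps=100)
    return model
```

To use it, pass an iterable of rows, for example one split of the dataset loaded above, to `finetune_with_sentence_transformers`, then save the returned model with `model.save(...)`. For bilingual Chinese-English use, the training rows would presumably need to include pairs in both languages.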