AIPHES / emnlp19-moverscore

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

Can this lib be used for Chinese?

hueiyuan opened this issue

I want to check something. I looked at the source code and found that it uses DistilBERT with "distilbert-base-uncased".
Can this lib be used for Chinese text? Thanks.

Thanks a lot for your interest! Yes, you can specify a Chinese BERT (e.g., bert-base-chinese) as the model_name. Note that this project is designed for measuring the similarity of monolingual texts. If you are interested in multilingual texts (e.g., the similarity between Chinese and English texts), please refer to our recent project at https://github.com/AIPHES/ACL20-Reference-Free-MT-Evaluation, where we made some modifications to get better results in the multilingual evaluation setting.

@andyweizhao
Understood! But how do I specify a Chinese BERT (e.g., bert-base-chinese) as the model_name with this lib?
I did not see this parameter in the source code. Thanks for the help.

@hueiyuan It is now specified in the readme:

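import os
# Set the model name before importing moverscore_v2, as in the order shown here.
os.environ['MOVERSCORE_MODEL'] = "albert-base-v2"

from moverscore_v2 import get_idf_dict
# translations: your list of system output strings (defined elsewhere).
idf_dict_hyp = get_idf_dict(translations)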

Here, replace "albert-base-v2" with the name of the BERT model you want to use (e.g., bert-base-chinese for Chinese).
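
For the Chinese case asked about in this thread, the same pattern might look roughly like the sketch below. The lookup table and the two helper functions are hypothetical names made up for illustration and are not part of moverscore. The call to word_mover_score follows the usage shown in the project readme, and it assumes moverscore is installed and that bert-base-chinese can be downloaded.

```python
import os

# Hypothetical lookup table (not part of moverscore): language code -> model name.
MODEL_FOR_LANG = {
    "en": "distilbert-base-uncased",  # the default mentioned in this thread
    "zh": "bert-base-chinese",        # the Chinese BERT suggested above
}


def select_moverscore_model(lang):
    """Set MOVERSCORE_MODEL for the given language and return the model name."""
    model_name = MODEL_FOR_LANG[lang]
    os.environ["MOVERSCORE_MODEL"] = model_name
    return model_name


def score_chinese(references, translations):
    """Score Chinese system outputs against references (needs moverscore installed)."""
    select_moverscore_model("zh")
    # Imported inside the function so the environment variable is set first.
    from moverscore_v2 import get_idf_dict, word_mover_score

    idf_dict_ref = get_idf_dict(references)
    idf_dict_hyp = get_idf_dict(translations)
    return word_mover_score(references, translations, idf_dict_ref, idf_dict_hyp,
                            stop_words=[], n_gram=1, remove_subwords=True)
```

One caveat, stated as an assumption based on the ordering in the readme snippet: if moverscore_v2 has already been imported earlier in the same process, changing the environment variable afterwards will probably have no effect, so set it at the very start of the script.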