Getting same results for n_gram=1,2,3 ?
priyamtejaswin opened this issue · comments
Hi folks,
Thanks for releasing the code, and for making the API easy to use.
Changing the n_grams does not seem to change the scores -- I'm wondering if I'm doing something wrong.
I'm using the code provided in the README:
from moverscore_v2 import get_idf_dict, word_mover_score
from collections import defaultdict
idf_dict_hyp = get_idf_dict(translations) # idf_dict_hyp = defaultdict(lambda: 1.)
idf_dict_ref = get_idf_dict(references) # idf_dict_ref = defaultdict(lambda: 1.)
scores = word_mover_score(references, translations, idf_dict_ref, idf_dict_hyp,
                          stop_words=[], n_gram=1, remove_subwords=True)
I get the same scores for n_gram values of 1, 2, and 3.
My dataset is the Gigaword summarization dev set:
- 189K samples
- The "references" are the gold/target summaries
- The "translations" are the model generated summaries
Thanks a lot for your interest. In moverscore_v2.py, n-gram matching and p-means are ignored by design for speed and simplicity. The full version is in moverscore.py, but it takes longer to run.
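
For anyone who wants to confirm this on a small subset before scoring all 189K samples, here is a minimal sketch of a check. The helper names `scores_by_ngram` and `n_gram_has_effect` are hypothetical and not part of MoverScore; the scoring function is passed in as a callable, and the only assumption is that it accepts an `n_gram` keyword argument, as in the README snippet quoted above.

```python
from typing import Callable, Dict, Iterable, List, Sequence


def scores_by_ngram(
    score_fn: Callable[..., Sequence[float]],
    n_values: Iterable[int],
    **fixed_kwargs,
) -> Dict[int, List[float]]:
    """Call score_fn once per n_gram value, keeping all other arguments fixed."""
    return {n: list(score_fn(n_gram=n, **fixed_kwargs)) for n in n_values}


def n_gram_has_effect(results: Dict[int, List[float]], tol: float = 1e-9) -> bool:
    """Return True if any n_gram setting gives scores that differ from the first one."""
    runs = list(results.values())
    if not runs:
        return False
    baseline = runs[0]
    return any(
        len(run) != len(baseline)
        or any(abs(a - b) > tol for a, b in zip(run, baseline))
        for run in runs[1:]
    )
```

With the variables from the snippet in the question, one possible way to use it is to bind the fixed arguments with `functools.partial(word_mover_score, references[:100], translations[:100], idf_dict_ref, idf_dict_hyp, stop_words=[], remove_subwords=True)` and pass the result as `score_fn` with `n_values=[1, 2, 3]`. If `n_gram_has_effect` returns False, the parameter is being ignored, which matches the behaviour described for moverscore_v2.py.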