get_easse_report_from_exp_dir Failed
pelican9 opened this issue
Hello, I am running train_model.py but get_easse_report_from_exp_dir fails. This is the error:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-15-5dc913ca0d14> in <module>()
----> 1 result = fairseq_train_and_evaluate_with_parametrization(**kwargs)
/content/drive/MyDrive/muss/muss/fairseq/main.py in fairseq_train_and_evaluate_with_parametrization(dataset, **kwargs)
228 kwargs['preprocessor_kwargs'] = recommended_preprocessors_kwargs
229 # Evaluation
--> 230 scores = print_running_time(fairseq_evaluate_and_save)(exp_dir, **kwargs)
231 score = combine_metrics(scores['bleu'], scores['sari'], scores['fkgl'], kwargs.get('metrics_coefs', [0, 1, 0]))
232 # TODO: This is a redundant hack with what happens in fairseq_evaluate_and_save (predict_files and evaluate_kwargs), it should be fixed
/content/drive/MyDrive/muss/muss/utils/helpers.py in wrapped_func(*args, **kwargs)
468 function_name = getattr(func, '__name__', repr(func))
469 with log_action(function_name):
--> 470 return func(*args, **kwargs)
471
472 return wrapped_func
/content/drive/MyDrive/muss/muss/fairseq/main.py in fairseq_evaluate_and_save(exp_dir, **kwargs)
104 print(f'scores={scores}')
105 report_path = exp_dir / 'easse_report.html'
--> 106 shutil.move(get_easse_report_from_exp_dir(exp_dir, **kwargs), report_path)
107 print(f'report_path={report_path}')
108 predict_files = kwargs.get(
/content/drive/MyDrive/muss/muss/fairseq/main.py in get_easse_report_from_exp_dir(exp_dir, **kwargs)
97 def get_easse_report_from_exp_dir(exp_dir, **kwargs):
98 simplifier = fairseq_get_simplifier(exp_dir, **kwargs)
---> 99 return get_easse_report(simplifier, **kwargs.get('evaluate_kwargs', {'test_set': 'asset_valid'}))
100
101
/content/drive/MyDrive/muss/muss/evaluation/general.py in get_easse_report(simplifier, test_set, orig_sents_path, refs_sents_paths)
40 orig_sents_path=orig_sents_path,
41 refs_sents_paths=refs_sents_paths,
---> 42 report_path=report_path,
43 )
44 return report_path
/usr/local/lib/python3.7/dist-packages/easse/cli.py in report(test_set, sys_sents_path, orig_sents_path, refs_sents_paths, report_path, tokenizer, lowercase, metrics)
302 lowercase=lowercase,
303 tokenizer=tokenizer,
--> 304 metrics=metrics,
305 )
306
/usr/local/lib/python3.7/dist-packages/easse/report.py in write_html_report(filepath, *args, **kwargs)
477 def write_html_report(filepath, *args, **kwargs):
478 with open(filepath, 'w') as f:
--> 479 f.write(get_html_report(*args, **kwargs) + '\n')
480
481
/usr/local/lib/python3.7/dist-packages/easse/report.py in get_html_report(orig_sents, sys_sents, refs_sents, test_set, lowercase, tokenizer, metrics)
471 doc.stag('hr')
472 with doc.tag('div', klass='container-fluid'):
--> 473 doc.asis(get_qualitative_examples_html(orig_sents, sys_sents, refs_sents))
474 return indent(doc.getvalue())
475
/usr/local/lib/python3.7/dist-packages/easse/report.py in get_qualitative_examples_html(orig_sents, sys_sents, refs_sents)
154 sample_generator = sorted(
155 zip(orig_sents, sys_sents, zip(*refs_sents)),
--> 156 key=lambda args: sort_key(*args),
157 )
158 # Samples displayed by default
/usr/local/lib/python3.7/dist-packages/easse/report.py in <lambda>(args)
154 sample_generator = sorted(
155 zip(orig_sents, sys_sents, zip(*refs_sents)),
--> 156 key=lambda args: sort_key(*args),
157 )
158 # Samples displayed by default
/usr/local/lib/python3.7/dist-packages/easse/report.py in <lambda>(c, s, refs)
91 (
92 'Best simplifications according to SARI',
---> 93 lambda c, s, refs: -corpus_sari([c], [s], [refs]),
94 lambda value: f'SARI={-value:.2f}',
95 ),
/usr/local/lib/python3.7/dist-packages/easse/sari.py in corpus_sari(*args, **kwargs)
264
265 def corpus_sari(*args, **kwargs):
--> 266 add_score, keep_score, del_score = get_corpus_sari_operation_scores(*args, **kwargs)
267 return (add_score + keep_score + del_score) / 3
/usr/local/lib/python3.7/dist-packages/easse/sari.py in get_corpus_sari_operation_scores(orig_sents, sys_sents, refs_sents, lowercase, tokenizer, legacy, use_f1_for_deletion, use_paper_version)
254 refs_sents = [[utils_prep.normalize(sent, lowercase, tokenizer) for sent in ref_sents] for ref_sents in refs_sents]
255
--> 256 stats = compute_ngram_stats(orig_sents, sys_sents, refs_sents)
257
258 if not use_paper_version:
/usr/local/lib/python3.7/dist-packages/easse/sari.py in compute_ngram_stats(orig_sents, sys_sents, refs_sents)
110 assert all(
111 len(ref_sents) == len(orig_sents) for ref_sents in refs_sents
--> 112 ), "Reference sentences don't have the shape (n_references, n_samples)"
113 add_sys_correct = [0] * NGRAM_ORDER
114 add_sys_total = [0] * NGRAM_ORDER
AssertionError: Reference sentences don't have the shape (n_references, n_samples)
I printed the values at the point where the assertion fails and got:
len(refs_sents)=1
len(ref_sents)=10
len(orig_sents)=1
whereas, given the expected shape (n_references, n_samples), I suppose it should be:
len(refs_sents)=10
len(ref_sents)=1
len(orig_sents)=1
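For concreteness, here is a toy snippet (made-up sentences, not from my actual run) that reproduces the assertion and shows the orientation corpus_sari expects:

```python
from easse.sari import corpus_sari

# Made-up example: one sample with 10 references.
orig_sents = ['This is one fairly complex source sentence.']  # n_samples = 1
sys_sents = ['This is a simple sentence.']
refs = [f'This is reference number {i}.' for i in range(10)]  # 10 references

# Wrong orientation, shape (n_samples, n_references):
# len(refs_sents)=1, len(ref_sents)=10 -> raises the AssertionError above
# corpus_sari(orig_sents, sys_sents, refs_sents=[refs])

# Expected orientation, shape (n_references, n_samples):
# len(refs_sents)=10, len(ref_sents)=1
refs_sents = [[ref] for ref in refs]
print(corpus_sari(orig_sents, sys_sents, refs_sents=refs_sents))
```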
I am not sure how to make this change without impacting the outcome of the code. I'd appreciate any advice. Thank you in advance!
Thanks for raising this issue!
I'm working on this bug; in the meantime you can comment out this line: shutil.move(get_easse_report_from_exp_dir(exp_dir, **kwargs), report_path)
Just pushed a fix in easse.
Can you please reinstall easse? pip uninstall easse && pip install -r requirements.txt
Hi Martin, thank you for your quick response!
I reinstalled easse and tried to generate the easse report using muss_en_wikilarge_mined, but the following error occurred. I ran torch.cuda.empty_cache() before calling get_easse_report_from_exp_dir. Do you have any idea why this is happening? Thank you in advance.
(I tried simply generating simplified sentences with get_fairseq_simplifier() on my local asset/valid.complex using the same GPU, and that works.)
INFO:fairseq_cli.generate:loading model(s) from /content/drive/MyDrive/ori_muss/muss/resources/models/bart_mined_wikilarge/model.pt
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-19-f8044b1e83b1> in <module>()
2 report_path = exp_dir / 'easse_report.html'
3
----> 4 shutil.move(get_easse_report_from_exp_dir(exp_dir, kwargs=preprocessors_kwargs), report_path)
5 print(f'report_path={report_path}')
/content/drive/My Drive/ori_muss/muss/muss/fairseq/main.py in get_easse_report_from_exp_dir(exp_dir, **kwargs)
97 def get_easse_report_from_exp_dir(exp_dir, **kwargs):
98 simplifier = fairseq_get_simplifier(exp_dir, **kwargs)
---> 99 return get_easse_report(simplifier, **kwargs.get('evaluate_kwargs', {'test_set': 'asset_valid'}))
100
101
/content/drive/My Drive/ori_muss/muss/muss/evaluation/general.py in get_easse_report(simplifier, test_set, orig_sents_path, refs_sents_paths)
33 orig_sents_path = get_temp_filepath()
34 write_lines(orig_sents, orig_sents_path)
---> 35 sys_sents_path = simplifier(orig_sents_path)
36 report_path = get_temp_filepath()
37 report(
/content/drive/My Drive/ori_muss/muss/muss/simplifiers.py in wrapped(complex_filepath, pred_filepath)
40 if pred_filepath is None:
41 pred_filepath = get_temp_filepath()
---> 42 simplifier(complex_filepath, pred_filepath)
43 return pred_filepath
44
/content/drive/My Drive/ori_muss/muss/muss/simplifiers.py in wrapped(complex_filepath, pred_filepath)
28 shutil.copyfile(previous_pred_filepath, pred_filepath)
29 else:
---> 30 simplifier(complex_filepath, pred_filepath)
31 # Save prediction
32 memo[complex_filehash] = pred_filepath
/content/drive/My Drive/ori_muss/muss/muss/simplifiers.py in preprocessed_simplifier(complex_filepath, pred_filepath)
66 preprocessed_complex_filepath = get_temp_filepath()
67 composed_preprocessor.encode_file(complex_filepath, preprocessed_complex_filepath)
---> 68 preprocessed_pred_filepath = simplifier(preprocessed_complex_filepath)
69 composed_preprocessor.decode_file(preprocessed_pred_filepath, pred_filepath, encoder_filepath=complex_filepath)
70
/content/drive/My Drive/ori_muss/muss/muss/simplifiers.py in wrapped(complex_filepath, pred_filepath)
40 if pred_filepath is None:
41 pred_filepath = get_temp_filepath()
---> 42 simplifier(complex_filepath, pred_filepath)
43 return pred_filepath
44
/content/drive/My Drive/ori_muss/muss/muss/simplifiers.py in wrapped(complex_filepath, pred_filepath)
28 shutil.copyfile(previous_pred_filepath, pred_filepath)
29 else:
---> 30 simplifier(complex_filepath, pred_filepath)
31 # Save prediction
32 memo[complex_filehash] = pred_filepath
/content/drive/My Drive/ori_muss/muss/muss/simplifiers.py in fairseq_simplifier(complex_filepath, output_pred_filepath)
52 @memoize_simplifier
53 def fairseq_simplifier(complex_filepath, output_pred_filepath):
---> 54 fairseq_generate(complex_filepath, output_pred_filepath, exp_dir, **kwargs)
55
56 return fairseq_simplifier
/content/drive/My Drive/ori_muss/muss/muss/fairseq/base.py in fairseq_generate(complex_filepath, output_pred_filepath, exp_dir, beam, hypothesis_num, lenpen, diverse_beam_groups, diverse_beam_strength, sampling, max_tokens, source_lang, target_lang, **kwargs)
233 sampling=sampling,
234 max_tokens=max_tokens,
--> 235 **kwargs,
236 )
/content/drive/My Drive/ori_muss/muss/muss/utils/training.py in wrapped_func(*args, **kwargs)
58 def wrapped_func(*args, **kwargs):
59 try:
---> 60 return func(*args, **kwargs)
61 finally:
62 torch.cuda.empty_cache()
/content/drive/My Drive/ori_muss/muss/muss/fairseq/base.py in _fairseq_generate(complex_filepath, output_pred_filepath, checkpoint_paths, complex_dictionary_path, simple_dictionary_path, beam, hypothesis_num, lenpen, diverse_beam_groups, diverse_beam_strength, sampling, max_tokens, source_lang, target_lang, **kwargs)
186 args = shlex.split(args)
187 with mock_cli_args(args):
--> 188 generate.cli_main()
189
190 all_hypotheses = fairseq_parse_all_hypotheses(out_filepath)
/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py in cli_main()
377 parser = options.get_generation_parser()
378 args = options.parse_args_and_arch(parser)
--> 379 main(args)
380
381
/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py in main(args)
39 return _main(args, h)
40 else:
---> 41 return _main(args, sys.stdout)
42
43
/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py in _main(args, output_file)
194 sample,
195 prefix_tokens=prefix_tokens,
--> 196 constraints=constraints,
197 )
198 num_generated_tokens = sum(len(h[0]["tokens"]) for h in hypos)
/usr/local/lib/python3.7/dist-packages/fairseq/tasks/fairseq_task.py in inference_step(self, generator, models, sample, prefix_tokens, constraints)
432 with torch.no_grad():
433 return generator.generate(
--> 434 models, sample, prefix_tokens=prefix_tokens, constraints=constraints
435 )
436
/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
26 def decorate_context(*args, **kwargs):
27 with self.__class__():
---> 28 return func(*args, **kwargs)
29 return cast(F, decorate_context)
30
/usr/local/lib/python3.7/dist-packages/fairseq/sequence_generator.py in generate(self, models, sample, **kwargs)
175 (default: self.eos)
176 """
--> 177 return self._generate(sample, **kwargs)
178
179 def _generate(
/usr/local/lib/python3.7/dist-packages/fairseq/sequence_generator.py in _generate(self, sample, prefix_tokens, constraints, bos_token)
314 encoder_outs,
315 incremental_states,
--> 316 self.temperature,
317 )
318
/usr/local/lib/python3.7/dist-packages/fairseq/sequence_generator.py in forward_decoder(self, tokens, encoder_outs, incremental_states, temperature)
825 tokens,
826 encoder_out=encoder_out,
--> 827 incremental_state=incremental_states[i],
828 )
829 else:
/usr/local/lib/python3.7/dist-packages/fairseq/models/transformer.py in forward(self, prev_output_tokens, encoder_out, incremental_state, features_only, full_context_alignment, alignment_layer, alignment_heads, src_lengths, return_all_hiddens)
689 full_context_alignment=full_context_alignment,
690 alignment_layer=alignment_layer,
--> 691 alignment_heads=alignment_heads,
692 )
693 if not features_only:
/usr/local/lib/python3.7/dist-packages/fairseq/models/transformer.py in extract_features(self, prev_output_tokens, encoder_out, incremental_state, full_context_alignment, alignment_layer, alignment_heads)
710 full_context_alignment,
711 alignment_layer,
--> 712 alignment_heads,
713 )
714
/usr/local/lib/python3.7/dist-packages/fairseq/models/transformer.py in extract_features_scriptable(self, prev_output_tokens, encoder_out, incremental_state, full_context_alignment, alignment_layer, alignment_heads)
805 self_attn_padding_mask=self_attn_padding_mask,
806 need_attn=bool((idx == alignment_layer)),
--> 807 need_head_weights=bool((idx == alignment_layer)),
808 )
809 inner_states.append(x)
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/fairseq/modules/transformer_layer.py in forward(self, x, encoder_out, encoder_padding_mask, incremental_state, prev_self_attn_state, prev_attn_state, self_attn_mask, self_attn_padding_mask, need_attn, need_head_weights)
380 static_kv=True,
381 need_weights=need_attn or (not self.training and self.need_attn),
--> 382 need_head_weights=need_head_weights,
383 )
384 x = self.dropout_module(x)
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/fairseq/modules/multihead_attention.py in forward(self, query, key, value, key_padding_mask, incremental_state, need_weights, static_kv, attn_mask, before_softmax, need_head_weights)
210 else:
211 k = self.k_proj(key)
--> 212 v = self.v_proj(key)
213
214 else:
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/linear.py in forward(self, input)
94
95 def forward(self, input: Tensor) -> Tensor:
---> 96 return F.linear(input, self.weight, self.bias)
97
98 def extra_repr(self) -> str:
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
1845 if has_torch_function_variadic(input, weight):
1846 return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1847 return torch._C._nn.linear(input, weight, bias)
1848
1849
RuntimeError: CUDA out of memory. Tried to allocate 146.00 MiB (GPU 0; 15.90 GiB total capacity; 14.39 GiB already allocated; 47.75 MiB free; 14.96 GiB reserved in total by PyTorch)
Hi @pelican9, I'm not sure how to solve this issue.
Please keep me posted if you find anything, it might help other people!
Best,
Louis
Sure! I will keep you updated :)
@louismartin Hi Martin, it turns out that this error was caused by a mistake in how I set up the preprocessors. Apologies for raising this error!
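For anyone who hits the same thing: the preprocessors_kwargs passed to the simplifier have to match what the model was trained with. A rough sketch of the shape of that dict; the preprocessor names follow muss/preprocessors.py, but the ratio values below are placeholders, so check them against muss/simplify.py in your checkout:

```python
# Placeholder values for illustration only, not the tuned parameters.
preprocessors_kwargs = {
    'LengthRatioPreprocessor': {'target_ratio': 0.9},
    'ReplaceOnlyLevenshteinPreprocessor': {'target_ratio': 0.65},
    'WordRankRatioPreprocessor': {'target_ratio': 0.75},
    'DependencyTreeDepthRatioPreprocessor': {'target_ratio': 0.4},
    'GPT2BPEPreprocessor': {},
}
```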
May I ask how you computed the 95% confidence intervals in the paper? I computed SARI on the ASSET test set (muss/muss_system_outputs/asset/test/muss_bart_access_wikilarge_mined) using easse and got
sari = [44.007, 44.656, 44.508, 43.521, 44.047]
np.mean(sari), np.std(sari, ddof=1) gives (44.1478, 0.4502), so the 95% confidence interval mean ± 1.96 * np.std(sari, ddof=1) / np.sqrt(5) comes out to 44.15 ± 0.39. However, the result in the paper is 44.15 ± 0.56.
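In code, my computation (normal approximation):

```python
import numpy as np

sari = [44.007, 44.656, 44.508, 43.521, 44.047]
mean, std = np.mean(sari), np.std(sari, ddof=1)  # 44.1478, 0.4502
half_width = 1.96 * std / np.sqrt(len(sari))     # normal z = 1.96
print(f'{mean:.2f} +- {half_width:.2f}')         # 44.15 +- 0.39
```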
Thank you in advance!
Hi @pelican9 ,
I used this method:
```python
import numpy as np
import scipy.stats


def get_mean_confidence_interval(data, confidence=0.95):
    """Half-width of the confidence interval around the mean, using the Student t distribution."""
    a = np.array(data, dtype=float)
    a = a[~np.isnan(a)]  # ignore missing values
    n = len(a)
    se = scipy.stats.sem(a)  # standard error of the mean: std(ddof=1) / sqrt(n)
    h = se * scipy.stats.t.ppf((1 + confidence) / 2.0, n - 1)
    return h
```
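Running it on your numbers reproduces the paper's interval: with n=5 the Student t critical value scipy.stats.t.ppf(0.975, 4) ≈ 2.776 replaces the normal 1.96, so the half-width is 2.776 * 0.4502 / sqrt(5) ≈ 0.56 rather than 0.39.

```python
sari = [44.007, 44.656, 44.508, 43.521, 44.047]
print(get_mean_confidence_interval(sari))  # ~0.56
```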
I see. Thank you!