Add evaluation harness
blester125 opened this issue
Brian Lester commented
Right now the results are only qualitative. I want to add an evaluation script that can at least score keyword extraction with F1. A more complex one for summarization (ROUGE, etc.) would also be nice.
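
As a rough starting point, the keyword F1 part could look something like the sketch below. Everything here is an assumption for illustration, not existing code in this repo: the function names (`normalize`, `keyword_prf`, `macro_f1`) are hypothetical, matching is exact after lowercasing and whitespace cleanup, and keywords are treated as sets, so duplicates count once.

```python
from typing import Iterable, List, Tuple


def normalize(keyword: str) -> str:
    # Assumed matching rule: case-insensitive, whitespace-collapsed exact match.
    return " ".join(keyword.lower().split())


def keyword_prf(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one document's keywords."""
    pred = {normalize(k) for k in predicted}
    ref = {normalize(k) for k in gold}
    # Drop keywords that were empty after normalization.
    pred.discard("")
    ref.discard("")

    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_f1(examples: List[Tuple[Iterable[str], Iterable[str]]]) -> float:
    """Average per-document F1 over (predicted, gold) pairs."""
    scores = [keyword_prf(pred, gold)[2] for pred, gold in examples]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    p, r, f = keyword_prf(
        ["Graph ranking", "textrank", "foo"],
        ["TextRank", "graph  ranking"],
    )
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Macro-averaging per document is just one choice; micro-averaging over all keywords, or stemming before matching, may fit better depending on the dataset.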