stanfordnlp / string2string

String-to-String Algorithms for Natural Language Processing

Plans to add BLEURT metric?

ogencoglu opened this issue

Would be a great addition.

Thibault Sellam, Dipanjan Das, and Ankur P. Parikh. 2020. BLEURT: Learning Robust Metrics for Text Generation.

MoverScore too

Wei Zhao, Maxime Peyrard, Fei Liu, Yang Gao, Christian M. Meyer, and Steffen Eger. 2019. MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance.

Hi @ogencoglu, thank you very much for your question.

Yes, we are actively working on incorporating wrappers for BLEURT, MoverScore, METEOR, and various other automatic string-based evaluation metrics in the near future. Please stay tuned for further updates!

PS. We also warmly welcome our community members to contribute and expand the library based on their expertise and interests; so, please feel free to make your own valuable additions.

Thank you for the swift reply @suzgunmirac ! Appreciated!

Continuing the metrics discussion.

I wonder whether descriptive metrics would enhance this library, for instance something like the textdescriptives library. These metrics are not necessarily string2string; they are things like readability, coherence, and perplexity, calculated from a single doc. But a readability vector could be calculated for each string and the two vectors compared, which would make it usable as a string2string metric, I guess.

An example use case would be semantic search with faiss or any approximate nearest neighbor method: the search returns similar docs, and the returned docs can then be re-ranked by ease of readability or by how close their readability is to the query doc's.
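A minimal sketch of that re-ranking step, in case it helps make the idea concrete. It assumes the candidate docs have already been returned by some retriever (the faiss search itself is not shown), and it uses the Flesch Reading Ease formula with a crude vowel-group syllable counter as a stand-in for a proper readability measure. The function names `flesch_reading_ease` and `rerank_by_readability` are hypothetical; they are not part of string2string or textdescriptives.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of consecutive vowels (at least 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease score; higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def rerank_by_readability(candidates, query=None):
    """Re-rank already-retrieved docs (hypothetical helper).

    Without a query: easiest doc first.
    With a query: doc whose readability is closest to the query's first.
    """
    scored = [(doc, flesch_reading_ease(doc)) for doc in candidates]
    if query is None:
        ranked = sorted(scored, key=lambda pair: -pair[1])
    else:
        target = flesch_reading_ease(query)
        ranked = sorted(scored, key=lambda pair: abs(pair[1] - target))
    return [doc for doc, _ in ranked]


# Example: pretend these two docs came back from a nearest-neighbor search.
retrieved = [
    "Notwithstanding considerable methodological heterogeneity, "
    "the investigators extrapolated generalizable conclusions.",
    "The cat sat on the mat.",
]
easiest_first = rerank_by_readability(retrieved)
closest_to_query = rerank_by_readability(retrieved, query="A dog ran to me.")
```

A single score is used here for simplicity; with a full readability vector, the absolute difference could be swapped for a vector distance.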

This is not a suggestion but just an idea that I wanted to document here, in case you find it relevant.

Thank you very much for suggesting the use of descriptive metrics in our library, as well as sharing a reference to the TextDescriptives library! TextDescriptives seems like a wonderful resource! We are indeed planning to incorporate additional metrics and string measures such as perplexity into our metrics module in the near future.