DerwenAI / pytextrank

Python implementation of TextRank algorithms ("textgraphs") for phrase extraction

Home Page: https://derwen.ai/docs/ptr/


Information about the matrix similarity

EmanueleGusso opened this issue

Hi everyone,
First of all I'd like to thank you for the amazing work you've done so far.
I have a question regarding the extractive summarization via pytextrank.
To apply the algorithm, we start from a matrix M (num_sentences x num_sentences) and fill each entry M[i][j], often with a similarity measure between sentences i and j.
In the case of pytextrank, what is the embedding used on the sentences?
I really hope you can help me.
Thank you in advance for your availability!

Thank you @EmanueleGusso -
This project is about implementing the textgraph family of algorithms, primarily for entity extraction – although some variants have a "side-effect" usage in extractive summarization. That said, we weren't aiming for extractive summarization in general, or expanding on summarization.

Also @EmanueleGusso, if it helps, here's the primary source, Mihalcea (2004) at EMNLP: https://derwen.ai/docs/ptr/biblio/#mihalcea04textrank. The analysis of extractive summarization is included there.
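
For readers who want to see the matrix from the question in concrete form, here is a minimal sketch of the sentence-similarity approach described in that paper. It is not pytextrank's code and makes no claim about how pytextrank builds its summaries. The paper scores two sentences by their word overlap, normalized by the log of each sentence's length, rather than by an embedding. The function names (`tokenize`, `overlap_similarity`, `similarity_matrix`, `rank_sentences`), the regex tokenizer, and the fixed iteration count are all assumptions made for illustration.

```python
import math
import re


def tokenize(sentence):
    # Hypothetical tokenizer: lowercase alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def overlap_similarity(tokens_a, tokens_b):
    # Shared words divided by log|S_i| + log|S_j|, as in the paper.
    if not tokens_a or not tokens_b:
        return 0.0
    denom = math.log(len(tokens_a)) + math.log(len(tokens_b))
    if denom <= 0:
        # Two one-word sentences give log(1) + log(1) = 0; treat as no edge.
        return 0.0
    return len(set(tokens_a) & set(tokens_b)) / denom


def similarity_matrix(sentences):
    # Symmetric num_sentences x num_sentences matrix with a zero diagonal.
    tokens = [tokenize(s) for s in sentences]
    n = len(tokens)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = overlap_similarity(tokens[i], tokens[j])
    return matrix


def rank_sentences(matrix, damping=0.85, iterations=50):
    # Weighted PageRank-style update:
    # score(i) = (1 - d) + d * sum_j (w_ji / sum_k w_jk) * score(j)
    n = len(matrix)
    scores = [1.0] * n
    out_weight = [sum(row) for row in matrix]
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                matrix[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores
```

Calling `rank_sentences(similarity_matrix(sentences))` returns one score per sentence; taking the top-scoring sentences in their original order gives an extractive summary in the style the paper analyzes. A real pipeline would likely use a proper tokenizer and a convergence check instead of a fixed number of iterations.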