2blam / textrank-study-python

A Study of the TextRank Algorithm in Python

TextRank is a graph-based algorithm for key phrase and sentence extraction, built on PageRank. I've made an IPython notebook that demonstrates how to implement the key phrase extraction part of it using the networkx and NLTK packages. I also use matplotlib to visualize the graph.
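For readers who want the gist before opening the notebook, here is a minimal sketch of the key phrase idea. It is not the notebook's code. To stay runnable with only the standard library, it swaps NLTK's part-of-speech filtering for a tiny hand-written stopword list and networkx's PageRank for a short power-iteration loop. The names tokenize, build_graph, pagerank, top_keywords and STOPWORDS are all hypothetical, chosen for this illustration.

```python
import re
from collections import defaultdict

# Assumption: a tiny stopword list stands in for the POS-based filter
# that a fuller implementation would apply (e.g. keeping nouns and adjectives).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "with", "is", "it"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_graph(words, window=2):
    """Link every pair of distinct words at most `window` positions apart."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Normalized PageRank by power iteration on an undirected graph."""
    n = len(graph)
    scores = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


def top_keywords(text, n=5):
    """Return the n highest-scoring words of the co-occurrence graph."""
    words = [w for w in tokenize(text) if w not in STOPWORDS]
    scores = pagerank(build_graph(words))
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    sample = "TextRank builds a graph of words and ranks the words in the graph with PageRank"
    print(top_keywords(sample, n=3))
```

Words that co-occur with many other words collect more score, which is the intuition TextRank borrows from PageRank. The window size and damping factor here are illustrative defaults, not values taken from the notebook.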

Lightning Talk Slides

The slides of my lightning talk on the same subject can be found here.

Just Reading? No Installation Required

GitHub now renders IPython notebooks, so you don't have to install anything to view this one. Simply click on the notebook file or use this link.

How to Use the Notebook on Your Machine

It's strongly recommended that you create a virtualenv for experimentation. In short, you need to install the following packages with pip:

pip install "ipython[notebook]" nltk networkx matplotlib

Then open the notebook:

ipython notebook "Key Phrase Extraction with Python.ipynb"

The notebook has been tested with both Python 2.7.9 and Python 3.4.3 on OS X 10.10 and 10.11, but there shouldn't be issues if you run this on a different operating system.
