BastinFlorian / Keywords_extraction_with_GOW

Graph of words (Networkx) and keywords extraction (Ktruss, Kcore, DivRank, BestCoverage)

Keywords_extraction_with_GOW

  • First, we present an example of the methods used to extract keywords (see Graph of words and keywords extraction.ipynb and K-truss_code_example.ipynb).
  • Then we provide code to compute the k-core and build the graphs for a directory of files, or for all files in a directory containing sub-directories (see K_core_corpus.py).
  • We also provide an implementation of the K-truss algorithm (see K-truss_code.py).
  • Finally, we run a time analysis that tracks how some words evolve over time, in order to detect events related to them.

Libraries

  • Networkx to create and visualize graphs
  • Spacy to preprocess the text
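
To make the idea of a graph of words concrete, here is a minimal sketch of how such a graph could be built. It is not the repository's code: the function name `build_graph_of_words`, the `window` parameter, and the regex tokenizer (standing in for the Spacy preprocessing) are assumptions made for illustration, and the graph is kept as plain dictionaries rather than a Networkx object so that the example has no dependencies.

```python
import re
from collections import defaultdict


def build_graph_of_words(text, window=4):
    """Build an undirected co-occurrence graph as {word: {neighbour: weight}}.

    Hypothetical helper: two words are linked when they appear within
    `window` consecutive tokens; the weight counts how often that happens.
    """
    # Simplistic tokenizer (assumption): lowercase runs of Unicode letters.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    graph = defaultdict(dict)
    for i, word in enumerate(tokens):
        # Link the current word to the next (window - 1) tokens.
        for other in tokens[i + 1 : i + window]:
            if other == word:
                continue  # skip self-loops
            graph[word][other] = graph[word].get(other, 0) + 1
            graph[other][word] = graph[other].get(word, 0) + 1
    return dict(graph)
```

The resulting dictionary can be turned into a Networkx graph by adding one weighted edge per pair of linked words. In practice, stop words would usually be removed and words lemmatized before building the graph.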

Papers implemented :

Graph of words and keywords extraction.ipynb

This notebook is intended for people who want to extract keywords from a text document or a corpus of documents using a graph-based approach.

The goal of this notebook is to extract keywords from a text file using four different approaches: K-truss, K-core, DivRank and BestCoverage.

Using a French summary of Game of Thrones, we show example outputs of the four approaches.
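
As an illustration of the k-core idea, the sketch below computes the core number of every word by repeatedly peeling off the node of lowest remaining degree, then keeps the words of the innermost (main) core as keywords. This is a simplified, unweighted version written for readability. The names `core_numbers` and `main_core_keywords` are hypothetical, and retaining the main core is only one possible selection rule, assumed here for the example.

```python
def core_numbers(adjacency):
    """Return {node: core number} for an undirected graph.

    `adjacency` maps each node to a collection of its neighbours and is
    assumed to be symmetric (if b is listed under a, a is listed under b).
    """
    degrees = {node: len(neigh) for node, neigh in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=degrees.get)
        k = max(k, degrees[node])
        core[node] = k
        remaining.remove(node)
        for neigh in adjacency[node]:
            if neigh in remaining:
                degrees[neigh] -= 1
    return core


def main_core_keywords(adjacency):
    """Return the nodes whose core number is maximal (the main core)."""
    core = core_numbers(adjacency)
    if not core:
        return set()
    k_max = max(core.values())
    return {node for node, c in core.items() if c == k_max}
```

This peeling loop is quadratic in the number of nodes, which is fine for a single document. For larger graphs, Networkx provides `core_number` and `k_core`.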

K-truss_code_example.ipynb

This Jupyter notebook gives an example of using the script described below.

K-truss code

Two functions are implemented.

  • The first one computes the K-truss of each node in G, the maximum non-empty subgraph, the value of k for that subgraph, and the information needed to apply the density and inflexion methods.
  • The second one returns the k-truss subgraph of the graph, where k is given as an input.
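
The second function can be pictured with the following minimal sketch, which is not the code from K-truss_code.py. It assumes the convention in which the k-truss is the largest subgraph where every edge belongs to at least k - 2 triangles (some papers shift k by one), and the name `k_truss_subgraph` and the adjacency-dictionary input are illustrative choices.

```python
def k_truss_subgraph(adjacency, k):
    """Return the k-truss of an undirected graph as {node: set(neighbours)}.

    Assumed convention: every edge kept must be part of at least k - 2
    triangles inside the returned subgraph. `adjacency` must be symmetric
    and is left unmodified.
    """
    adj = {node: set(neigh) for node, neigh in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support of edge (u, v) = number of common neighbours,
                # i.e. the number of triangles that contain the edge.
                support = len(adj[u] & adj[v])
                if support < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    # Drop nodes that lost all their edges.
    return {node: neigh for node, neigh in adj.items() if neigh}
```

Removing one edge can lower the support of others, which is why the loop repeats until no edge is removed. Networkx ships a comparable function, `networkx.k_truss(G, k)`, that works directly on graph objects.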

About

License: MIT License


Languages

Jupyter Notebook 99.4%, Python 0.6%