ethangardner / py-keyword-extraction

Extracts keywords from a webpage using the Topia library

Description

Reads a text file containing one URL per line, scrapes the content of each web page, and extracts key terms using natural language processing. Built with Python.
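The input format described above (one URL per line) could be read along these lines. This is only a minimal sketch using the standard library; the function name `read_urls` is hypothetical, and skipping blank lines is an assumption, not something the original script is known to do.

```python
from pathlib import Path


def read_urls(path):
    """Return the URLs listed in a text file, one per line.

    Hypothetical helper: surrounding whitespace is stripped and blank
    lines are skipped (an assumption about the desired behavior).
    """
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # ignore empty lines
            urls.append(line)
    return urls
```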

Requirements

Instructions

Run the script from the command line. The following options are required.

Required Arguments

  • -i, --input the name of the .txt file containing the URLs
  • -c, --content the selector for the content region to parse
  • -o, --output the name of the output file. Accepted formats are CSV and JSON.
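Since the output file may be either CSV or JSON, one plausible approach is to pick the writer from the file extension. The sketch below illustrates that idea only; `write_results` and the `url`/`keyword` column layout are assumptions made for the example and may not match what the script actually writes.

```python
import csv
import json
from pathlib import Path


def write_results(rows, output_path):
    """Write rows (a list of dicts) as CSV or JSON, chosen by file extension."""
    suffix = Path(output_path).suffix.lower()
    if suffix == ".json":
        with open(output_path, "w", encoding="utf-8") as f:
            json.dump(rows, f, indent=2)
    elif suffix == ".csv":
        # Assumed column layout; the real script may use different fields.
        fieldnames = ["url", "keyword"]
        with open(output_path, "w", encoding="utf-8", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    else:
        raise ValueError("output file must end in .csv or .json")
```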

Optional Arguments

  • -l, --length the minimum length of each keyword returned by the script
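For illustration, the options listed above could be declared with `argparse` roughly as follows. This is a sketch, not the script's actual code: the default of 1 for `--length`, the interpretation of length as a character count, and the helper names `build_parser` and `filter_by_length` are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Extract keywords from a list of web pages."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="text file containing one URL per line")
    parser.add_argument("-c", "--content", required=True,
                        help="selector for the content region to parse")
    parser.add_argument("-o", "--output", required=True,
                        help="output file name (csv or json)")
    # Assumed default: keep every keyword unless a minimum is given.
    parser.add_argument("-l", "--length", type=int, default=1,
                        help="minimum length of each returned keyword")
    return parser


def filter_by_length(keywords, min_length):
    """Keep keywords with at least min_length characters (assumed meaning)."""
    return [kw for kw in keywords if len(kw) >= min_length]
```

With a parser like this, an invocation would pass `-i urls.txt -c "#content" -o keywords.csv -l 4` to the script; the file names and the `#content` selector here are placeholders.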
