kwansupp / publication-date-extract

Script for extracting publication dates from (news article) links, given a CSV file with a column of URLs.

publication-date-extract

Simple script for extracting publication dates from webpages. It takes a CSV file as input, where the first column contains the URLs of the webpages. It writes output.csv, which is the input file with an added column for the publication date.

This script uses the htmldate package.
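
As a rough illustration of the flow described above, here is a minimal sketch. It is not the code from extract_publication_date.py: the function names add_publication_dates and main are hypothetical, and the handling of empty rows, failed lookups, and header rows (none is skipped here) is an assumption. The only htmldate call used is find_date, which accepts a URL and returns a date string or None.

```python
import csv


def add_publication_dates(rows, find_date):
    """Return copies of rows with a publication date appended to each.

    rows: iterable of lists whose first element is a URL.
    find_date: callable taking a URL and returning a date string or None.
    (Hypothetical helper, not taken from the repository.)
    """
    result = []
    for row in rows:
        url = row[0].strip() if row else ""
        try:
            date = find_date(url) if url else None
        except Exception:
            # Assumption: a failed download or parse counts as "no date found".
            date = None
        # An empty string marks rows where no date could be determined.
        result.append(list(row) + [date or ""])
    return result


def main(csv_input, csv_output="output.csv"):
    # Imported here so the helper above can be used without htmldate installed.
    from htmldate import find_date

    with open(csv_input, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))

    with open(csv_output, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(add_publication_dates(rows, find_date))
```

Passing the date finder in as an argument keeps the row handling separate from the network access, so the helper can be tried with a stand-in function before running it against real URLs.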

Usage

To use the script, clone the repo, then make sure the requirements are installed.

git clone https://github.com/kwansupp/publication-date-extract.git

pip install -r requirements.txt

Run the script on a CSV file:

python extract_publication_date.py <csv_input>
