notnews / top_news

Collecting URLs Daily From News Feeds of Major National News Sites

Top News! URLs from News Feeds of Major National News Sites

We use GitHub workflows to automatically pull daily news data from the feeds of major national news sites: ABC, CBS, CNN, LA Times, NBC, NPR, NYT, Politico, ProPublica, USA Today, and WaPo. Refer to the respective JSON files in the repository for the latest version.
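As a rough illustration of how the collected URLs might be read back out of one of these JSON files, here is a minimal sketch. The layout of the files is not described here, so the code makes no assumption about key names: it simply walks whatever JSON it is given and keeps every string that looks like an http(s) URL. The helper names `iter_urls` and `unique_urls` and the sample data are hypothetical.

```python
import json
from typing import Any, Iterator, List


def iter_urls(node: Any) -> Iterator[str]:
    """Yield every http(s) string found anywhere in a decoded JSON value."""
    if isinstance(node, str):
        if node.startswith(("http://", "https://")):
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_urls(item)


def unique_urls(json_text: str) -> List[str]:
    """Return the URLs in first-seen order, dropping duplicates."""
    return list(dict.fromkeys(iter_urls(json.loads(json_text))))


# Hypothetical sample; the real files may be laid out differently.
sample = json.dumps([
    {"title": "A", "link": "https://example.com/a"},
    {"title": "B", "link": "https://example.com/b"},
    {"title": "A again", "link": "https://example.com/a"},
])
print(unique_urls(sample))
```

To run it against a real file, read the file's text and pass it to `unique_urls`. If the files turn out to have a fixed schema, reading the relevant key directly would be simpler and stricter than this generic walk.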

A script for downloading article text and parsing some features (e.g., publication date and authors) is available here: https://gist.github.com/dwillis/7e6a2571d64688243879ed349e88787c
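The linked gist is the reference for this step. Purely to illustrate what "parsing some features" can look like, the sketch below pulls a publication date and author names out of `<meta>` tags using only the standard library. It is not the gist's method. It assumes the page exposes `article:published_time` and `author` / `article:author` meta tags, which is a common convention but varies by site. The names `MetaCollector` and `parse_features` are hypothetical, and the download itself is left out.

```python
from html.parser import HTMLParser
from typing import Dict, List


class MetaCollector(HTMLParser):
    """Collect the content of <meta> tags, keyed by their property or name."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, List[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name")
        content = attr.get("content")
        if key and content:
            self.meta.setdefault(key.lower(), []).append(content)


def parse_features(html: str) -> Dict[str, object]:
    """Return the publication date (or None) and a list of author strings."""
    collector = MetaCollector()
    collector.feed(html)
    # Assumed tag names; sites that use other conventions will yield nothing.
    published = collector.meta.get("article:published_time", [None])[0]
    authors = (collector.meta.get("author", [])
               + collector.meta.get("article:author", []))
    return {"published": published, "authors": authors}


# Hypothetical page fragment for illustration.
sample_html = """<html><head>
<meta property="article:published_time" content="2023-06-01T12:00:00Z">
<meta name="author" content="Jane Doe">
</head><body><p>Story text.</p></body></html>"""
print(parse_features(sample_html))
```

In practice the HTML would come from fetching each collected URL, and a dedicated article-extraction library is likely to be more robust than meta tags alone.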

The June 2023 full text dump is here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZNAKK6

Languages

Language:Python 100.0%