TheRadialActive / rssbot

Python bot that downloads news from a list of RSS feeds, strips all HTML clutter, converts the articles to docx format, and organizes them in folders according to publishing date.

Home Page: https://joandelason.github.io/rssbot


rssbot

Python bot that downloads news from any RSS feed, strips all HTML and ad clutter (similar to Safari's Reader mode), converts the articles to docx format, and organizes them chronologically in folders according to each post's publishing date.

##Screenshots

[screenshot: overview]

[screenshot: overview2]

##Use cases:

  • keep track of any hot topic (like Bitcoin) in an organized way
  • compare news coming from different media
  • create an archive of news digests

Journalists may find it useful if no better (and free) tool is available.

##Current features:

  • scrape from a list of RSS feeds
  • ads and HTML clutter removal
  • docx conversion
  • chronological organization in folders
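
The chronological organization can be illustrated with a minimal sketch. It is not the code in `newsbot.py`: it uses only the standard library, the sample feed is made up, and the names `SAMPLE_FEED`, `dated_folder`, and `plan_downloads`, as well as the `root/YYYY/MM/DD` layout, are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from pathlib import Path

# Hypothetical two-item RSS 2.0 feed, used only for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Bitcoin hits a new high</title>
      <link>https://example.com/bitcoin</link>
      <pubDate>Tue, 05 Dec 2017 10:30:00 +0000</pubDate>
    </item>
    <item>
      <title>Markets cool down</title>
      <link>https://example.com/markets</link>
      <pubDate>Wed, 06 Dec 2017 08:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def dated_folder(root, pub_date):
    """Return root/YYYY/MM/DD for an RFC 822 date string (the RSS pubDate format)."""
    published = parsedate_to_datetime(pub_date)
    return Path(root) / f"{published:%Y}" / f"{published:%m}" / f"{published:%d}"


def plan_downloads(feed_xml, root):
    """Map each item's title to the folder its document would be saved in."""
    channel = ET.fromstring(feed_xml).find("channel")
    plan = {}
    for item in channel.findall("item"):
        title = item.findtext("title", default="untitled")
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # items without a publishing date cannot be filed by date
        plan[title] = dated_folder(root, pub_date)
    return plan
```

With the sample feed, `plan_downloads(SAMPLE_FEED, "news")` files the first post under `news/2017/12/05` and the second under `news/2017/12/06`. Creating the folders (for example with `Path.mkdir(parents=True, exist_ok=True)`) and writing the docx files are left out of the sketch.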

##Nice to have: If you find this project interesting and want to contribute, these are the main ideas I have in mind:

  • add tests
  • pass the RSS URL and root location as arguments, so the code does not need to be edited
  • package the app in a bundle, so there is no need to install dependencies
  • improve docx formatting (bold, italics, etc)
  • add the post's main image (if any)
  • PDF conversion
  • add a tags feature to organize posts by sub-topic (for example, by calling an AI web service like IBM's Watson)
  • add a GUI to set up the bot (feed URL, refresh frequency, root location, ...)
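
As a starting point for the idea of passing the RSS URL and root location as arguments, here is one possible sketch using the standard `argparse` module. Nothing like this exists in the project yet: the `--feed` and `--root` flags, the default folder name `news`, and the function names are all assumptions.

```python
import argparse


def build_parser():
    """Build a command-line parser for the (hypothetical) --feed and --root options."""
    parser = argparse.ArgumentParser(
        description="Download RSS news and file it in dated folders."
    )
    parser.add_argument(
        "--feed",
        action="append",  # allow the flag to be repeated, one URL per use
        required=True,
        metavar="URL",
        help="RSS feed URL; repeat the flag to follow several feeds",
    )
    parser.add_argument(
        "--root",
        default="news",  # assumed default, not taken from the project
        help="root folder where the dated folders are created",
    )
    return parser


def parse_cli(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) into a namespace."""
    return build_parser().parse_args(argv)
```

A call might then look like `python newsbot.py --feed https://example.com/rss.xml --root archive`, with `args.feed` holding the list of URLs and `args.root` the destination folder.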

##Installation: There are several dependencies that need to be installed:

  • beautifulSoup4
  • docx
  • readability

##Usage: Just type on the command line:

python newsbot.py

You may want to change the feeds list in the code according to your interests and rename the root folder.


License: GNU General Public License v3.0


Languages

Language:Python 100.0%