contributor / SecurityNow-podcast-archiver

Generate an RSS feed containing all of the episodes of the Security Now podcast

SecurityNow Podcast Archiver

This script creates an RSS feed containing all the Security Now podcasts.

Security Now is a podcast on computer security. For details see https://www.grc.com/securitynow.htm and https://twit.tv/sn .

Security Now has been running for several years, and there are a few hundred episodes. Much of what is discussed is background security material, i.e. how things work, so it stays interesting even when it is a few years old. Unfortunately, the RSS feed on twit.tv contains only the 10 latest episodes. This script generates an RSS feed containing all, or a large part, of the episodes, for use with your favourite podcast player.

How to use

  • Download this repository and run the generate-snarchive.py script.
  • Put the generated snarchive.xml file in your Dropbox/Google Drive/other cloud drive.
  • Create a public link to the file.
  • Paste the link in your podcast player as RSS feed.

Details

Download this repository and run the generate-snarchive.py script. It requires a Python 3 interpreter and the following modules, which can be installed with e.g. pip3 install --user <modulename>:

  • tzlocal
  • bs4 (BeautifulSoup 4)

Run the script with python3 generate-snarchive.py.

This will scrape the pages at grc.com to find all episodes, and generate an RSS feed file named snarchive.xml.
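As an illustration of what the generated file contains, here is a minimal sketch of building an RSS 2.0 feed with one enclosure per episode. It is not the code from generate-snarchive.py: the function name build_feed, the episode dictionary layout, the channel title, and the example.com URL are all assumptions made for this example, and it uses only the standard library.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(episodes):
    """Return an RSS 2.0 document (as a str) for a list of episode dicts.

    Each episode is assumed to look like:
    {"title": str, "url": str, "date": datetime}
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Security Now (archive)"
    ET.SubElement(channel, "link").text = "https://www.grc.com/securitynow.htm"
    ET.SubElement(channel, "description").text = "All episodes of Security Now"

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["date"])
        # The enclosure is what a podcast player actually downloads.
        # length="0" is a placeholder; the real file size is unknown here.
        ET.SubElement(item, "enclosure", url=ep["url"], length="0",
                      type="audio/mpeg")

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Placeholder episode data, not real scraped values.
    sample = [{
        "title": "Episode 1",
        "url": "https://example.com/sn0001.mp3",
        "date": datetime(2005, 1, 1, tzinfo=timezone.utc),
    }]
    with open("snarchive-sample.xml", "w", encoding="utf-8") as fh:
        fh.write(build_feed(sample))
```

A podcast player reads the enclosure element of each item to find the MP3 file, so that element is the part that has to be right.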

Put the file somewhere online. Use your own web server or an online storage service such as Dropbox or Google Drive.

Create a public link to the file. In the case of Dropbox, the link will end with dl=0; change that to dl=1, otherwise it won't work. If you paste the link into your browser, you should see only the content of the snarchive.xml file, not a login screen or a nice-looking page generated by Dropbox, Google, or whatever. Your browser may recognize that it is an RSS feed and offer to subscribe to it.
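The dl=0 to dl=1 change and the "do I really get the XML file?" check can be sketched in a few lines of Python. This is only an illustration: the helper names force_direct_download and looks_like_rss are made up for this example, and the Dropbox link shown is a placeholder, not a real share link.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_direct_download(link):
    """Return the share link with its dl parameter set to 1."""
    parts = urlsplit(link)
    # Keep every query parameter except an existing dl=..., then add dl=1.
    query = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "dl"]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def looks_like_rss(payload):
    """Return True if the downloaded bytes parse as XML with an <rss> root.

    A login screen or preview page is HTML, so it either fails to parse
    as XML or has a different root element.
    """
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False
    return root.tag == "rss"


if __name__ == "__main__":
    # Placeholder link; substitute the one your storage service gives you.
    print(force_direct_download(
        "https://www.dropbox.com/s/abc123/snarchive.xml?dl=0"))
```

To test a real link, download it (for example with urllib.request.urlopen) and pass the bytes to looks_like_rss before adding the link to your podcast player.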

Paste the link in your favourite podcast player as RSS feed, and you will now be able to easily listen to all the old Security Now podcasts.

How it works

This script scrapes the HTML of the pages at www.grc.com/securitynow.htm, and www.grc.com/sn/past/2005.htm, .../2006.htm, etc. That means that if the layout of those pages changes, this script will break. The actual download links for the MP3 files are at twit.tv, which is the publisher. If those links change, this script will also break.
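The general idea can be sketched as follows. This is not the script's actual code: the real script uses BeautifulSoup, while this sketch uses the standard library's html.parser to stay self-contained. The helper names, the assumption that episode links are plain anchors ending in .mp3, and the sample HTML are all invented for illustration; the real grc.com markup may differ.

```python
from html.parser import HTMLParser

BASE = "https://www.grc.com"


def archive_page_urls(first_year, last_year):
    """List the current episode page plus one past page per year."""
    urls = [f"{BASE}/securitynow.htm"]
    urls += [f"{BASE}/sn/past/{year}.htm"
             for year in range(first_year, last_year + 1)]
    return urls


class Mp3LinkCollector(HTMLParser):
    """Collect the href of every <a> tag that points at an .mp3 file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().endswith(".mp3"):
            self.links.append(href)


def find_mp3_links(html):
    """Return all .mp3 links found in an HTML string, in document order."""
    collector = Mp3LinkCollector()
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    # Invented sample markup, standing in for a downloaded page.
    sample = ('<a href="https://example.com/sn0001.mp3">Episode 1</a>'
              '<a href="/about.htm">About</a>')
    print(archive_page_urls(2005, 2006))
    print(find_mp3_links(sample))
```

Because the extraction depends on how the links appear in the page, any change to the markup or to the link format means the matching rule has to be updated, which is the fragility described above.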

(There is actually an API for twit.tv, so it is possible to get at the needed data without scraping web pages, but that API was a lot more trouble to use than just scraping. Also it would require an account, which is pretty stupid for getting information that is already public. Fortunately the grc.com pages are pretty simple to parse.)

@Leo: Please add archive feeds to twit.tv!

Languages

Language:Python 100.0%