This is a tiny utility I wrote to keep a backup copy of every episode of my favorite shows before they disappear from the internet. Initially, it only supported S3-compatible backends, but it was later extended to also store podcasts on the local filesystem and in Google Cloud Storage.
```sh
./podcast-archiver --config config-file.yaml
```
Podcast Archiver is configured with a configuration file written in YAML. It specifies which feeds should be downloaded, into what folder, and where everything should be stored (the "sink").
The example below would download all episodes it can find in the Changelog master feed and store them in `./data/changelog-master`:
```yaml
sink:
- filesystem_folder: "./data"

feeds:
- folder: "changelog-master"
  url: "https://changelog.com/master/feed"
```
Each feed also supports an optional field named `filename_template`. With this field you set a template through which the final filename will be generated. The template is parsed using Go's text/template package; an example follows the function list below.
Within the template you can use the following properties:
`.Feed`
: gofeed.Feed

`.Item`
: gofeed.Item

`.Enclosure`
: gofeed.Enclosure
Additionally, the following functions are available:
- `fileName(v string)`
- `slugify(v string)`
- `formatDate(ts time.Time, format string)`
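As a sketch of how this might look (the exact template behavior is an assumption on my part, based on the property and function lists above), the following would name each file after the slugified episode title:

```yaml
feeds:
- folder: "changelog-master"
  url: "https://changelog.com/master/feed"
  # Hypothetical template: slugify the episode title and append ".mp3".
  # .Item.Title is the episode title as exposed by gofeed.Item.
  filename_template: "{{ slugify .Item.Title }}.mp3"
```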
The following fields are available for the sink:
- `google_project_id`
- `bucket`
- `filesystem_folder`
- `access_key_id`
- `access_key_secret`
- `region`
If you want to upload the podcasts to a Google Cloud Storage bucket, you'd need to set the following fields:
- `google_project_id`
- `bucket`
The credentials are taken from the environment using Google Application Credentials, as documented by Google.
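A minimal sink configuration for this case could look like the following sketch (project ID and bucket name are placeholders):

```yaml
sink:
- google_project_id: "my-gcp-project"  # placeholder: your GCP project ID
  bucket: "my-podcast-archive"         # placeholder: your GCS bucket name
```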
If you want to use an S3-compatible backend instead, the following fields must be set:

- `bucket`
- `access_key_id`
- `access_key_secret`
- `region`
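Again as a sketch, with all values as placeholders:

```yaml
sink:
- bucket: "my-podcast-archive"  # placeholder: your bucket name
  access_key_id: "AKIA..."      # placeholder: your access key ID
  access_key_secret: "..."      # placeholder: your secret access key
  region: "eu-central-1"        # placeholder: your bucket's region
```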
If you want to get notified whenever a new episode has been archived, you can set the following environment variables to send a message to a pre-defined Matrix room:
`MATRIX_HOMESERVER`
: URL of the homeserver the user is registered at

`MATRIX_USERNAME`
: Username to log in with

`MATRIX_PASSWORD`
: Password to log in with

`MATRIX_ROOM`
: The complete room name (`!...:...`)
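For example (all values are placeholders):

```sh
export MATRIX_HOMESERVER="https://matrix.example.com"
export MATRIX_USERNAME="podcast-bot"
export MATRIX_PASSWORD="secret"
export MATRIX_ROOM="!abcdefgh:example.com"

./podcast-archiver --config config-file.yaml
```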