josephpconley / canned-news

News and blogs packaged fresh each morning.

Home Page: https://hot.canned.news

Canned News

This project automatically creates a daily email newsletter from new activity in the RSS feeds you follow. Here's an example of an email generated and sent by this project.

Why?

I like RSS feeds and the Feedbin Reader, but I also like getting a daily email summary of my news and blogs. I looked around a bit and didn't find a solution that would send me the latest posts from each of my tags every day, so I set this project up.

Features

  • Uses Feedbin to collect and store articles in a Postgres database.
  • Resolves the final URL for each article.
  • Summarizes article content using Aylien's text analysis API.
  • Gets the number of social media interactions using the SharedCount API.
  • Gets an image for each entry.
  • Sends a daily email to you using the email from your Feedbin account.

Development

Requirements

This project relies on a lot of external services. This allows it to do a lot with very little code, but it's also a decent amount of work to get set up. At some point I'd like to make this easier, but here it is for now:

  • Feedbin - The whole project relies on you having a paid Feedbin account
  • Sendgrid - Sends emails
  • Amazon S3 - Hosts static HTML versions of your emails
  • Aylien - Text analysis
  • SharedCount - Aggregates social data about posts
  • Node/npm - Setup scripts are run as npm scripts
  • Docker - Used for local development and deployment
  • Hyper.sh - Container hosting and cron jobs

Architecture

This project includes several Node microservices:

  • Collector (/collector) - Responsible for collecting new entries from Feedbin every hour and updating the list of feeds for each of your tags.
  • URL Resolver (/url_resolver) - Resolves any URL redirects from entries for improved extraction.
  • Summarizer (/summarizer) - Summarizes the article content.
  • Image Extractor (/image_extractor) - Extracts the dominant image URL for each article.
  • Socializer (/socializer) - Grabs data from SharedCount about social activity on each article.
  • Mailer (/mailer) - Sends an email based on the preferences set for your newsletters in the tags table.
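
As an illustration of what the URL Resolver step involves, here is a minimal sketch of following redirects by hand. It is not the code in /url_resolver: the names `nextLocation` and `resolveFinalUrl` are hypothetical, and it assumes Node 18+ for the global `fetch`, where `redirect: "manual"` exposes the 3xx status and `Location` header.

```javascript
// Hypothetical helpers, not taken from this repository.
const REDIRECT_STATUSES = new Set([301, 302, 303, 307, 308]);

// Return the next URL to visit, or null if the response is not a redirect.
function nextLocation(currentUrl, status, locationHeader) {
  if (!REDIRECT_STATUSES.has(status) || !locationHeader) return null;
  // Location may be relative, so resolve it against the current URL.
  return new URL(locationHeader, currentUrl).toString();
}

// Follow redirects one hop at a time (at most maxHops) and return the
// last URL reached. fetchImpl is injectable so the loop can be tested.
async function resolveFinalUrl(url, fetchImpl = globalThis.fetch, maxHops = 10) {
  let current = url;
  for (let hop = 0; hop < maxHops; hop++) {
    const res = await fetchImpl(current, { method: "HEAD", redirect: "manual" });
    const next = nextLocation(current, res.status, res.headers.get("location"));
    if (next === null) break;
    current = next;
  }
  return current;
}
```

Capping the number of hops guards against redirect loops; some servers reject HEAD requests, so a real resolver may need to fall back to GET.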

You can find bash scripts for Docker in the /docker folder and a database schema file in the /database directory. There is also a /shared directory for shared Node scripts.

Local Setup

Prerequisites

Coming soon!

Creating a Newsletter

Newsletters are currently created manually. You have to add the feeds to Feedbin, create a new list segment in Mailchimp, and add the newsletter to the database. Eventually I'll get around to automating this process, but for now, here's how it works.

1. Add the RSS feed to Feedbin

Most major news outlets and blogs offer RSS feeds.

Add the feeds you want to Feedbin and tag them. This tag name will become the name of the Newsletter.

2. Create a Segment in Mailchimp

All subscribers are currently stored in a single list with a segment for each newsletter. Create a new segment for this Newsletter and give it the same name as the tag on Feedbin.

3. Add the Newsletter to your Database

Next, the application's database needs to know about the new tag. Create a new Newsletter in the database, and be sure to include the segment ID in the database entry.

Next time you run the collector, articles should be retrieved for the new Newsletter.
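
For step 3, a parameterized insert might look like the sketch below. The column names (`name`, `segment_id`) and the function `buildNewsletterInsert` are assumptions made for illustration; check the schema file in /database for the real ones. The returned object uses the `{ text, values }` query shape accepted by node-postgres.

```javascript
// Hypothetical sketch: the columns "name" and "segment_id" are assumed,
// not confirmed by this README. See /database for the actual schema.
function buildNewsletterInsert(tagName, segmentId) {
  if (!tagName || !segmentId) {
    throw new Error("Both a tag name and a segment ID are required");
  }
  return {
    // Placeholders keep user-supplied values out of the SQL text.
    text: "INSERT INTO tags (name, segment_id) VALUES ($1, $2)",
    values: [tagName, segmentId],
  };
}
```

With a node-postgres client you would pass the result straight to `client.query(...)`.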

Testing

Coming soon!

Deployment

Since this project is changing frequently, I haven't automated deployments yet. There are some scripts that make it slightly easier though.

  • After making updates, build the latest image for all the containers: npm run app:build.
  • Push them to Docker Hub: npm run app:push.
  • Create a .env.prod file for each service.
  • Bring up the database container manually (only needs to be done the first time): npm run db:prod:up.
  • Run any SQL files in the /database directory.
  • Run the deployer: npm run app:deploy.
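
Because a missing .env.prod file is easy to overlook, a small pre-deploy check can help. This is a sketch rather than part of the project's scripts: `missingEnvFiles` is a hypothetical name, and the directory list is taken from the service folders named in the Architecture section.

```javascript
const fs = require("fs");
const path = require("path");

// Service directories as listed in the Architecture section.
const SERVICE_DIRS = [
  "collector",
  "url_resolver",
  "summarizer",
  "image_extractor",
  "socializer",
  "mailer",
];

// Return the .env.prod paths that do not exist under rootDir.
// `exists` is injectable so the check can be tested without touching disk.
function missingEnvFiles(rootDir, exists = fs.existsSync) {
  return SERVICE_DIRS
    .map((dir) => path.join(rootDir, dir, ".env.prod"))
    .filter((file) => !exists(file));
}
```

Calling `missingEnvFiles(process.cwd())` from the repository root returns an empty array when every service has its file.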

You can then manually run each of the services to test them out:

  • Collector: npm run collector:prod:run
  • URL Resolver: npm run url-resolver:prod:run
  • Summarizer: npm run summarizer:prod:run
  • Image Extractor: npm run image-extractor:prod:run
  • Socializer: npm run socializer:prod:run
  • Mailer: npm run mailer:prod:run

Once you verify they're working, you should use a cron job to automatically run these commands.

Run the collector every hour, getting all entries from Feedbin collected within the past 2 hours:

hyper cron create --minute=5 --hour=* --name fbm-collector-cron --env-file $(pwd)/collector/.env.prod --link fbm-postgres-1:postgres karllhughes/fbm-collector
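
To show how the 2-hour window could map to a request, here is a sketch that builds an entries URL using the `since` parameter of the Feedbin API v2 (an ISO 8601 timestamp). The function name `buildEntriesUrl` is hypothetical and the real collector may build its requests differently; authentication (HTTP Basic with your Feedbin credentials) is left out.

```javascript
// Hypothetical sketch: build a Feedbin entries URL covering the last
// `hours` hours. The 2-hour default mirrors the window described above.
function buildEntriesUrl(now = new Date(), hours = 2) {
  const since = new Date(now.getTime() - hours * 60 * 60 * 1000);
  const url = new URL("https://api.feedbin.com/v2/entries.json");
  // searchParams handles percent-encoding of the timestamp.
  url.searchParams.set("since", since.toISOString());
  return url.toString();
}
```

Running hourly with a 2-hour window means consecutive runs overlap, so an entry missed by one run can still be picked up by the next; the collector then needs to tolerate seeing the same entry twice.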

Run the URL Resolver every hour, a few minutes after the collector, getting all posts published in the past 25 hours:

hyper cron create --minute=15 --hour=* --name fbm-url-resolver-cron --env-file $(pwd)/url_resolver/.env.prod --link fbm-postgres-1:postgres karllhughes/fbm-url-resolver

Run the Summarizer every hour, a few minutes after the URL Resolver, getting all posts published in the past 25 hours:

hyper cron create --minute=25 --hour=* --name fbm-summarizer-cron --env-file $(pwd)/summarizer/.env.prod --link fbm-postgres-1:postgres karllhughes/fbm-summarizer

Run the Image Extractor every hour, a few minutes after the Summarizer, getting all posts published in the past 25 hours:

hyper cron create --minute=35 --hour=* --name fbm-image-extractor-cron --env-file $(pwd)/image_extractor/.env.prod --link fbm-postgres-1:postgres karllhughes/fbm-image-extractor

Run the Socializer every day at 8:45 UTC:

hyper cron create --minute=45 --hour=8 --name fbm-socializer-cron --env-file $(pwd)/socializer/.env.prod --link fbm-postgres-1:postgres karllhughes/fbm-socializer

Finally, the mailer should be run every day at 9:00 UTC:

hyper cron create --minute=0 --hour=9 --name fbm-mailer-cron --env-file $(pwd)/mailer/.env.prod --link fbm-postgres-1:postgres karllhughes/fbm-mailer
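
The six commands above differ only in schedule and service name, so they can be generated from a small table. The sketch below is one way to do that; `SCHEDULE` and `cronCommand` are hypothetical names, and it uses only the flags that already appear in the commands above.

```javascript
// Schedule copied from the commands above. `dir` is the service folder,
// which uses underscores where the image name uses hyphens.
const SCHEDULE = [
  { service: "collector", dir: "collector", minute: 5, hour: "*" },
  { service: "url-resolver", dir: "url_resolver", minute: 15, hour: "*" },
  { service: "summarizer", dir: "summarizer", minute: 25, hour: "*" },
  { service: "image-extractor", dir: "image_extractor", minute: 35, hour: "*" },
  { service: "socializer", dir: "socializer", minute: 45, hour: 8 },
  { service: "mailer", dir: "mailer", minute: 0, hour: 9 },
];

// Build the `hyper cron create` command line for one job.
function cronCommand({ service, dir, minute, hour }) {
  return [
    "hyper cron create",
    `--minute=${minute}`,
    `--hour=${hour}`,
    `--name fbm-${service}-cron`,
    `--env-file $(pwd)/${dir}/.env.prod`,
    "--link fbm-postgres-1:postgres",
    `karllhughes/fbm-${service}`,
  ].join(" ");
}

// Print all six commands so they can be reviewed before running them.
for (const job of SCHEDULE) {
  console.log(cronCommand(job));
}
```

Keeping the schedule in one place makes the ordering explicit: each hourly stage runs ten minutes after the one it depends on, and the Socializer runs fifteen minutes before the Mailer.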

Contributing

While this project is primarily to scratch my own itch, you're welcome to suggest improvements. Just make a PR or create an issue.

License

Copyright 2017, Karl Hughes

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
