MustansirZia / serverless-link-preview

A serverless, scalable service to get website description and preview deployed on AWS Lambda.

Home Page:https://mustansirzia.com/posts/link-preview/

serverless-link-preview.

MIT Licence

A serverless, scalable website preview service built using Node.js, Express.js, memory-cache and deployed using Up.


This repository is a follow-up to an article that I wrote here.

It is a RESTful API service (a microservice) that takes in a website URL and replies with its title, description, a thumbnail preview of the first image found on the website and the site name. Scraping is done using @nunkisoftware/link-preview. It is serverless and runs on AWS Lambda as Function as a Service (FaaS). Since there are no servers or other hardware to manage, it can scale to mammoth proportions, as Amazon will automatically deploy copies of our exported functions depending on the load.

Installation.

• Install Up globally.
$ npm i -g up

• Then, you have two choices. Either clone this repo, install the local dependencies and skip to the very last step.
$ git clone https://github.com/MustansirZia/serverless-link-preview

$ cd serverless-link-preview
$ npm i


OR follow along,

• First, initialise the project yourself by creating these files.
$ touch package.json up.json app.js

• Then, add a few local packages.
$ npm i express memory-cache cors @nunkisoftware/link-preview --save

• Add a scripts section to your package.json so Up knows how to start your express server.

{
  "name": "serverless-link-preview",
  "version": "0.0.1",
  "description": "Serverless service to get website description and preview deployed on AWS Lambda.",
  "main": "app.js",
  "license": "MIT",
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {
    "@nunkisoftware/link-preview": "^0.2.0",
    "cors": "^2.8.4",
    "express": "^4.16.2",
    "memory-cache": "^0.2.0"
  }
}

• Write an Express server inside app.js with a single GET endpoint at / that takes a query param url. This is the website URL whose preview we require.

const express = require('express');
const linkPreview = require('@nunkisoftware/link-preview');
const mCache = require('memory-cache');
const cors = require('cors');

const app = express();

// Enable CORS so browsers can call the service from other origins.
app.use(cors());

// Validation middleware to simply check the url query param.
const validate = function (req, res, next) {
    const url = req.query.url;
    if (!url) {
        res.status(400).json({ message: 'url query param missing.' });
        return;
    }
    next();
};

// Function which returns an in memory cache middleware.
const cache = function (duration) {
    return function (req, res, next) {
        const key = req.query.url;

        // Try to get cached response using url param as key.
        const cachedResponse = mCache.get(key);

        if (cachedResponse) {

            // Send cached response.
            res.json(cachedResponse);
            return;

        }

        // If cached response not present,
        // pass the request to the actual handler.
        res.originalJSON = res.json;
        res.json = function (result) {

            // Cache only successful responses for later use, so an
            // error body is never replayed with a 200 status,
            // then send the result to the client.
            if (res.statusCode === 200) {
                mCache.put(key, result, duration * 1000);
            }
            res.originalJSON(result);

        };
        next();
    };
};

// Actual get handler with cache set to 3 minutes.
app.get('/', validate, cache(180), function (req, res) {
    const url = req.query.url;

    // Get the actual response from link-preview.
    linkPreview(url)
          .then(function (response) {

              if (!response.title) {
                  // If the url given is incorrect.
                  res.status(400).json({ message: 'Invalid URL given.' });
                  return;
              }

              res.json(response);
          })
          .catch(function (err) {
              res.status(500).send('Internal Server Error.');
          });
});

// Listen on the port provided by Up.
app.listen(process.env.PORT || 3000);
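
The validate middleware above only checks that the url param is present. As a minimal sketch of a stricter check, the helper below accepts only absolute http(s) URLs by parsing the value with the URL class built into Node.js. The names isHttpUrl and validateStrict are hypothetical and not part of the original app.js.

```javascript
// Hypothetical helper (not in the original app.js): accept only
// absolute http(s) URLs.
function isHttpUrl(value) {
    // A repeated query param arrives as an array, so reject non-strings.
    if (typeof value !== 'string') {
        return false;
    }
    try {
        const parsed = new URL(value);
        return parsed.protocol === 'http:' || parsed.protocol === 'https:';
    } catch (err) {
        // The URL constructor throws on input it cannot parse.
        return false;
    }
}

// Hypothetical drop-in variant of the validate middleware.
const validateStrict = function (req, res, next) {
    if (!isHttpUrl(req.query.url)) {
        res.status(400).json({ message: 'url query param missing or not an http(s) URL.' });
        return;
    }
    next();
};
```

Under this assumption, validateStrict could take the place of validate in the app.get call, so malformed input is rejected before link-preview is queried.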

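Note that we also employ an in-memory cache to store recent website previews, so repeated requests for the same URL within the cache window are served from memory instead of querying link-preview every time (which is expensive in time and resources). The cache lives in the memory of each running instance, so it is not shared between instances and is lost whenever an instance is recycled.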
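
To make the caching idea concrete, here is a minimal sketch of a time-limited (TTL) cache built on a plain Map. It only illustrates the put/get-with-expiry pattern that the cache middleware relies on; memory-cache itself may handle expiry differently. The name createTtlCache and the injectable now clock are hypothetical.

```javascript
// Minimal TTL cache sketch. createTtlCache and the `now` parameter
// are hypothetical names used for illustration only.
function createTtlCache(now = Date.now) {
    const entries = new Map();
    return {
        // Store a value together with the time at which it expires.
        put(key, value, ttlMs) {
            entries.set(key, { value: value, expiresAt: now() + ttlMs });
        },
        // Return the value, or null if the key is missing or expired.
        get(key) {
            const entry = entries.get(key);
            if (!entry) {
                return null;
            }
            if (now() >= entry.expiresAt) {
                entries.delete(key);
                return null;
            }
            return entry.value;
        },
    };
}

// Usage: keep a preview for 3 minutes, keyed by the requested URL.
const previews = createTtlCache();
previews.put('https://example.com', { title: 'Example' }, 180 * 1000);
const cached = previews.get('https://example.com');
```

Expired entries are removed lazily here, on the next get for that key, which keeps the sketch free of timers.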

The following two steps can also be accomplished using environment variables, but a separate file is much cleaner and reduces our deployment to a single command, up.

• Add a single entry to our up.json so Up knows where and how to find our AWS credentials.

{
  "profile": "aws"
}


This is a one-time step and won't be required for subsequent Up deployments.

• Finally, create the AWS credentials file at ~/.aws/credentials and fill in your IAM credentials.

$ mkdir -p ~/.aws && touch ~/.aws/credentials
Open the file with $ gedit ~/.aws/credentials or $ nano ~/.aws/credentials and paste the following in.

Replace $YOUR_ACCESS_ID and $YOUR_ACCESS_KEY with your own. Find them from here. It could be beneficial to create a new IAM user just for this purpose.

[aws]
aws_access_key_id = $YOUR_ACCESS_ID
aws_secret_access_key = $YOUR_ACCESS_KEY

Save the file and that's it.

To verify our installation, run npm start from the directory that houses our app.js.
From another terminal window, request our service like so (the quotes stop the shell from interpreting the ? characters).
$ curl 'localhost:3000?url=https://www.youtube.com/watch?v=NUWViXhvW3k'
You should see a familiar JSON response, which verifies our installation.

{
  "url": "https://www.youtube.com/watch?v=NUWViXhvW3k",
  "image": "https://i.ytimg.com/vi/NUWViXhvW3k/maxresdefault.jpg",
  "imageWidth": null,
  "imageHeight": null,
  "imageType": null,
  "title": "Building the CLEANEST Desk Setup!!!",
  "description": "My setup tour: https://goo.gl/nv0nja ADD ME ON SNAPCHAT TO STAY UP TO DATE WITH MY SETUP PROGRESS: Snapchat: Kenneth.YT or KDKHD SNAP CODE: http://kennethkre...",
  "siteName": "YouTube"
}

Inside the same directory, deploy the service with a single command.

$ up



After the deployment is complete, get the service's URL like so.
$ up url

The URL with the query param could look similar to this.
https://hfnuua77fd.execute-api.us-west-2.amazonaws.com/development?url=https://www.youtube.com/watch?v=NUWViXhvW3k

And there you have it, your own serverless and scalable website preview service built and deployed on AWS Lambda.
Query it with your favourite HTTP client from inside any application.
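
As one possible client, the sketch below requests a preview with fetch, which is available globally in browsers and in Node.js 18 or later. The helper name getPreview and its fetchImpl parameter are hypothetical. The target URL is passed through encodeURIComponent so that any & or # it contains stays part of the url param instead of being read as part of the service's own query string.

```javascript
// Hypothetical client helper. serviceUrl is whatever `up url` printed
// (or http://localhost:3000/ when running locally).
async function getPreview(serviceUrl, targetUrl, fetchImpl = fetch) {
    const requestUrl = serviceUrl + '?url=' + encodeURIComponent(targetUrl);
    const response = await fetchImpl(requestUrl);
    if (!response.ok) {
        // The service answers 400 with a JSON message and 500 with
        // plain text, so read the body as text to cover both cases.
        const body = await response.text();
        throw new Error('Preview request failed (' + response.status + '): ' + body);
    }
    return response.json();
}

// Usage, with the example deployment URL from above:
// getPreview(
//     'https://hfnuua77fd.execute-api.us-west-2.amazonaws.com/development',
//     'https://www.youtube.com/watch?v=NUWViXhvW3k'
// ).then(console.log).catch(console.error);
```

The fetchImpl parameter exists only so that another fetch-compatible function can be supplied, for example in tests or on older Node.js versions.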

Further Reading.

• Documentation for Up.

License.

MIT.

Languages

Language:JavaScript 100.0%