amrael / gglsbl-rest

Dockerized REST service to look up URLs in Google Safe Browsing v4 API

gglsbl-rest

This repository implements a Dockerized REST service to look up URLs in the Google Safe Browsing v4 API. It is based on gglsbl and uses Flask and gunicorn.

Basic Design

The main challenge with running gglsbl in a REST service is that updating the local sqlite database takes several minutes. The sqlite database is also locked during writes, so with a single sqlite file each update would either cause very noticeable downtime for lookups or run into lock contention that delays the update.

Instead, gglsbl-rest keeps two sqlite databases: while one is being used by the REST service, the other is updated regularly by a cron job. Once the update of the secondary sqlite file is done, the REST service starts using it for any new requests.
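The two-database scheme can be sketched roughly as follows. This is an illustration only, not the project's actual code: the `active` pointer file, the helper names, and the `refresh` callback are all assumptions made for the example. Only the `gsb_v4.a.db` / `gsb_v4.b.db` naming is taken from the status output shown later in this document.

```python
import os
import tempfile

# Hypothetical layout: two database files plus a small pointer file
# ("active") naming the one the REST workers should read from.
NAMES = ("a", "b")


def db_path(directory, name):
    """Return the path of database 'a' or 'b' inside the given directory."""
    return os.path.join(directory, "gsb_v4.%s.db" % name)


def read_active(directory):
    """Return the name of the active database, defaulting to 'a'."""
    try:
        with open(os.path.join(directory, "active")) as fh:
            name = fh.read().strip()
    except FileNotFoundError:
        return NAMES[0]
    return name if name in NAMES else NAMES[0]


def write_active(directory, name):
    """Replace the pointer file atomically so readers never see a partial write."""
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        fh.write(name)
    os.replace(tmp, os.path.join(directory, "active"))


def update_and_switch(directory, refresh):
    """Refresh the inactive database, then make it the active one."""
    current = read_active(directory)
    standby = NAMES[1] if current == NAMES[0] else NAMES[0]
    refresh(db_path(directory, standby))  # slow step; the active file is untouched
    write_active(directory, standby)
    return standby
```

In this sketch a periodic job would call `update_and_switch` with a function that runs the gglsbl update against the given path, and each request handler would call `read_active` to decide which file to open.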

The current implementation does not use volumes to store the sqlite files, but it could very easily be made to do so. In my experience, both locally on my laptop and on AWS ECS, using a volume did not significantly improve performance, but YMMV.

Environment Variables

The configuration of the REST service can be done using the following environment variables:

  • GSB_API_KEY is required and should contain your Google Safe Browsing v4 API key;

  • WORKERS controls how many gunicorn workers to instantiate. Defaults to twice the number of detected cores plus one.
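As a rough illustration of the WORKERS default described above, the following sketch computes the worker count in Python. The function name and its parameters are hypothetical; the source does not show how the project itself derives the value.

```python
import multiprocessing
import os


def worker_count(environ=os.environ, cores=None):
    """Return the WORKERS value, or 2 * cores + 1 when it is unset or empty."""
    if cores is None:
        cores = multiprocessing.cpu_count()
    value = environ.get("WORKERS")
    return int(value) if value else 2 * cores + 1
```

On a 4-core machine with WORKERS unset, this gives 9 workers.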

Building and Running

Clone this repository to your machine and make it the current directory. Then, build the docker image with the following command:

docker build -t gglsbl-rest .

Then, you can run a new container based on that image by executing, for example:

docker run -i -p 127.0.0.1:5000:5000 -e GSB_API_KEY=<your API key> gglsbl-rest

This will cause the service to listen on port 5000 of the host machine, bound to 127.0.0.1. Note that all docker run options must come before the image name; anything after it is passed to the container as its command. When the service first starts, it downloads a new local partial hash database from scratch before starting the REST service, so it might take several minutes to become available. By starting it in interactive mode you can read the log output to see when the gunicorn processes start.
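Since startup can take several minutes, a script that depends on the service may want to wait for it. Here is a minimal sketch that polls the /gglsbl/v1/status URL described later in this document. It assumes that URL answers with a 200 status once the service is up; the function names, timeout, and polling interval are arbitrary choices for the example.

```python
import time
import urllib.request

STATUS_URL = "http://127.0.0.1:5000/gglsbl/v1/status"


def is_up(url=STATUS_URL):
    """Return True if the status URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until_ready(probe=is_up, timeout=1800, interval=10,
                     sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it succeeds or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_ready()` returns True as soon as the service answers, or False if it is still unavailable after the timeout.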

Querying the REST Service

The REST service responds to queries for /gglsbl/v1/lookup/<URL>. Make sure you percent-encode the URL you are querying. If no sign of maliciousness is found, the service returns a 404 status. Otherwise, it returns a 200 response with a JSON body describing the matches.

Here's an example query and response:

$ curl "http://127.0.0.1:5000/gglsbl/v1/lookup/http%3A%2F%2Ftestsafebrowsing.appspot.com%2Fapiv4%2FANY_PLATFORM%2FSOCIAL_ENGINEERING%2FURL%2F"
{
  "matches": [
    {
      "platform": "ANY_PLATFORM",
      "threat": "SOCIAL_ENGINEERING",
      "threat_entry": "URL"
    },
    {
      "platform": "WINDOWS",
      "threat": "SOCIAL_ENGINEERING",
      "threat_entry": "URL"
    },
    {
      "platform": "CHROME",
      "threat": "SOCIAL_ENGINEERING",
      "threat_entry": "URL"
    },
    {
      "platform": "LINUX",
      "threat": "SOCIAL_ENGINEERING",
      "threat_entry": "URL"
    },
    {
      "platform": "ALL_PLATFORMS",
      "threat": "SOCIAL_ENGINEERING",
      "threat_entry": "URL"
    }
  ],
  "url": "http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/SOCIAL_ENGINEERING/URL/"
}
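The same query can be made from Python. The sketch below uses only the standard library; the function names and the base address are assumptions for the example, and treating a 404 as "no matches" follows the behavior described above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE = "http://127.0.0.1:5000"  # assumed address, as in the curl example


def lookup_url(target, base=BASE):
    """Build the lookup endpoint, percent-encoding every reserved character."""
    return base + "/gglsbl/v1/lookup/" + urllib.parse.quote(target, safe="")


def lookup(target, base=BASE):
    """Return the list of matches, or an empty list on a 404 response."""
    try:
        with urllib.request.urlopen(lookup_url(target, base), timeout=10) as resp:
            return json.load(resp)["matches"]
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return []  # the service found no sign of maliciousness
        raise
```

Passing `safe=""` to `quote` matters here: by default it leaves `/` unencoded, which would break the path-based lookup URL.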

There is also a /gglsbl/v1/status URL that you can access to check whether the service is running and to get some indication of how old the current sqlite database is:

$ curl "http://127.0.0.1:5000/gglsbl/v1/status"
{
  "alternatives": [
    {
      "active": true,
      "ctime": "2017-06-05T05:08:29+0000",
      "mtime": "2017-06-05T05:08:29+0000",
      "name": "/root/gglsbl-rest/db/gsb_v4.a.db",
      "size": 1592377344,
      "switch": "a"
    },
    {
      "active": false,
      "ctime": null,
      "mtime": null,
      "name": "/root/gglsbl-rest/db/gsb_v4.b.db",
      "size": null,
      "switch": "b"
    }
  ],
  "environment": "prod"
}
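To turn that status response into a database age, something like the following sketch could be used. It assumes the response has already been parsed into a dict and that timestamps always use the format shown above; the function name is hypothetical.

```python
from datetime import datetime, timezone


def active_db_age(status, now=None):
    """Return the age of the active database as a timedelta, or None."""
    now = now or datetime.now(timezone.utc)
    for alt in status.get("alternatives", []):
        if alt.get("active") and alt.get("mtime"):
            # Timestamps look like 2017-06-05T05:08:29+0000
            mtime = datetime.strptime(alt["mtime"], "%Y-%m-%dT%H:%M:%S%z")
            return now - mtime
    return None
```

A monitoring check could, for instance, alert when the returned age exceeds a few hours, or when the function returns None because no active database with a modification time was reported.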

License:Apache License 2.0