AhmadMuzakkir / go-redis-elasticsearch-example

An example of using Redis as a message queue to index into ElasticSearch, using Go

Redis and ElasticSearch with Go

This project shows an example of how to design a small analytics system.

Below is the system description:

Design a basic analytics backend that can store and query hits. A hit is an interaction that results in data being sent to the backend.

A hit is identified by a unique ID of type string. 

The system should provide two APIs:
1. Track (store) a hit.
2. Retrieve a hit's counts for the last 5 minutes, 1 hour, 1 day, 2 days and 3 days.

Refer to the endpoints documentation below for more information.

System Design

In this section, we will explain how the system is designed.

We make some assumptions and estimations about the system.

  • The system is very write heavy, with a read-to-write ratio of 1:100.
  • A few minutes of write-to-read delay is acceptable.
  • The system must be able to handle at least 10,000 hits per second.
  • The Track API must be very fast: the expected 90th percentile latency must be below 100ms.
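
The assumptions above can be sanity-checked with a quick back-of-envelope estimate. The sketch below assumes roughly 100 bytes per indexed hit (an assumed figure, not measured from the project) to estimate the storage needed for the 3-day query window.

```go
package main

import "fmt"

// estimateGB returns an approximate storage footprint in gigabytes for
// the given write rate, per-hit size, and retention period in days.
func estimateGB(hitsPerSec, bytesPerHit, days int) float64 {
	return float64(hitsPerSec) * 86_400 * float64(days) * float64(bytesPerHit) / 1e9
}

func main() {
	// 10,000 hits/sec for 3 days at ~100 bytes per hit.
	fmt.Printf("approx. storage for 3 days: %.1f GB\n", estimateGB(10_000, 100, 3))
}
```

At the stated 10,000 hits per second, the system accumulates on the order of a few hundred gigabytes over the 3-day window, which is well within what a small ElasticSearch cluster can handle.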

Functional Requirements

  • The system has to track hits for an ID.

  • The system has to return hit counts for an ID.

Non-Functional Requirements

  • Scalable: the system should handle high throughput, i.e. thousands of hits per second. It should also handle spiky traffic.
  • Availability: the system should be fault tolerant. It's acceptable if the data returned is stale.
  • Eventual Consistency: we choose eventual consistency since the data does not require strong consistency, and some delay is acceptable.

Data Model

We can either store individual hits or aggregate hits in real time.

We choose to store individual hits since it's the simplest and most flexible solution. The trade-off is that reads may be slower, and more storage is consumed.

Below is the hit model:

{
  "id": "string",
  "timestamp": "string"
}

The query that will be performed is an aggregation based on time ranges.
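
To illustrate the shape of such a query, here is a sketch of an Elasticsearch query body that counts hits for one ID newer than a relative time such as "now-5m". This shows the general pattern (a bool filter combining a term match and a range on the timestamp), not the project's exact query.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildCountQuery assembles an Elasticsearch query body that matches hits
// for the given ID with a timestamp newer than `since` (e.g. "now-5m").
func buildCountQuery(id, since string) map[string]interface{} {
	return map[string]interface{}{
		"query": map[string]interface{}{
			"bool": map[string]interface{}{
				"filter": []interface{}{
					map[string]interface{}{"term": map[string]interface{}{"id": id}},
					map[string]interface{}{"range": map[string]interface{}{
						"timestamp": map[string]interface{}{"gte": since},
					}},
				},
			},
		},
	}
}

func main() {
	b, _ := json.Marshal(buildCountQuery("1", "now-5m"))
	fmt.Println(string(b))
}
```

Running one such query per time range (5 minutes, 1 hour, 1 day, 2 days, 3 days) yields the counts for the Retrieve endpoint.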

High Level Details

Based on the information above, below is the system design details:

  • We use ElasticSearch to store the hits. ElasticSearch has high scalability built in and supports complex aggregation queries based on time ranges.
  • The message queue is implemented using Redis. Protobuf is used to serialize the hits.
  • The API server will receive the HTTP requests and insert the hits into the message queue.
  • The Indexer will consume the hits (in batches) and index them into ElasticSearch (using Bulk insert).

Endpoints

Track - POST /analytics

Track a hit.

Request

{
  "id": "id"
}

Response

No Content

Retrieve - GET /analytics/{id}

Retrieve hit counts for a hit ID.

Response

{
  "id": "1",
  "counts": [
    {
      "reference": "3 days ago",
      "count": 3
    },
    {
      "reference": "2 days ago",
      "count": 3
    },
    {
      "reference": "1 day ago",
      "count": 3
    },
    {
      "reference": "1 hour ago",
      "count": 1
    },
    {
      "reference": "5 minutes ago",
      "count": 1
    }
  ]
}

How to run

Prerequisites:

  • Docker Compose 3.7+
  • Docker Engine 18.06.0+

# build the docker image
[terminal 1] bash scripts/build.docker.sh

# run docker compose
[terminal 1] docker-compose up --build

Assuming the default configuration, the endpoints documented above are now available.

To clean up, you can do:

[terminal 1] docker-compose down -v

Using Makefile

To simplify things, you can use the Makefile.

Build the docker image and run the docker compose.

[terminal 1] make run-docker

Stop and clean up the docker compose.

[terminal 1] make clean-docker

Running all the tests

Prerequisites:

  • Docker Compose 3.7+
  • Docker Engine 18.06.0+

Some of the tests require an ElasticSearch server. It's expected that the server is reachable at http://127.0.0.1:9200/.

To make this easy, there's a provided script that will start an ElasticSearch server in a Docker container and run the tests.

[terminal 1] bash run.tests.sh

Benchmark Tool

You can spam Track requests using the benchmark tool in the cmd/benchmark directory.

The tool will spam Track requests as fast as possible, using a number of workers.

The default number of workers is 1, and you can set it using the -w flag or an environment variable.

The default hit ID is 1, and you can set it using the -i flag or an environment variable.

go run cmd/benchmark/main.go -w 1 -i hitid
