Alongside this file you will find a docker-compose.yml file containing the configuration needed to start a Docker container with everything set up to run this project.
The Docker project opens a connection on port 8000, so you can visit the URL http://localhost:8000.
On this webpage you can find several things:
- A monitoring tool showing, for each server, the view counter (the grow-only counter) that the server has stored for a certain video
- A small web form to send a given number of views to a node
- A button to prepare a test in which each server receives a random number of views
The webpage fetches information from the server every two seconds.
First of all, I researched CRDTs. They work because what is really being built is a lattice structure (strictly, a join-semilattice is enough). In mathematics, a lattice is a partially ordered set in which every two elements have a unique supremum and a unique infimum. For the counter, we define the supremum operation as the maximum of two numbers, applied to each server's entry, which is idempotent, commutative and associative. For this reason the order and repetition of merges do not matter, and every server eventually reaches the total number of views registered individually by each server.
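The three properties can be checked on a toy state. This is a minimal sketch, assuming each state is a plain dictionary from a server id to the views that server registered; the names merge, total, x, y and z are illustrative and not taken from the project code.

```python
from itertools import permutations


def merge(a: dict, b: dict) -> dict:
    """Join of two counter states: the per-server maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def total(state: dict) -> int:
    """Views known to a state: the sum over all server entries."""
    return sum(state.values())


# Three hypothetical states, as three servers might see the network.
x = {"s1": 3, "s2": 0}
y = {"s1": 1, "s2": 4}
z = {"s3": 2}

assert merge(x, x) == x                                # idempotent
assert merge(x, y) == merge(y, x)                      # commutative
assert merge(merge(x, y), z) == merge(x, merge(y, z))  # associative

# Whatever the merge order, the resulting total is the same.
results = {total(merge(merge(a, b), c)) for a, b, c in permutations([x, y, z])}
print(results)  # {9}
```

Because the merge never lowers an entry, a state that arrives late or arrives twice cannot undo views that were already counted.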
Each node in the network is represented by an instance of the class Server. Each server has a unique id, a variable that is True while the server is active on the network, and a dictionary holding the known state of the network.
The CRDT-counter operations are implemented in this class. They are:
- incr(video_id): Increment the counter for the video in the current node by 1.
- incr_by(video_id, amount): Increment the counter for the video in the current node by amount.
- set_to(video_id, amount): Set the views counter to a certain amount.
- total(video_id): Total number of views received in the node.
- count(video_id): Total number of views of the video in the network (as currently known by the node).
- merge(server): The operation that updates the state of the local counter using the max function.
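The operations listed above could look roughly like the following. This is a hedged sketch and not the project's actual code: the attribute names (id, active, state) and the nested dictionary layout are assumptions, and set_to is left out because its exact semantics are not described here.

```python
from collections import defaultdict


class Server:
    """Hypothetical node holding one grow-only counter per video."""

    def __init__(self, server_id: str) -> None:
        self.id = server_id
        self.active = True
        # state[video_id][server_id] -> views registered by that server
        self.state = defaultdict(dict)

    def incr(self, video_id: str) -> None:
        self.incr_by(video_id, 1)

    def incr_by(self, video_id: str, amount: int) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        # A node only ever writes to its own entry.
        self.state[video_id][self.id] = self.total(video_id) + amount

    def total(self, video_id: str) -> int:
        """Views received directly by this node."""
        return self.state[video_id].get(self.id, 0)

    def count(self, video_id: str) -> int:
        """Views in the whole network, as currently known by this node."""
        return sum(self.state[video_id].values())

    def merge(self, other: "Server") -> None:
        """Keep, for every server entry, the larger of the two known values."""
        for video_id, counts in other.state.items():
            mine = self.state[video_id]
            for node_id, views in counts.items():
                mine[node_id] = max(mine.get(node_id, 0), views)


a, b = Server("a"), Server("b")
a.incr_by("video-1", 3)
b.incr("video-1")
a.merge(b)
b.merge(a)
print(a.total("video-1"), a.count("video-1"), b.count("video-1"))  # 3 4 4
```

Keeping one entry per server is what makes the max-based merge safe: two nodes never write to the same entry, so taking the maximum never discards a view.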
The class Cluster represents the network that all the servers form. Its task in the program is to build each server, populate its list of servers and communicate with the frontend for tasks like deleting a node or watching a video on a node (the idea is that this class acts like the load balancer or router that matches your location with the right server).
The main script is a Sanic server. I chose Sanic because it is a Flask-like framework that can work asynchronously. I created a task with it that every 2 seconds performs the merge operation for one server at a time (to exercise the interface, only one node communicates with 5 nodes each time). Sanic also provides the API endpoints that the web interface uses to operate on the program. I chose to build a web interface because it emulates the activity proposed in the exercise while letting me easily manage the state of the system.
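The periodic merge task can be illustrated with plain asyncio. This is only a sketch under assumptions: the function names, the round-robin choice of the merging node and the dictionary-based states are mine, and the Sanic wiring is left out so the example stays self-contained (in a Sanic app, a coroutine like this could presumably be scheduled with app.add_task).

```python
import asyncio
import random


def merge(local: dict, remote: dict) -> None:
    """Update local in place with the per-server maximum."""
    for node, views in remote.items():
        local[node] = max(local.get(node, 0), views)


async def gossip(states: dict, interval: float, rounds: int, fanout: int = 5) -> None:
    """Each round, one node (chosen in turn) merges the state of up to fanout peers."""
    names = list(states)
    for i in range(rounds):
        node = names[i % len(names)]
        peers = [p for p in names if p != node]
        for peer in random.sample(peers, min(fanout, len(peers))):
            merge(states[node], states[peer])
        await asyncio.sleep(interval)


# Three hypothetical servers that have registered 1, 2 and 3 views.
states = {f"s{i}": {f"s{i}": i} for i in range(1, 4)}
# The project merges every 2 seconds; a short interval keeps the demo fast.
asyncio.run(gossip(states, interval=0.01, rounds=3))
print({name: sum(state.values()) for name, state in states.items()})
# {'s1': 6, 's2': 6, 's3': 6}
```

With three servers and a fanout of 5, every round merges all peers, so one pass over the nodes is enough for all of them to agree; a larger network would need more rounds.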
At first I implemented the system for just one video, but generalizing it to a larger number of videos was easy. For this reason each instance of Server doesn't assume that the number of videos is 1, but to keep the web interface simple I made just one video available.
With the code provided I believe I have achieved the proposed tasks:
- Read up on CRDTs and specifically Grow-Only counters
- Write a program that simulates the distributed system of multiple independent nodes
- Simulate the process of a fixed number of page views being sent to an arbitrary node
- Implement a Grow-Only counter that is used to respond to incoming page views
- Build a simple frontend visualization of the page counter
In my implementation the nodes are "interconnected" in the sense that they have direct access to each other's data, so they "talk" to each other by simply reading that information (it simulates a distributed system but is not one).
If I have misunderstood the problem, an alternative solution would reimplement the system as a truly distributed one:
- Each instance of the class Server is a different program running in isolation.
- Each server has an API endpoint where I can POST a JSON object containing pairs of video ids and numbers of views.
- The merge operation is implemented in that API endpoint, but this time it uses the information provided instead of going to look for it, as I do now.
- Each server must now know where to locate the others, for example by knowing their IPs and ports.
- The Cluster class is no longer necessary.
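The merge step inside such an endpoint could be sketched as below. The payload shape is an assumption made for illustration: a JSON object with a server_id field naming the sender and a views field mapping video ids to the views that sender has registered. The function name apply_payload is also hypothetical, and the HTTP layer is omitted.

```python
import json


def apply_payload(state: dict, body: str) -> dict:
    """Merge a POSTed JSON body into state[video_id][server_id] using max."""
    payload = json.loads(body)
    sender = payload["server_id"]
    for video_id, views in payload["views"].items():
        if not isinstance(views, int) or views < 0:
            raise ValueError(f"invalid view count for {video_id!r}: {views!r}")
        per_node = state.setdefault(video_id, {})
        # max makes stale or duplicated requests harmless.
        per_node[sender] = max(per_node.get(sender, 0), views)
    return state


# Server "a" already knows about its own 3 views.
state = {"video-1": {"a": 3}}
apply_payload(state, json.dumps({"server_id": "b", "views": {"video-1": 5}}))
# A delayed, older message from "b" arrives afterwards and changes nothing.
apply_payload(state, json.dumps({"server_id": "b", "views": {"video-1": 2}}))
print(state)  # {'video-1': {'a': 3, 'b': 5}}
print(sum(state["video-1"].values()))  # 8
```

Including the sender's id in the body matters: without it the receiver could not tell whose entry to update, and the max-based merge would no longer be safe.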
What can happen when the user reloads the website? Can we synchronize the counter not only across nodes but also across the nodes and the frontend?
When the user reloads the website, the site should show the same counter or a greater one, depending on the updates received in the meantime. We can introduce the frontend as one more server and sync the counter with it too. In that case, when the user reloads the website we can expect the program to remove the frontend from the list of connected servers, losing its counter.
What happens if a node gets removed from the cluster? What happens if more nodes get added to the cluster?
If a server gets removed from the cluster, we lose the counter information corresponding to that node unless it synchronizes its information with at least one other node before being destroyed. If a node gets added to the cluster, it is immediately known by the rest of the servers, and after a certain period of time it retrieves the up-to-date information.
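Both cases can be illustrated with a small sketch, again assuming that each node's state is a dictionary from server id to views; the node names and numbers are made up for the example.

```python
def merge(local: dict, remote: dict) -> None:
    """Update local in place with the per-server maximum."""
    for node, views in remote.items():
        local[node] = max(local.get(node, 0), views)


def count(state: dict) -> int:
    """Network-wide views as known by one node."""
    return sum(state.values())


a = {"a": 4}
b = {"b": 6}
c = {"c": 5}

# b synchronizes with a before being removed, so its 6 views survive in a.
merge(a, b)
del b

# c is removed without ever synchronizing: its 5 views are lost.
del c
print(count(a))  # 10, not 15

# A new node d joins with an empty state and catches up after one merge.
d = {}
merge(d, a)
print(count(d))  # 10
```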