arbanhossain / devops-iit

Frontend: https://github.com/saadman-sakib/devops-frontend

Architecture Overview

Image of complete architecture

Tools and Technologies

  • Backend: NestJS, Google Cloud Platform
  • Frontend: NextJS, Vercel
  • Database: PlanetScale, MySQL
  • CI/CD Pipeline: Git, GitHub Actions, Google Cloud Build, Terraform
  • Containerization: Docker, Google Container Registry
  • Monitoring: Google Operations Suite, Vercel Analytics, PlanetScale Insights

Challenge 1

Objective: Dependable CI/CD Pipeline

Git Workflow Diagram

  • A Cloud Build trigger fires on every commit to the designated branch
  • Cloud Build automatically provisions a Google Compute instance as the build environment
  • Runs the configured backend tests
  • Builds the Docker image from the configured Dockerfile
  • Uploads the image to Google Container Registry
  • Calls Terraform to provision Cloud Run
  • Terraform pulls the image from Container Registry and deploys it to Cloud Run according to the instructions in main.tf
  • The Terraform backend is configured to store its state files in Google Cloud Storage
  • Terraform is authenticated using the recommended IAM-policy-based approach
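
The ordering of these stages matters: a failing test should prevent the image from being built, pushed, or deployed. The sketch below is only a model of that fail-fast ordering, not the repository's Cloud Build configuration. The `Step` type and the `runPipeline` helper are hypothetical names, and each step is modelled as a synchronous function for brevity.

```typescript
// Hypothetical model of the pipeline stages; not the actual Cloud Build config.
type Step = { name: string; run: () => void };

interface PipelineResult {
  completed: string[]; // names of the steps that finished, in order
  failedAt?: string;   // name of the first step that threw, if any
}

// Run the steps in order and stop at the first failure, so that later
// stages (build, push, deploy) never run after an earlier stage fails.
function runPipeline(steps: Step[]): PipelineResult {
  const completed: string[] = [];
  for (const step of steps) {
    try {
      step.run();
      completed.push(step.name);
    } catch {
      return { completed, failedAt: step.name };
    }
  }
  return { completed };
}
```

In this model the stages would be passed in the order listed above: tests, image build, image upload, then the Terraform deployment.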

Challenge 2

Objective: Scalable infrastructure

Two diagrams, one showing container scaling, the other showing database scaling

Database: For the problem at hand, the primary bottleneck appears to be the database. Because the system will see a huge spike in traffic on only a few days each month, allocating resources on demand is the better choice. Traditional databases usually scale only vertically, and scaling them horizontally adds complexity. A distributed database system allows dynamic resource allocation. PlanetScale, powered by Vitess, partitions a large database into smaller parts (shards). This reduces the load on any single machine and allows better scaling.
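
To illustrate the idea of partitioning, here is a minimal sketch of hash-based shard routing. It does not show how Vitess or PlanetScale is implemented or configured. The FNV-1a-style hash, the shard count, and the `shardFor` helper are assumptions chosen only to show how rows can be spread across machines by key.

```typescript
// FNV-1a-style 32-bit hash over the UTF-16 code units of the key.
// Used here only as a simple, deterministic hash for illustration.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // 32-bit multiply by the FNV prime
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Hypothetical helper: map a row key (for example a user ID) to a shard index
// in the range [0, shardCount). The same key always maps to the same shard.
function shardFor(key: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return fnv1a(key) % shardCount;
}
```

Each query for a given key then touches only one shard, which is why the load on any single machine drops as shards are added.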

Containers: Google Cloud Platform allows the number of container instances to be adjusted based on demand. We set the minimum number of instances to 1 and let the service scale out whenever demand rises. This allows better resource utilization and reduces costs.
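
As a rough illustration of demand-based scaling with a floor of one instance, the sketch below computes a target instance count from the number of concurrent requests. This is a simplified model, not the platform's actual autoscaling algorithm. The per-instance concurrency of 80 and the ceiling of 10 instances are assumed values, and `desiredInstances` is a hypothetical helper.

```typescript
interface ScalingPolicy {
  minInstances: number;           // instances kept running even with no traffic
  maxInstances: number;           // upper bound to cap cost (assumed value)
  concurrencyPerInstance: number; // requests one instance handles at once (assumed value)
}

// Number of instances needed for the current load, clamped to the policy bounds.
function desiredInstances(concurrentRequests: number, policy: ScalingPolicy): number {
  const needed = Math.ceil(concurrentRequests / policy.concurrencyPerInstance);
  return Math.min(policy.maxInstances, Math.max(policy.minInstances, needed));
}

// Minimum of 1 as described above; the other two values are illustrative.
const policy: ScalingPolicy = { minInstances: 1, maxInstances: 10, concurrencyPerInstance: 80 };
```

With this policy, idle periods keep a single instance alive, while a spike in concurrent requests raises the target until it reaches the configured ceiling.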

Challenge 3

Objective: Security

As the system will be handling sensitive data, we implement OTP-based authentication along with JSON Web Tokens (JWT). To prevent brute-force attacks, we implement rate limiting using Cloudflare's DDoS proxies and GCP's firewall configuration. The database is configured to allow traffic only from the backend, so it is never exposed to the internet.
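
The repository's actual OTP flow is not shown here, so the following is only a minimal sketch of one way to issue and verify a one-time password with an expiry and a limited number of attempts. The names `issueOtp` and `verifyOtp`, the six-digit format, the five-minute lifetime, and the five-attempt limit are all assumptions. It relies on Node's built-in `crypto` module and keeps the record in memory.

```typescript
import { randomInt, timingSafeEqual } from "node:crypto";

interface OtpRecord {
  code: string;         // six-digit code, zero-padded
  expiresAt: number;    // expiry time in epoch milliseconds
  attemptsLeft: number; // remaining verification attempts
}

const OTP_TTL_MS = 5 * 60 * 1000; // assumed lifetime: 5 minutes
const MAX_ATTEMPTS = 5;           // assumed attempt limit

// Issue a new code using a cryptographically secure random integer.
function issueOtp(now: number = Date.now()): OtpRecord {
  const code = randomInt(0, 1000000).toString().padStart(6, "0");
  return { code, expiresAt: now + OTP_TTL_MS, attemptsLeft: MAX_ATTEMPTS };
}

// Verify a submitted code. Every call consumes one attempt, and a successful
// verification invalidates the record so that the code is single-use.
function verifyOtp(record: OtpRecord, submitted: string, now: number = Date.now()): boolean {
  if (now > record.expiresAt || record.attemptsLeft <= 0) return false;
  record.attemptsLeft -= 1;
  const expected = Buffer.from(record.code);
  const actual = Buffer.from(submitted);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  const ok = expected.length === actual.length && timingSafeEqual(expected, actual);
  if (ok) record.attemptsLeft = 0;
  return ok;
}
```

The attempt limit complements the network-level rate limiting described above: even if requests get through, a code can only be guessed a handful of times before it becomes useless.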

  • All credentials are passed through environment variables set in GCP
  • Terraform authenticates through an IAM account instead of a username and password
  • The database accepts traffic only from the backend
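
Reading credentials from environment variables usually goes together with failing fast when one is missing. The sketch below shows that pattern under stated assumptions: the variable names `DATABASE_URL` and `JWT_SECRET` and the helpers `requireEnv` and `loadConfig` are hypothetical and are not taken from the repository.

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  databaseUrl: string;
  jwtSecret: string;
}

// Return the value of a required variable, or throw so the service
// refuses to start with incomplete credentials.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build the configuration object from an environment map.
// The variable names are illustrative assumptions.
function loadConfig(env: Env): AppConfig {
  return {
    databaseUrl: requireEnv(env, "DATABASE_URL"),
    jwtSecret: requireEnv(env, "JWT_SECRET"),
  };
}
```

In a Node process this would typically be called once at startup as `loadConfig(process.env)`, so that no secret ever needs to appear in source code.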

Challenge 4

Objective: Monitoring

The Google Cloud Monitoring Stack provides a comprehensive set of tools to monitor the backend.

We use Sentry to track errors and to profile the performance of code blocks in both the backend and the frontend. Sentry's dashboard feeds operational data back to the developers, including stack traces and per-function performance details. This helps the developers find bottlenecks in the system and see where it is failing.

Sentry issues are integrated with GitHub issues, providing a more coherent issue tracking system for the Ops team and the Dev team. The project is also configured to alert the developers on Slack whenever certain preset conditions are met.

Languages

TypeScript 80.6%, HCL 18.4%, Dockerfile 1.1%