aleksandarskrbic / kafka-leaderboard

An application that pulls data from the GitHub API, feeds it to Kafka, and builds a real-time leaderboard of Apache Kafka contributors.

kafka-leaderboard

The system ingests data from the GitHub API and feeds it into Kafka, aggregates it, and provides an HTTP API for dashboard-ui.

System Architecture

The system consists of 3 services:

  1. Ingestion Service pulls data from the GitHub API, processes it, transforms it into Avro format, and publishes it to 3 Kafka topics: ingestion.comment, ingestion.pull-request, and ingestion.review.
  2. Aggregation Service consumes all Kafka topics in parallel, processes and aggregates the data, and writes the results to an in-memory repository. The data is exposed via an HTTP API. To get the top 100 contributors sorted in descending order by points, use http://localhost:8080/api/aggregate; to get data about a specific contributor, use http://localhost:8080/api/aggregate/{username}. Since a non-persistent in-memory repository is used, the consumers don't commit any offsets, so when the service is restarted it rebuilds its state from all Kafka topics from the beginning.
  3. Dashboard is a simple frontend application written in React.js that displays the top 100 Apache Kafka contributors.
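
The Aggregation Service behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the class names `InMemoryLeaderboard` and `ReplayConsumerConfig`, the way points are awarded, the broker address, and the consumer group name are all assumptions. The two property keys, `enable.auto.commit` and `auto.offset.reset`, are standard Kafka consumer settings; with auto-commit disabled and no manual commits, a restarted consumer group has no stored offsets and therefore starts from the earliest available records.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical in-memory repository; the real service may be structured differently.
final class InMemoryLeaderboard {
    private final Map<String, Long> pointsByUser = new ConcurrentHashMap<>();

    // Adds points for a contributor; merge() is atomic, so parallel consumers can call it.
    void addPoints(String username, long points) {
        pointsByUser.merge(username, points, Long::sum);
    }

    // Returns the accumulated points for one contributor, or 0 if unknown.
    long pointsOf(String username) {
        return pointsByUser.getOrDefault(username, 0L);
    }

    // Returns up to `limit` usernames, highest score first (ties broken by name).
    List<String> top(int limit) {
        return pointsByUser.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        InMemoryLeaderboard board = new InMemoryLeaderboard();
        board.addPoints("alice", 3); // point values are made up for the demo
        board.addPoints("bob", 5);
        board.addPoints("alice", 4);
        System.out.println(board.top(100));
        System.out.println(ReplayConsumerConfig.build("localhost:9092", "demo-group"));
    }
}

// Hypothetical helper that builds consumer settings for "always replay from the start".
final class ReplayConsumerConfig {
    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers); // assumed address
        props.setProperty("group.id", groupId);                   // assumed group name
        props.setProperty("enable.auto.commit", "false");         // never store offsets
        props.setProperty("auto.offset.reset", "earliest");       // no offsets: start at the beginning
        return props;
    }
}
```

Rebuilding state on every start keeps the service simple, at the cost of a longer startup as the topics grow.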

How to run:

  1. Run Zookeeper, Kafka, and Schema Registry with docker-compose up. Make sure they start properly; if not, stop the process (Ctrl+C) and run the command again.
  2. Run ./run-ui.sh to build and run dashboard-ui (may take a few minutes).
  3. To use the GitHub API, set your username and access token in ingestion-service/src/main/resources/application.yml: http.client.username=your-github-username and http.client.password=your-access-token.
  4. Run ./run-services.sh to run ingestion-service and aggregation-service.
  5. Go to http://localhost:3000 to see the leaderboard of Apache Kafka contributors.
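
Once the services are running, the aggregation endpoints can also be queried without the dashboard. The sketch below uses the JDK's built-in `java.net.http` client (Java 11+); the class name `AggregateApiClient` and its helper methods are hypothetical. The response format is not documented here, so the body is printed as raw text.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for calling the aggregation-service HTTP API.
final class AggregateApiClient {
    static final String BASE_URL = "http://localhost:8080/api/aggregate";

    // URI for the top 100 contributors.
    static URI topContributorsUri() {
        return URI.create(BASE_URL);
    }

    // URI for a single contributor. URLEncoder applies form encoding, which is
    // adequate for GitHub usernames (letters, digits, and hyphens).
    static URI contributorUri(String username) {
        String encoded = URLEncoder.encode(username, StandardCharsets.UTF_8);
        return URI.create(BASE_URL + "/" + encoded);
    }

    // Sends a GET request and returns the response body as a string.
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Requires aggregation-service to be listening on localhost:8080.
        System.out.println(fetch(topContributorsUri()));
    }
}
```

To look up one contributor, pass `contributorUri("your-github-username")` to `fetch` instead.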

Languages

Java 95.4%, JavaScript 1.9%, HTML 1.3%, CSS 0.9%, Shell 0.3%, Dockerfile 0.3%