mav3rick177 / Real-Time-Data-Analytics

Interactive analytics for Reddit | Real-Time data analysis

Real-Time-Data-Analytics

Interactive analytics for Reddit | Real-Time data analysis. The purpose of this project is to compute metrics about Reddit posts in real time using Angular, Apache Kafka, Spring KStream, Apache Spark, Spring Kafka, and Spring WebFlux with Reactor.
It contains five main components:

  • client, the front-end app built with Angular.

  • a web service (Reddit-producer) that calls the Reddit API to fetch posts and then sends them to a Kafka topic. Here is an example of a REST call to the Reddit API: [image: Rapport_stage_d'application67]

  • two consumers (Kafka-Stream-Consumer and Spark-Consumer), which are stream processors. They read the data from Kafka as a stream and process it in real time, producing metrics and statistics that are written back to a Kafka metrics topic to be consumed later.

  • Spring-Kafka-Reactive-Backend, a service that subscribes to the Reddit metrics topic, waits for new metrics, and sends them to the frontend over a WebSocket.
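
To make the consumers' role more concrete, here is a minimal sketch of the kind of stateful, record-at-a-time aggregation a stream processor performs. It is plain Java with no Kafka or Spark dependency, and it is not the project's actual code: the class name `SubredditCountSketch`, the method `onPost`, and the "posts per subreddit" metric are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: running post count per subreddit, updated one record at a time. */
public class SubredditCountSketch {

    // State a stream processor would keep between records (illustrative only).
    private final Map<String, Long> countsBySubreddit = new LinkedHashMap<>();

    /** Handles one incoming post and returns the updated metric as "subreddit=count". */
    public String onPost(String subreddit) {
        long updated = countsBySubreddit.merge(subreddit, 1L, Long::sum);
        return subreddit + "=" + updated;
    }

    /** Returns a copy of the current counts so callers cannot mutate the state. */
    public Map<String, Long> snapshot() {
        return new LinkedHashMap<>(countsBySubreddit);
    }

    public static void main(String[] args) {
        SubredditCountSketch processor = new SubredditCountSketch();
        List<String> emitted = new ArrayList<>();
        // Stand-in for records read from the posts topic (assumed input).
        for (String subreddit : new String[] {"java", "kafka", "java"}) {
            emitted.add(processor.onPost(subreddit));
        }
        System.out.println(emitted); // [java=1, kafka=1, java=2]
    }
}
```

Each call to `onPost` returns the updated value for one key, which roughly mirrors how a stream processor emits a new metric record every time its state changes. In the real pipeline that record would be written to the metrics topic rather than collected in a list.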

Architecture

[Image: architecture diagram, Rapport_stage_d'application66]

Note: Hive wasn't used in this project.

Screenshots

[Images: nine application screenshots]

Languages

CSS 48.0%, Java 23.0%, TypeScript 15.8%, HTML 10.5%, JavaScript 2.7%