CUBigDataClass / Indian-Premier-League

This project aims to show some interesting facts, records, and results from the Indian Premier League (IPL).

Indian-Premier-League

Team Members:

Nithin Veer Reddy

Abhinivesh Palusa

Lokin Sai Makkenna

Mohan Dwarampudi

Extract

  • Data has been sourced from multiple places:
    • Scraping popular cricket websites.
    • Scraping wiki pages.
    • The Google API, for geo points.
    • Kaggle datasets.
  • All the data is first stored in Amazon S3 and then pushed into DynamoDB. Each S3 event invokes an AWS Lambda function, which parses the data before it is stored in DynamoDB.
  • The data is organized into three DynamoDB tables: deliveries, matches, and players. These tables act as the source of truth for all further operations.
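The S3-to-DynamoDB step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Lambda: it assumes the uploaded files are CSV, that the file name starts with the target table name (a hypothetical naming rule), and that `boto3` is available in the Lambda runtime. The helper names are made up for this example.

```python
import csv
import io
from urllib.parse import unquote_plus

TABLES = ("deliveries", "matches", "players")


def s3_objects_from_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def table_for_key(key):
    """Pick the target table from the file name (assumed naming rule)."""
    name = key.rsplit("/", 1)[-1].lower()
    for table in TABLES:
        if name.startswith(table):
            return table
    return None


def rows_from_csv(text):
    """Parse CSV text into dicts, dropping empty values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: v for k, v in row.items() if v not in ("", None)} for row in reader]


def handler(event, context):
    """Lambda entry point: read each uploaded file and write its rows to DynamoDB."""
    import boto3  # imported lazily so the helpers above run without AWS libraries

    s3 = boto3.client("s3")
    dynamodb = boto3.resource("dynamodb")
    for bucket, key in s3_objects_from_event(event):
        table_name = table_for_key(key)
        if table_name is None:
            continue  # not one of the three known data sets
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        with dynamodb.Table(table_name).batch_writer() as batch:
            for item in rows_from_csv(body):
                batch.put_item(Item=item)
```

Keeping the parsing in plain functions means it can be checked locally with a hand-written event, without touching AWS.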

Transform

  • All the semi-parsed data is transformed into meaningful JSON entries.
  • DynamoDB triggers invoke AWS Lambda whenever a new entry is added to a table.
  • AWS Lambda transforms the data into meaningful patterns, which are then loaded into the ElasticSearch cluster.
  • AWS Lambda also fetches additional geo data through the Google API.
  • AWS Lambda uses Redis for quick key-value lookups.
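As a rough sketch of the transform step, the code below turns the records of a DynamoDB stream event into plain JSON-ready dictionaries and attaches cached geo data. It is an illustration under assumptions: the field names `venue` and `location` are hypothetical, and a plain dictionary stands in for the Redis lookup so the example runs on its own.

```python
def from_attribute_value(av):
    """Convert one DynamoDB attribute value, e.g. {"N": "7"}, to a Python value."""
    type_tag, value = next(iter(av.items()))
    if type_tag == "S":
        return value
    if type_tag == "N":
        # Stream records carry numbers as strings.
        return float(value) if any(c in value for c in ".eE") else int(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_attribute_value(v) for v in value]
    if type_tag == "M":
        return {k: from_attribute_value(v) for k, v in value.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def new_documents(event):
    """Yield (table_name, document) for every INSERT record in a stream event."""
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue
        # The stream ARN has the form ...:table/<name>/stream/<timestamp>.
        table = record["eventSourceARN"].split("/")[1]
        image = record["dynamodb"]["NewImage"]
        yield table, {k: from_attribute_value(v) for k, v in image.items()}


def enrich(doc, venue_cache):
    """Attach cached coordinates for the venue; venue_cache stands in for Redis."""
    location = venue_cache.get(doc.get("venue"))
    return {**doc, "location": location} if location else doc
```

In a deployed function the cache would be a Redis client rather than a dictionary, and a cache miss would presumably fall back to the Google API call mentioned above.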

Load

  • ElasticSearch indexes all the incoming data from the Lambdas.
  • Data in ElasticSearch is distributed across the nodes in the cluster.
  • The three types of data are stored in separate indices:
    • deliveries
    • matches
    • players
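One way to load documents into the three indices is the ElasticSearch bulk API, which takes a newline-delimited body of action and document pairs. The sketch below only builds that body; it is not taken from the project, and the `bulk_body` helper and the sample fields are assumptions for illustration.

```python
import json

INDICES = ("deliveries", "matches", "players")


def bulk_body(docs):
    """Build a newline-delimited bulk request body from (index, id, document) tuples."""
    lines = []
    for index, doc_id, doc in docs:
        if index not in INDICES:
            raise ValueError(f"unknown index: {index}")
        # Each document is preceded by an action line naming its index and id.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent as the body of a POST to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header.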

ElasticSearch

  • A two-node cluster, served through a load balancer.
  • The load balancer endpoint is the public face of ElasticSearch.
  • There are separate indices for the three data sources mentioned above.
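To illustrate how a client might query one of these indices through the load balancer, the sketch below builds a search request with the standard library. The endpoint URL, the `winner` field, and the `search_request` helper are all hypothetical; only the `/<index>/_search` path and the `match` query shape come from the ElasticSearch query DSL.

```python
import json
from urllib.request import Request


def search_request(endpoint, index, field, text, size=10):
    """Build a POST request that runs a match query against one index."""
    url = f"{endpoint.rstrip('/')}/{index}/_search"
    body = {"size": size, "query": {"match": {field: text}}}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical load balancer endpoint; replace with the real one.
request = search_request("http://es-lb.example.com:9200/", "matches", "winner", "Chennai")
```

Passing `request` to `urllib.request.urlopen` would send it; the request is only constructed here so the example has no network dependency.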

Kibana

NGINX

  • Used for port forwarding.
  • It forwards requests received on port 80 to Kibana's listening port, making Kibana the face of the application.

URL for our project: http://bdaipl.tech/

Languages

Python 74.5%, Jupyter Notebook 25.5%