Sourabh7iwari / RealtimeDatapipeline

Real-time ETL pipeline that extracts data from sensors, transforms it, and feeds it into a Postgres database.

Repository on GitHub: https://github.com/Sourabh7iwari/RealtimeDatapipeline

Create a virtual environment and activate it

python3 -m venv .venv &&
source .venv/bin/activate
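
If it is unclear whether the activation took effect in a given terminal, a small check like the following can help. This is only a sketch and not part of the repository; the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Running it with `python3` in each terminal shows whether that terminal is using `.venv`.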

Install the dependencies

pip install -r requirements.txt

Make the setup script executable

chmod +x run.sh

Run the setup script

./run.sh

Setup is now complete, and the Spark job is listening for incoming sensor data to feed into the Postgres sensordb database.
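
The Spark job itself lives in the repository and is not reproduced here. As a rough plain-Python illustration of the kind of transform step an ETL pipeline like this might apply before writing a row, consider the sketch below. The field names (`sensor_id`, `temperature`, `timestamp`), the output keys, and the plausibility range are all assumptions, not taken from the project.

```python
from datetime import datetime, timezone
from typing import Optional


def transform_reading(raw: dict) -> Optional[dict]:
    """Validate one raw reading and normalize it into a row-like dict.

    Returns None for readings that are malformed or out of range.
    """
    try:
        sensor_id = str(raw["sensor_id"])
        temperature_c = float(raw["temperature"])
        timestamp = datetime.fromisoformat(raw["timestamp"])
    except (KeyError, TypeError, ValueError):
        return None  # missing field or unparsable value

    # Hypothetical plausibility range for a temperature sensor.
    if not -50.0 <= temperature_c <= 150.0:
        return None

    # Treat naive timestamps as UTC so every row is comparable.
    if timestamp.tzinfo is None:
        timestamp = timestamp.replace(tzinfo=timezone.utc)

    return {
        "sensor_id": sensor_id,
        "temperature_c": round(temperature_c, 2),
        "recorded_at": timestamp.astimezone(timezone.utc),
    }
```

Returning `None` for bad input keeps the decision about dropping or logging a reading with the caller.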

Run the data generation script in another terminal, making sure to activate the .venv there as well

python3 sensor_data_generator.py
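
For readers who want a feel for what a generator of this kind does, here is a minimal sketch. It is not the contents of `sensor_data_generator.py`: the field names, value ranges, and the `make_reading` helper are hypothetical, and it prints JSON lines instead of sending them to the pipeline.

```python
import json
import random
from datetime import datetime, timezone


def make_reading(rng: random.Random, sensor_count: int = 5) -> dict:
    """Build one fake reading (all field names are hypothetical)."""
    return {
        "sensor_id": f"sensor-{rng.randint(1, sensor_count)}",
        "temperature": round(rng.uniform(15.0, 35.0), 2),
        "humidity": round(rng.uniform(20.0, 90.0), 2),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the random values are repeatable
    for _ in range(3):
        # One JSON object per line; a real generator would send this to
        # whatever input the pipeline listens on instead of printing it.
        print(json.dumps(make_reading(rng)))
```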

Checking the data, querying, and preliminary analysis

Open the pgAdmin GUI at localhost:8080

Connect to the Postgres server (check docker-compose.yml for the credentials) and query the database
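
As an example of a first analysis query, the sketch below computes a per-sensor summary. The table name `sensor_readings` and its columns are assumptions, so check the actual schema first. To keep the example runnable without the Docker stack, it uses Python's built-in in-memory SQLite as a stand-in for Postgres.

```python
import sqlite3

# Hypothetical table and column names; check the repository's schema.
SUMMARY_SQL = """
SELECT sensor_id,
       COUNT(*)         AS readings,
       AVG(temperature) AS avg_temperature,
       MIN(temperature) AS min_temperature,
       MAX(temperature) AS max_temperature
FROM sensor_readings
GROUP BY sensor_id
ORDER BY sensor_id
"""


def summarize(conn: sqlite3.Connection) -> list:
    """Run the per-sensor summary query and return all rows."""
    return conn.execute(SUMMARY_SQL).fetchall()


if __name__ == "__main__":
    # In-memory SQLite stands in for Postgres so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sensor_readings (sensor_id TEXT, temperature REAL)")
    conn.executemany(
        "INSERT INTO sensor_readings VALUES (?, ?)",
        [("sensor-1", 20.0), ("sensor-1", 22.0), ("sensor-2", 30.0)],
    )
    for row in summarize(conn):
        print(row)
```

The `SELECT` statement uses only standard aggregates, so with the real table and column names substituted it should also be usable in pgAdmin's Query Tool.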

Languages

Python 87.2%, Shell 12.8%