Big Data Journal Projects' repositories
Airline_Data_Analysis
A process that gathers streaming data from an airline API using NiFi and batch data from AWS Redshift using Sqoop, builds a data pipeline to analyse the data with Apache Hive and Druid, compares their performance, discusses Hive optimization techniques, and visualises the data with AWS QuickSight.
HeartRate-Monitoring-using-AWS-IOT-and-AWS-KINESIS
You run a script that mimics multiple sensors publishing messages on an IoT MQTT topic, with one message published every second. The events are sent to AWS IoT, where an IoT rule is configured. The rule captures all messages and sends them to Firehose, which writes them in batches to objects stored in S3. You then set up an Athena table over the S3 data and use QuickSight to analyze the IoT data.
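The sensor-simulation step can be sketched roughly as follows. This is a minimal illustration and not the repository's actual script: the topic prefix, the payload fields, the heart-rate range, and the `publish` callback are all assumptions made for the example. Here each simulated sensor emits one message per interval, and the caller supplies `publish`, which in a real setup would wrap an MQTT client connected to AWS IoT.

```python
import json
import random
import time
from typing import Callable, Optional, Sequence

# Hypothetical topic prefix; the real project may use a different topic name.
TOPIC_PREFIX = "sensors/heartrate"


def make_payload(sensor_id: str, sequence: int, rng: random.Random) -> str:
    """Build one JSON message for a simulated sensor (field names are assumed)."""
    return json.dumps({
        "sensor_id": sensor_id,
        "sequence": sequence,
        "heart_rate": rng.randint(60, 100),  # assumed plausible bpm range
    })


def simulate(
    publish: Callable[[str, str], None],
    sensor_ids: Sequence[str],
    ticks: int,
    interval_s: float = 1.0,
    seed: Optional[int] = None,
) -> int:
    """Publish one message per sensor per tick, pausing interval_s between ticks.

    publish(topic, payload) is supplied by the caller, so this sketch stays
    independent of any particular MQTT library. Returns the number of
    messages sent.
    """
    rng = random.Random(seed)
    sent = 0
    for sequence in range(ticks):
        for sensor_id in sensor_ids:
            topic = f"{TOPIC_PREFIX}/{sensor_id}"
            publish(topic, make_payload(sensor_id, sequence, rng))
            sent += 1
        time.sleep(interval_s)
    return sent
```

For a dry run without any broker, `simulate(lambda topic, payload: print(topic, payload), ["sensor-1", "sensor-2"], ticks=3)` prints six messages over about three seconds. Passing the publisher as a callback keeps the message generation testable on its own.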
awesome-opensource-data-engineering
An Awesome List of Open-Source Data Engineering Projects
AWS_File_Trans_Lamda_S3_SNS
AWS Data Engineering Project using Lambda, S3 and SNS
amazon-kinesis-data-analytics-blueprints
Kinesis Data Analytics Blueprints are a curated collection of Apache Flink applications. Each blueprint will walk you through how to solve a practical problem related to stream processing using Apache Flink. These blueprints can be leveraged to create more complex applications to solve your business challenges in Apache Flink.
aws-glue-cdk-cicd
Build, Test and Deploy ETL solutions using AWS Glue and AWS CDK based CI/CD pipelines
aws-glue-test-data-generator
AWS Glue Configurable Test Data Generator
aws-security-hub-glue-aggregator-terraform
These Terraform modules aggregate Security Hub findings to a centralized account using Amazon Kinesis Firehose and AWS Glue.
bigdata-file-viewer
A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
ClickHouse
ClickHouse® is a free analytics DBMS for big data
data-engineering
Construct a modern data stack and orchestrate the workflows to create high-quality data for analytics and ML applications.
data-engineering-zoomcamp
Free Data Engineering course!
data-science-on-aws
AI and Machine Learning with Kubeflow, Amazon EKS, and SageMaker
emr-studio-notebook-examples
This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.
monitor-serverless-datalake
Alerting and notification in a serverless data lake during failures