anil4aws's repositories
bigdata-realtime-twitter-analysis
Personal project where I perform analytics (including sentiment analysis) over a Twitter stream using big data technologies from the Hadoop ecosystem, such as Flume, Kafka, and Spark Streaming.
data_engineering_project_template
A template repository for creating a data project with IaC, CI/CD, data migrations, and testing
DataPipeline
Real-time stock data pipeline: experiments with Kafka, Cassandra, Spark, Redis, Node.js, and ZooKeeper
freshjobsPipeline
ETL Pipeline using Spark, Airflow, & EMR
github-slideshow
A robot-powered training repository 🤖
goodreads_etl_pipeline
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
josephmachado-data-engineering
Profile readme
Realtime-Data-Analytics-Using-Spark
Real-time social media data analytics with Apache Spark, Python, Kafka, Pandas, etc.
realtime-twitter-trends-analytics
A big data project to develop a real-time data pipeline for analyzing the popularity and sentiments of trending topics on Twitter.
spark-etl-pipeline
Various data stream and batch processing demos with Apache Spark in Scala 🚀
Spark-with-Python---My-learning-notes-
ETL pipeline using PySpark (Spark with Python)