Repositories under the aws-redshift topic:
Personal Data Engineering Projects
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.
Redshift Python Connector. It supports the Python Database API Specification v2.0.
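Not code from the connector repository itself, just a minimal sketch of how a DB-API 2.0 style query with the redshift_connector package typically looks; the host, database, and credentials below are placeholders.

```python
import redshift_connector

# Connect to a Redshift cluster; every connection value here is a placeholder.
conn = redshift_connector.connect(
    host="examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="my_password",
)

# Standard DB-API 2.0 flow: create a cursor, execute, fetch.
cursor = conn.cursor()
cursor.execute("SELECT current_date")
print(cursor.fetchall())

conn.close()
```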
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes using technologies such as Apache Airflow, AWS Redshift, and Power BI.
Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation, validation, and loading of data from S3 -> Redshift -> S3.
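The repository's operators aren't reproduced here; the following is a rough sketch of what a custom Airflow operator issuing a Redshift COPY from S3 might look like, assuming the Postgres provider hook and placeholder connection, table, bucket, and role names.

```python
from airflow.models import BaseOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


class StageToRedshiftOperator(BaseOperator):
    """Hypothetical operator that copies JSON files from S3 into a Redshift table."""

    copy_sql = """
        COPY {table}
        FROM 's3://{bucket}/{key}'
        IAM_ROLE '{iam_role}'
        FORMAT AS JSON 'auto';
    """

    def __init__(self, redshift_conn_id, table, bucket, key, iam_role, **kwargs):
        super().__init__(**kwargs)
        self.redshift_conn_id = redshift_conn_id
        self.table = table
        self.bucket = bucket
        self.key = key
        self.iam_role = iam_role

    def execute(self, context):
        # Redshift speaks the Postgres wire protocol, so PostgresHook can run the COPY.
        redshift = PostgresHook(postgres_conn_id=self.redshift_conn_id)
        redshift.run(self.copy_sql.format(
            table=self.table,
            bucket=self.bucket,
            key=self.key,
            iam_role=self.iam_role,
        ))
```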
Clickstream Analytics on AWS source code
Udacity Data Engineering Nanodegree Program
:arrows_counterclockwise: :running: EtLT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow
The project was based on an interest in data engineering and ETL pipelines. It also provided a good opportunity to develop skills and experience with a range of tools. As such, the project is more complex than required, utilising dbt, Airflow, Docker, and cloud-based storage.
An example system that captures a large stream of product usage data, or events, and provides both real-time data visualization and SQL-based data analytics.
A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from locally hosted Airflow containers. The end product is a Superset dashboard and a Postgres database, hosted on an EC2 instance (currently powered down).
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
Spring Boot Data JPA integration with AWS Redshift sample
This project provides valuable customer sentiment insights for Zomato by tracking and analyzing tweets related to their brand and services.
Uses AWS EMR and AWS Redshift to analyse the US adult census dataset
Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark
rdapp - Redshift Data API Postgres Proxy
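rdapp is described as a Postgres proxy in front of the Redshift Data API; for context, a minimal boto3 sketch of the Data API calls such a proxy would wrap might look like this (cluster identifier, database, and user are placeholders).

```python
import time

import boto3

client = boto3.client("redshift-data", region_name="us-east-1")

# Submit a statement asynchronously; the identifiers below are placeholders.
submitted = client.execute_statement(
    ClusterIdentifier="examplecluster",
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT current_date",
)

# Poll until the statement reaches a terminal state, then fetch the result set.
statement_id = submitted["Id"]
while client.describe_statement(Id=statement_id)["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)

print(client.get_statement_result(Id=statement_id)["Records"])
```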
A simple command-line tool to copy tables from Amazon Redshift to Amazon RDS (PostgreSQL).
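The tool's own CLI isn't shown here; as an illustration of the underlying idea, a small psycopg2 sketch that reads rows from Redshift (which accepts Postgres-protocol clients) and bulk-inserts them into an RDS PostgreSQL table could look like the following, with all hosts, credentials, and the customers table invented for the example.

```python
import psycopg2
from psycopg2.extras import execute_values

# Both connections use placeholder hosts and credentials.
src = psycopg2.connect(
    host="examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="dev", user="awsuser", password="my_password",
)
dst = psycopg2.connect(
    host="example-db.abc123xyz789.us-east-1.rds.amazonaws.com",
    port=5432, dbname="app", user="appuser", password="my_password",
)

with src.cursor() as read_cur, dst.cursor() as write_cur:
    # Read the source table from Redshift and bulk-insert it into RDS.
    read_cur.execute("SELECT id, name FROM public.customers")
    execute_values(write_cur,
                   "INSERT INTO public.customers (id, name) VALUES %s",
                   read_cur.fetchall())

dst.commit()
src.close()
dst.close()
```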
Data warehouse with AWS Redshift and data visualization using Power BI
The goal of this repository is to provide good and clear examples of AWS CLI commands together with the AWS CDK for easily creating AWS services and resources
Zero-ETL integrations - Enable near real-time analytics on petabytes of transactional data
Smart City Realtime Data Engineering Project
Completed Udacity's Data Engineering Nanodegree. Went through a series of exercises and projects to learn and practice popular big data management tools.
Project 3 - Data Engineering Nanodegree
Redshift script to create a MANIFEST file recursively
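The script itself isn't reproduced here; as an illustration, a boto3 sketch that recursively lists every object under an S3 prefix and writes a Redshift COPY manifest (bucket and prefix names are made up) could look like this.

```python
import json

import boto3

s3 = boto3.client("s3")
bucket, prefix = "example-bucket", "data/events/"  # placeholder names

# Walk every object under the prefix (the paginator handles >1000 keys).
entries = []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        entries.append({"url": f"s3://{bucket}/{obj['Key']}", "mandatory": True})

# Redshift COPY expects a JSON document with an "entries" array.
s3.put_object(
    Bucket=bucket,
    Key=f"{prefix}manifest.json",
    Body=json.dumps({"entries": entries}).encode("utf-8"),
)
```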
Project 5 - Data Engineering Nanodegree
Building ETL pipelines to migrate music JSON data/metadata files (semi-structured data) into a relational database stored in an AWS Redshift cluster
A quick example of how to load data from Amazon S3 into Amazon Redshift using Redshift's COPY command through Slick
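That repository issues the statement through Slick in Scala; purely for reference, the COPY statement itself, here executed from Python with redshift_connector and placeholder table, bucket, and IAM role names, looks roughly like this.

```python
import redshift_connector

# Connection values and the IAM role ARN are placeholders.
conn = redshift_connector.connect(
    host="examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="my_password",
)

cursor = conn.cursor()
# COPY loads the S3 files into the target table in parallel across the cluster's slices.
cursor.execute("""
    COPY public.events
    FROM 's3://example-bucket/data/events/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-role'
    FORMAT AS CSV
    IGNOREHEADER 1;
""")
conn.commit()
conn.close()
```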
This project designs and implements an ETL pipeline using Apache Airflow (Docker Compose) to ingest, process, and store retail data. AWS S3 acts as the data lake, AWS Redshift as the data warehouse, and Looker Studio for visualization. [Data Engineer]
This project is a real-time data pipeline designed for ingesting, processing, and storing telecom call records. It integrates Apache Kafka, Apache Spark Streaming, and AWS Redshift to handle large volumes of streaming data in near real-time. The pipeline is containerized with Docker Compose, enabling easy deployment, scalability, and modularity.
This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
Banking Data Warehouse Pipeline
Flight Tragedy Analysis is a comprehensive data analysis project focused on examining aviation accidents and incidents from 1905 to 2009. This project provides users with valuable insights into historical plane crashes and their associated data.