awsglue

There are 12 repositories under the awsglue topic.

prestodb / prestorials
Tutorials and examples of how to deploy Presto and connect it to different data sources
aws awsglue data datalake docker example glue lakehouse mongodb presto presto-connector prestodb prestosql sql tutorial walkthrough
15
Akanksha-tetwar / YouTube-Trending-video-analysis-ETL-using-AWS-Services
In this project I have used the Trending YouTube Video Statistics data from Kaggle to analyze and prepare it for usage.
aws aws-athena aws-glue-crawler aws-s3 awsglue awslambda python quicksight
1
TanishkaMarrott / Real-Time-Streaming-Analytics-with-Kinesis-Flink-and-OpenSearch
This project focuses on real-time data streaming with Kinesis, using Flink for advanced processing and OpenSearch for analytics. The architecture handles the complete data lifecycle, from ingestion to actionable insights.
apacheflink awsglue awslambda cloudcomputing dataengineering kinesisdatastreams opensearch realtimeanalytics
Language:Java 1
Undisputed-jay / SpotifyAPI-Data-Engineering-Project
This project uses an ETL (Extract, Transform, Load) pipeline to extract data from Spotify through its API and load it into a data source (AWS Athena). The entire pipeline is built on Amazon Web Services (AWS).
aws aws-athena aws-cloudformation aws-lambda aws-s3 awsglue python3 sql
Language:Jupyter Notebook 1
iqrabismii / Big-Data-Projects-
Projects on Big Data using PySpark and AWS
athena aws-s3 awsglue customer-products ecommerce tableau pyspark pyspark-mllib airflow
Language:Jupyter Notebook 0
nischaybikramthapa / dbt-athena-tpch
This project demonstrates how to build a downstream data pipeline using dbt on Athena.
aws-athena awsglue dbt dbt-core tpch dbt-athena
Language:Python 0
pawanyoda / create_glue_table_using_gitlab_cicd
Create a Glue table using CI/CD
aws awscli awsglue docker-image gitlab-ci
0
shaundominic / Kafka-Streaming-Project
Uses Apache Kafka to stream real-time data generated by Python and upload it to S3 using s3fs
apache-kafka aws awsglue ec2 python s3
Language:Python 0
riship1095 / YouTube-ETL
Transformed YouTube's raw JSON data to Parquet and loaded it into an S3 bucket, using the Glue Data Catalog to store metadata and Athena to query the cleaned data. Developed an ETL process in which a Lambda job is triggered when raw data is loaded into an S3 bucket; the data is then processed and stored in an S3 bucket for analytics.
aws aws-athena aws-lambda aws-s3 awsglue data-engineering etl
Language:Python
shreyask1406 / Financial-Market-AWS-Data-Pipeline
AWS Data pipeline
athena aws aws-s3 awsglue tableau
vanibhat02 / Big-Data
Big data and Cloud Deployment
athena aws aws-cloudformation aws-s3 awscli awsglue big-data etl iam-authentication sagemaker-deployment tableau
Language:Jupyter Notebook
VivekaAryan / Reddit-Data-Pipeline
This project offers a robust data pipeline solution designed to efficiently extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. Leveraging a blend of industry-standard tools and services, the pipeline ensures seamless data processing and integration.
airflow athena aws aws-s3 awsglue celery postgresql reddit-api redshift-database
Language:Python
