There are 9 repositories under the aws-glue topic.
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
(Unofficial) curated list of awesome workshops found around the internet. We have all been there: finding the workshop you just attended shouldn't be hard. The idea is to provide an easy, central repository, maintained collaboratively.
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
Data Lake as Code, featuring ChEMBL and OpenTargets
Glue scripts for converting AWS Service Logs for use in Athena
Open innovation with 60-minute cloud experiments on AWS
Automated data quality suggestions and analysis with Deequ on AWS Glue
Streamlit EDA Dashboard Powered by AWS Cloud
Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing Reddit comments in real time. 100-200 level tutorial.
Bring your own data Labs: Build a serverless data pipeline based on your own data
Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects
🌉 Reference implementation for granting cross-account AWS Glue Data Catalog access from Amazon Athena
Example of AWS Glue jobs and workflow deployment with Terraform in monorepo style. The code here supports a miniseries of articles about AWS Glue and Python.
Build and Deploy A Serverless Data Pipeline on AWS
Automate the daily partitioning of your CloudTrail bucket in Athena
Use the AWS Glue Schema Registry in Python projects.
AWS Glue tutorial for data developers.
Terraform module which creates Glue resources on AWS
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
🐋 Docker image for AWS Glue Spark/Python
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
Terraform modules for provisioning and managing AWS Glue resources
This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.
Example of how to set SBT up for local development of AWS Glue Scripts
Proof of Value Terraform Scripts to utilize Amazon Web Services (AWS) Security, Identity & Compliance Services to Support your AWS Account Security Posture.
Understanding DevOps concepts, with hands-on practice and research using AWS developer tools
Streaming ETL job examples in AWS Glue that integrate Iceberg and create an in-place updatable data lake on Amazon S3
Discover how you can migrate from traditional deployments to serverless architectures with AWS
Create a Terraform module for AWS Glue
End-to-end data engineer project
A simple, practical, and affordable system for measuring head trauma in sports environments where trained medical personnel are absent, built with Amazon Kinesis Data Streams, Kinesis Data Analytics, Kinesis Data Firehose, and AWS Lambda