Alan Drummond's starred repositories
azure-blob-to-s3
Batch copy files from Azure Blob Storage to Amazon S3
data-engineer-roadmap
Roadmap to becoming a data engineer in 2021
amazon-deequ-glue
Automated data quality suggestions and analysis with Deequ on AWS Glue
developer-roadmap
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
amazon-redshift-utils
Amazon Redshift Utils contains utilities, scripts and views that are useful in a Redshift environment
amazon-kinesis-producer
Amazon Kinesis Producer Library
public-apis
A collective list of free APIs
spark-sql-internals
The Internals of Spark SQL
aws-sdk-pandas
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server and S3 (Parquet, CSV, JSON and Excel).
aws-research-workshops
This repo provides a managed SageMaker Jupyter notebook with a number of notebooks for hands-on workshops in data lakes, AI/ML, Batch, IoT, and Genomics.
kafka-beginner-learnings
https://github.com/simplesteph/kafka-beginners-course
great_expectations
Always know what to expect from your data.
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
awesome-nifi
A list of useful Apache NiFi resources, processor bundles and tools
aws-sam-cli
CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM