Nathan Mauro's starred repositories
aws-glue-schema-registry
The AWS Glue Schema Registry client library provides serializers and deserializers that let applications integrate with the AWS Glue Schema Registry service. The library currently supports the Avro, JSON, and Protobuf data formats. See https://docs.aws.amazon.com/glue/latest/dg/schema-registry.html to get started.
aws-glue-etl-boilerplate
A complete example of an AWS Glue application that uses the Serverless Framework to deploy the infrastructure and DevContainers and/or Docker Compose to run the application locally with AWS Glue Libs, Spark, Jupyter Notebook, AWS CLI, among other tools. It provides jobs using Python Shell and PySpark.
glue-devcontainer-template
VSCode Dev Container template for AWS Glue jobs development
ActionWeaver
Make function calling with LLMs easier
logseq-plugin-gpt3-openai
A plugin for GPT-3 AI-assisted note taking in Logseq
gpt4-pdf-chatbot-langchain
GPT-4 & LangChain chatbot for large PDF docs
langchain-tutorials
Overview and tutorial of the LangChain Library
docker-intellij
Run IntelliJ IDEA inside a Docker container
hadooponwindows
Hadoop 2.7.1 on Windows
awsglue-local-dev
A local development setup for AWS Glue, modified from https://github.com/big-data-europe/docker-spark
Automated_ETL_Finance_Data_Pipeline_with_AWS_Lambda_Spark_Transformation_Job_Python
This project implements an automated ETL data pipeline for financial stock trade transactions using Python and AWS services, with a Spark transformation job. The pipeline is automated with an AWS Lambda function and a defined trigger: whenever a new file is ingested into the AWS S3 bucket, the Lambda function fires and kicks off the AWS Glue crawler and the Glue Spark transformation job. The transformation job, written in PySpark, processes the trade transaction data stored in S3 and filters it down to the subset of trades in which the total number of shares transacted is less than or equal to 100. Tools & technologies: Python, Boto3, PySpark, SDK, AWS CLI, AWS Virtual Private Cloud (VPC), AWS VPC Endpoint, AWS S3, AWS Glue, AWS Glue Crawler, AWS Glue Jobs, AWS Athena, AWS Lambda, Spark
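The trigger flow described above (an S3 put event invoking a Lambda function, which then starts a Glue job run) could be sketched roughly as below. This is a minimal illustration, not the project's actual code: the job name and argument keys are hypothetical placeholders.

```python
def extract_s3_object(event):
    """Return (bucket, key) from the first record of an S3 event notification."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], record["object"]["key"]


def handler(event, context):
    """Lambda entry point: start the Glue ETL job for the newly ingested file."""
    # boto3 is imported lazily so the parsing helper stays testable offline.
    import boto3

    bucket, key = extract_s3_object(event)
    glue = boto3.client("glue")
    # "trade-transactions-etl" and "--source_path" are hypothetical names.
    run = glue.start_job_run(
        JobName="trade-transactions-etl",
        Arguments={"--source_path": f"s3://{bucket}/{key}"},
    )
    return run["JobRunId"]
```

Inside the Glue job itself, the share-count filter would amount to a single PySpark expression such as `df.filter(df["shares"] <= 100)` (column name assumed).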
AWS_ETL_Pipeline_Project
A personal project to gain hands-on experience with AWS and how data flows in the cloud. I built a data pipeline using some of the most popular AWS services: S3, Glue, Lambda, IAM, Redshift, EventBridge, and CloudWatch.
aws-lambda-powertools-examples
This repo holds example projects demonstrating the different utilities provided by the aws-lambda-powertools project across several runtimes
spark-sandbox
A playground for Spark jobs.
clean-architecture-dotnet
🕸 Yet another .NET Clean Architecture, but for a microservices project. It uses a minimal Clean Architecture with DDD-lite, CQRS-lite, and just enough cloud-native patterns, applied to a simple eCommerce sample that runs on Tye with the Dapr extension 🍻
computer-science
:mortar_board: Path to a free self-taught education in Computer Science!
applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
deobfuscator
The real deal
mern_shopping_list
Shopping List built with MERN and Redux