Paul Bhorjee's repositories
aws-sam-swagger-apigateway-lambda-starter
Swagger, API Gateway, Lambda integration using AWS Serverless Application Model (AWS SAM)
Databricks-Certified-Data-Engineer-Associate
Resources for the Databricks Data Engineer Associate certification exam preparation course
pandas_exercises
Practice your pandas skills!
tf-python
A simple hello world Python application.
pyspark-examples
PySpark RDD, DataFrame and Dataset examples in Python
Databricks-Certified-Data-Engineer-Professional
Resources for the Databricks Data Engineer Professional certification exam preparation course
whylogs
The open standard for data logging
data-observability-in-practice
Source code for the MC technical blog post "Data Observability in Practice Using SQL"
bahir
Mirror of Apache Bahir
Remove-DuplicateItems
Script to remove duplicate items from Exchange mailboxes.
modern-data-lake-storage-layers
Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work
hudi
Upserts, Deletes And Incremental Processing on Big Data.
Boop
A scriptable scratchpad for developers. In slow yet steady progress.
nflow-generator
NetFlow Generator for Testing Flow Collection Apps
airflow-in-eks-docs
Guide on how to set up EKS, FluxCD and Airflow in AWS
argo-workflows
Workflow engine for Kubernetes
docker-images
Docker images for Debezium. Please log issues in our JIRA at https://issues.redhat.com/projects/DBZ/summary
kafka-connect-msk-demo
For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR
docker-airflow
Docker Apache Airflow
Data-Science-Resources
Great resources from finding data sets to finding jobs!
ds-precourse-histograms
This repository holds the content for the plotting challenges in the histograms lesson.