pquinteroc's repositories
awesome-gcp-certifications
A curated list of resources for learning about Google Cloud Platform certifications and how to prepare for them.
aws-glue-developer-guide
The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by making proposed changes & submitting a pull request.
docker-images
Official source for Docker configurations, images, and examples of Dockerfiles for Oracle products and projects
git-secrets
Prevents you from committing secrets and credentials into git repositories
kafka-stack-docker-compose
Docker Compose files to create a fully working Kafka stack
kubernetes-kafka
Kafka cluster as Kubernetes StatefulSet, plain manifests and config
spark-snowflake
Snowflake Data Source for Apache Spark.
airflow-pagerduty-plugin
An Airflow operator for triggering PagerDuty incidents.
amazon-redshift-utils
Amazon Redshift Utils contains utilities, scripts, and views that are useful in a Redshift environment
confluent-kafka-python
Confluent's Apache Kafka Python client
Data-Science--Cheat-Sheet
Cheat Sheets
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
divolte-kafka-druid-superset
A proof of concept using Divolte, Kafka, Druid and Superset
docker-druid
Druid Docker
docker-kafka
Kafka (and Zookeeper) in Docker
geoscan
Geospatial clustering at massive scale
googleads-python-lib
The Python client library for Google's Ads APIs
kafka-tutorials
Kafka Tutorials microsite
md2googleslides
Generate Google Slides from markdown
memray
Memray is a memory profiler for Python
python-patterns
A collection of design patterns/idioms in Python
rubix
Cache File System optimized for columnar formats and object stores
spark-redshift
Redshift data source for Apache Spark
spring-cloud-dataflow
Microservices-based streaming and batch data processing for Cloud Foundry and Kubernetes