Tiago Didoné's repositories
0722-bootcamp-sql
Bootcamp SQL - Data engineer
spark-glue
Spark environment for Glue development
FirehoseFunction
Firehose Lambda function
spark-carbon
Spark with CarbonData for Huawei Cloud
aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms such as other Hadoop and Apache Spark distributions.
aws-glue-libs
AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
sam-pyspark
Serverless PySpark
aws-lambda-container-image-converter
The AWS Lambda container image converter tool (img2lambda) repackages container images (such as Docker images) into AWS Lambda layers, and publishes them as new layer versions.
carbondata
Mirror of Apache CarbonData
cdktf-remote-template-python-poetry
A terraform-cdk CLI template for Python projects using Poetry for dependency management
delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
fastavro
Fast Avro for Python
hive
Apache Hive
mo-sql-parsing
Let's make a SQL parser so we can provide a familiar interface to non-sql datastores!
ply
Python Lex-Yacc
spark
Apache Spark - A unified analytics engine for large-scale data processing