Dataminded's repositories
lighthouse
Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines and apply best practices.
blog-tpcds-dbt-duckdb
This repository contains the TPC-DS queries together with the code required to run the benchmark with dbt and DuckDB.
conveyor-samples
Samples on how to use Conveyor.
iceberg-ingestion
Public repository containing sample code showing how to improve ETL ingestion processes with Apache Iceberg.
blog-platform-quack-quack-ka-ching
The duck escapes with the credits.
homebrew-conveyor-formulas
Brew tap repository for Conveyor
conveyor-templates
Cookiecutter templates used by Conveyor.
aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms such as other Hadoop and Apache Spark distributions.
conveyor-dbt-devenv
Repository showing how to use dev environments with dbt.
dbt-conveyor-snowflake
The Conveyor Snowflake adapter is a thin shell around the Snowflake adapter that authenticates users in Conveyor IDEs with Snowflake so they can run dbt projects.
dbt-playground
Try out dbt in a Gitpod environment in one click, with a Postgres database pre-configured
ecr-mirror
Mirror public repositories to internal ECR repos
eks-spark-benchmark
Performance optimization for Spark running on Kubernetes
git-credential-oauth
A Git credential helper that securely authenticates to GitHub, GitLab and Bitbucket using OAuth.
terraform-aws-eks
Terraform module to create an Elastic Kubernetes Service (EKS) cluster and associated resources 🇺🇦