RTIInternational / teehr-spark-iceberg

A repository to hold code related to testing Spark and Iceberg for use in the TEEHR system.

This repo contains files from our tests of Spark and Iceberg for the TEEHR evaluation system.

To work with Spark, we created a Kubernetes cluster that runs the Spark drivers and executors and hosts the Iceberg REST server and the MinIO bucket used in these tests. The steps taken to create the cluster are documented in `eksctl/README.md`. The notebooks for the tests are in `notebooks`.
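
As a rough illustration of how a Spark session might be pointed at an Iceberg REST server and a MinIO bucket like the ones described above, the sketch below builds the relevant Spark properties as a plain dictionary. It is not taken from the notebooks in this repo. The helper name `iceberg_rest_conf`, the catalog name `teehr_test`, the service URLs, and the warehouse path are all hypothetical placeholders; only the property keys and class names follow the standard Iceberg Spark configuration.

```python
def iceberg_rest_conf(catalog, rest_uri, s3_endpoint, warehouse):
    """Return Spark properties for an Iceberg REST catalog on S3-compatible storage.

    Hypothetical helper for illustration; all argument values are supplied by the caller.
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Enables Iceberg's SQL extensions (e.g. MERGE INTO, CALL procedures).
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        ),
        # Registers a named Spark catalog implemented by Iceberg.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Tells Iceberg to talk to a REST catalog server at the given URI.
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
        f"{prefix}.warehouse": warehouse,
        # Uses Iceberg's S3 file IO, pointed at the MinIO endpoint.
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        f"{prefix}.s3.endpoint": s3_endpoint,
        # MinIO deployments are commonly addressed path-style rather than by virtual host.
        f"{prefix}.s3.path-style-access": "true",
    }


# Placeholder values (assumptions); replace with the services in your own cluster.
conf = iceberg_rest_conf(
    catalog="teehr_test",
    rest_uri="http://iceberg-rest:8181",
    s3_endpoint="http://minio:9000",
    warehouse="s3://warehouse/",
)

for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```

With PySpark installed and a matching Iceberg Spark runtime on the classpath, each pair could be passed to `SparkSession.builder.config(key, value)` before calling `getOrCreate()`. Credentials for the bucket are deliberately left out of this sketch.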

Languages

Jupyter Notebook 86.9%, Python 9.2%, Shell 2.7%, Dockerfile 1.3%