Joe Harris's starred repositories
gpt-engineer
Specify what you want it to build, the AI asks for clarification, and then builds it.
nixtla
TimeGPT-1: production-ready pre-trained Time Series Foundation Model for forecasting and anomaly detection. Generative pretrained transformer for time series trained on over 100B data points. It's capable of accurately forecasting across domains such as retail, electricity, finance, and IoT with just a few lines of code.
awesome-duckdb
🦆 A curated list of awesome DuckDB resources
spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
gd2md-html
Convert a Google Doc to Markdown or HTML. This Docs add-on converts a Google Doc to simple Markdown and/or HTML.
dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
aws-arch-backoff-simulator
Simulator for the AWS Architecture Blog (http://www.awsarchitectureblog.com/) post about jitter and backoff.
data-brokers
A simply complicated guide to removing your info from data brokers
spark-rapids-examples
A repo for Spark examples using the RAPIDS Accelerator, including ETL, ML/DL, etc.
sql-logic-test
sql-logic-test
aws-docker-toolkit
A lightweight dockerized version of the AWS CLI
TPC-H-Skew
TPC-H benchmark with skew factor enabled