Denis A's starred repositories
system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Awesome-LLM
Awesome-LLM: a curated list of Large Language Model resources
cloudquery
The open-source, high-performance ELT framework powered by Apache Arrow
aws-sdk-pandas
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
libarchive
Multi-format archive and compression library
incubator-livy
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
data-on-eks
DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS
action-validator
Tool to validate GitHub Action and Workflow YAML files
dbt_metrics
Macros for calculating metrics
djl-serving
A universal scalable machine learning model deployment solution
pypi-duck-flow
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
add-determinism
Build postprocessor to reset metadata fields for build reproducibility
pace
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
pymetastore
A Python Client for Hive Metastore