sathya-reddy-m's repositories
polaris
The interoperable, open source catalog for Apache Iceberg
www-project-top-10-for-large-language-model-applications
OWASP Foundation Web Repository
unitycatalog
Open, Multi-modal Catalog for Data & AI
api-guidelines
adidas group API design guidelines
pyspark-style-guide
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
aws-saas-factory-eks-reference-architecture
This repository provides a reference architecture for building an end-to-end SaaS solution using Amazon Elastic Kubernetes Service (EKS).
lakehouse-engine
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows, and utilities for Data Products.
delta-examples
Delta Lake examples
metricflow
MetricFlow allows you to define, build, and maintain metrics in code.
architecture-center
Open Source documentation for the Azure Architecture Center on Microsoft Docs
ctakes
Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.
openhouse
Open Control Plane for Tables in Data Lakehouse
awesome-database-design
:zap: A collection of resources and tutorials to design a better database schema.
seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
inlong
Apache InLong - a one-stop, full-scenario integration framework for massive data
actionlint
:octocat: Static checker for GitHub Actions workflow files
AzureTRE
An accelerator to help organizations build Trusted Research Environments on Azure.
data-engineering-wiki
The best place to learn data engineering. Built and maintained by the data engineering community.
awesome-azure-architecture
AWESOME-Azure-Architecture - https://aka.ms/AwesomeAzureArchitecture
OpenLineage
An Open Standard for lineage metadata collection
OpenMetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
datacompy
Pandas and Spark DataFrame comparison for humans and more!
terraform-databricks-sra
The Security Reference Architecture (SRA) implements typical security features as Terraform Templates that are deployed by most high-security organizations, and enforces controls for the largest risks that customers ask about most often.
ddedocs
Data Developer & Engineer Documents and Hands-On
terraform-databricks-lakehouse-blueprints
Set of Terraform automation templates and quickstart demos to jumpstart the design of a Lakehouse on Databricks. This project has incorporated best practices across the industries we work with to deliver composable modules to build a workspace to comply with the highest platform security and governance standards.
quinn
PySpark methods to enhance developer productivity 📣 👯 🎉
spark
Apache Spark - A unified analytics engine for large-scale data processing
brickflow
Pythonic Programming Framework to orchestrate jobs in Databricks Workflow