There are 30 repositories under the data-governance topic.
The Metadata Platform for your Data and AI Stack
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Collect, aggregate, and visualize a data ecosystem's metadata
SQL Lineage Analysis Tool powered by Python
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for the Snowflake data warehouse.
Generate and Visualize Data Lineage from query history
Metrics Observability & Troubleshooting
Main repo including core data model, data marts, data quality tests, and terminology sets.
AtroCore is an enterprise-ready, highly configurable, and scalable open-source Data Management and System Integration Platform. It can be used for Master Data Management (MDM), Product Information Management (PIM), Business Process Management (BPM), and much more.
Open Source Data Quality Monitoring.
Pebblo enables developers to safely load data and promote their Gen AI app to deployment
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
HiveMQ Edge is an MQTT gateway that enables interoperability between OT devices and IT systems. It translates diverse protocols into MQTT for streamlined communication and helps organize data into a unified namespace, making managing and streaming data across your infrastructure easier.
ODD Specification is a universal open standard for collecting metadata.
This Apache Atlas is built from the latest release source tarball and patched to be run in a Docker container.
Egeria's Guidance on Governance as well as large media files such as presentations and movies
An opinionated end-to-end data governance implementation.
Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
A demo of Bufstream, a drop-in replacement for Apache Kafka that's 8x less expensive to operate and brings broker-side schema awareness to Kafka
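The Sanzuwu (三足乌) data middle platform combines data ingestion, data development, data warehousing, data governance, data assets, data services, BI visualization, system management, and other functional modules in one. It breaks down data barriers, resolves data silos, and supports enterprise digital transformation.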
System Design, Solution Architecture, Data Systems Practice
Data Quality Gate based on AWS
Data catalog for everything in your company
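Data-Export supports exporting on-chain data to storage media such as MySQL and ES that are convenient for big data processing, solving the problems of complex querying, analysis, visualization, and processing of blockchain data.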
Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow
Open-source metadata collector based on ODD Specification
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
Guide to data platforms and tools
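Data-Stash is a data warehouse component based on FISCO-BCOS. By parsing a node's binlog, it generates a full backup of that node's state, allowing the node to separate hot and cold data and to prune data.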
A data lineage tool that detects table dependencies from rendered SQL statements.
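Data-Reconcile is a blockchain-based reconciliation component. It provides a general-purpose data reconciliation solution built on blockchain smart-contract ledgers, along with a dynamically extensible reconciliation framework that supports custom development.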