There are 23 repositories under the data-governance topic.
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Collect, aggregate, and visualize a data ecosystem's metadata
SQL Lineage Analysis Tool powered by Python
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Metrics Observability & Troubleshooting
Generate and Visualize Data Lineage from query history
Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for the Snowflake data warehouse.
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Apache Atlas, built from the latest release source tarball and patched to run in a Docker container.
Open Source Data Quality Monitoring.
ODD Specification is a universal open standard for collecting metadata.
Egeria's Guidance on Governance as well as large media files such as presentations and movies
HiveMQ Edge is an MQTT gateway that enables interoperability between OT devices and IT systems. It translates diverse protocols into MQTT for streamlined communication and helps organize data into a unified namespace, making managing and streaming data across your infrastructure easier.
Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
POC to demonstrate how to alter incoming/outgoing records in Kafka. It's a toy, don't use it in production.
Data Quality Gate based on AWS
Data catalog for everything in your company
Open-source metadata collector based on ODD Specification
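Data-Export supports exporting on-chain data to storage media such as MySQL and ES that are convenient for big data processing, solving the problems of complex querying, analysis, visualization, and processing of blockchain data.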
Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain 'ol Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
Guide to data platforms and tools
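Data-Stash is a data warehouse component based on FISCO-BCOS. By parsing a node's binlog, it generates a full backup of that node's state, enabling the node to separate hot and cold data and to prune data.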
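Data-Reconcile is a blockchain-based reconciliation component. It provides a general-purpose data reconciliation solution based on blockchain smart contract ledgers, along with a dynamically extensible reconciliation framework that supports custom development.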
A data lineage tool that detects table dependencies from rendered SQL statements.
An open-source dataworks platform
A Python package that centralizes some Google Cloud Data Catalog scripts. This repo contains commands, such as bulk CSV operations, that help leverage Data Catalog features.
Configuration and schema sync for Metabase from Python
Python package to manage Google Cloud Data Catalog tags, loading metadata from external sources -- currently supports the CSV file format
Load dbt artifacts uploaded to GCS to BigQuery in order to track historical dbt results