There are 5 repositories under the lineage topic.
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as a self-hosted or cloud service with premium features.
SQL Lineage Analysis Tool powered by Python
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
A practical and efficient NCBI taxonomy toolkit that also supports creating NCBI-style taxdump files for custom taxonomies such as GTDB/ICTV
An antlr4-based SQL parser for multiple databases that extracts metadata from SQL. It can be used in many data platform scenarios: extracting metadata from DDL statements, SQL permission checks, table-level lineage, SQL syntax validation, and more. Supports Spark, Flink, Gauss, StarRocks, Oracle, MySQL, PostgreSQL, SQL Server, DB2, and others.
Genealogy is a free and open-source family tree PHP application to record family members and their relationships, built with Laravel 12.
GEDKeeper - a program for working with a personal genealogical database
STREAM: Single-cell Trajectories Reconstruction, Exploration And Mapping of single-cell data
An open-source, vendor-neutral data context service.
Make dbt docs and Apache Superset talk to one another
🐞 Convert NCBI taxonomy dump into lineages
A connector to ingest Azure Databricks lineage into Microsoft Purview
Compass is an enterprise data catalog that makes it easy to find, understand, and govern data.
Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data types and format characters) using Java.
Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization system.
Solution Accelerator to help build Purview custom connectors
Open-source metadata collector based on ODD Specification
Awesome Privacy - A curated list of services and alternatives that respect your privacy because PRIVACY MATTERS. With repository stars⭐ and forks🍴
The CMF library helps collect and store information associated with ML pipelines. It tracks the lineage of artifacts and executions of distributed AI pipelines, and provides APIs to record and query the metadata associated with ML pipelines. The framework adopts a data-first approach: all artifacts recorded in the framework are versioned and identified by their content hash.
A data lineage tool that detects table dependencies from rendered SQL statements.
NCBI taxonomic identifier (taxid) changelog, including taxid deletions, additions, merges, reuse, and rank/name changes.
Generate beautiful documentation for your data pipelines in markdown format
modifying / extending the Android build process