There are 35 repositories under the data-catalog topic.
The Metadata Platform for your Data Stack
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data.
The first open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
📙 Awesome Data Catalogs and Observability Platforms.
World's most powerful data catalog service, providing a high-performance, geo-distributed, and federated metadata lake.
Work with your web service, database, and streaming schemas in a single format.
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub.
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data.
Reference Architectures for Datalakes on AWS
Sample code with integration between Data Catalog and RDBMS data sources.
Data catalog for everything in your company
Tag Engine automates the process of creating, updating, deleting, and populating metadata in bulk with Google Cloud's Data Catalog. Tag Engine is licensed under the Apache 2 license terms. Please make sure to read, understand and agree to the terms of the LICENSE and CONTRIBUTING files before proceeding.
This documentation repository is part of the Corporate Linked Data Catalog (COLID) application.
Open-source metadata collector based on ODD Specification
National Data Archive (NADA) is an open source data cataloging system that serves as a portal for researchers to browse, search, compare, apply for access, and download relevant census or survey information. It was originally developed to support the establishment of national survey data archives.
Sample code with integration between Data Catalog and BI data sources.
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
Registry of data portals, catalogs, and data repositories, including a dataset of data catalogs and a catalog description standard
A data lineage tool that detects table dependencies from rendered SQL statements.
A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable front end that's just HTML.
A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀
🌀 The JS data presentation framework. For a single dataset to a full catalog.
Documentation repository for the Egeria project.
Polar Earth Observation Database of satellite sensors
Update a Google Data Catalog tag with dbt Cloud run metadata
SciCat open data catalogue web client
A collection of remote climate data accessed via intake and cached to disk
articat: data artifact catalog
Data Catalogs Made Easy