There are 10 repositories under the master-data-management topic.
A Python script for generating duplicate data to test the performance of record linkage and master data management systems.
A Python package designed to allow health, biomedical and other researchers to clean (standardise) and deduplicate or link data sets of all sizes faster, with less effort and with improved quality.
CluedIn toolkit: Python module, Postman collection, samples, and how-tos.
Toolkit for automating the configuration of IBM Master Data Management Collaborative Edition environments using a model-driven, revision-controllable approach.
The goal was to maintain a ‘single version of truth’ for associated entities across all of the organization’s data sources. The RecordLinkage package is integrated with a recursive wrapper data pipeline that de-duplicates records and generates a master set. The similarity between two text strings determines whether they are a probabilistic match.
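
The idea of similarity-based matching can be illustrated with a minimal sketch. It does not use the RecordLinkage package or reproduce that repository's pipeline; it swaps in `difflib.SequenceMatcher` from the Python standard library as the similarity measure. The function names `similarity` and `build_master_set`, the sample company names, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the normalised strings are identical."""
    # Normalise case and surrounding whitespace before comparing.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def build_master_set(records, threshold=0.85):
    """Keep the first record of each group of near-duplicates (hypothetical helper)."""
    master = []
    for record in records:
        # A record is treated as a probable duplicate if it is similar
        # enough to any record already kept in the master set.
        if not any(similarity(record, kept) >= threshold for kept in master):
            master.append(record)
    return master


# Illustrative data only: two variants of the first name, one distinct name.
names = ["Acme Corporation", "ACME Corporation ", "Globex Ltd", "Acme Corporatoin"]
master = build_master_set(names)
print(master)
```

With this sample input the two variants of "Acme Corporation" fall above the threshold and are dropped, so the master set keeps "Acme Corporation" and "Globex Ltd". The pairwise comparison is quadratic in the number of records, so a real pipeline would likely add a blocking or indexing step before comparing.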