There are 12 repositories under the data-anonymization topic.
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Secure Vault for Customer PII/PHI/PCI/KYC Records
ARX is a comprehensive open source data anonymization tool that aims to provide scalability and usability. It supports a range of anonymization techniques, methods for analyzing data quality and re-identification risks, and well-known privacy models such as k-anonymity, l-diversity, t-closeness, and differential privacy.
Python Data Anonymization & Masking Library For Data Science Tasks
Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.
A PHP library to back up, restore and anonymize databases
HideDroid is an Android app that allows the per-app anonymization of collected personal data according to a privacy level chosen by the user.
This repository allows you to anonymize sensitive information in images/videos. The solution is fully compatible with the DL-based training/inference solutions that we already published/will publish for Object Detection and Semantic Segmentation.
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Deidentify people's names and gender-specific pronouns
ANJANA is a Python library for anonymizing sensitive data
Maskwise detects, redacts, masks, and anonymizes sensitive data across text, images, and structured data in training datasets for LLM systems. Powered by Microsoft Presidio
Simple yet powerful tool for identifying and anonymizing personal information in various formats.
Library for identification, anonymization and de-anonymization of PII data
Cinnamon is a modular application designed to offer robust functionalities for data anonymization, synthetization, and evaluation.
🎓🔒 Creating, analyzing, and testing differential privacy protocols, aimed at data protection and anonymization.
Differentially Private Synthetic Data Generation [DP-SDG] - Experimental Setups & Knowledge Base - WORK IN PROGRESS
Anonymize sensitive data in your datasets.
Anonymizer tool for datasets such as CSV files
This script generates various types of fake data, such as names, addresses, phone numbers, coordinates, and more, using the Faker library. Users can select the data type and the quantity to generate. The generated data is saved to a JSON file
Generate an anonymized test dataset from production data using configurable anonymization sequences. Run database-to-database (vendor-agnostic) export and import.
Implementation of An Efficient Clustering Method for k-Anonymization in Python 2.7
DataAnonymizer is an open-source personal data anonymization tool designed for GDPR compliance
Introduction to data anonymization
Impacts of data anonymization on model prediction for diabetes
Data anonymization using Angular 2+
A free data masking and/or anonymization library
M.Tech final year project to create a data anonymization tool.
GenAI-SQL is a modular, extensible suite of AI-powered tools for automating SQL code improvement, documentation, and validation. Built for developers, analysts, and data engineers, it leverages Azure OpenAI (GPT-4o) to analyze, refactor, comment, explain, test, and audit SQL — all within a secure, asynchronous, and HIPAA-compliant framework.
BeeGen is an intelligent command-line tool designed to assist developers with everyday tasks, leveraging the power of generative AI.
anonymaCy is a spaCy extension for anonymizing PII using rule-based recognizers, context-aware processing, conflict resolution and customizable anonymization.
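Several of the descriptions above name k-anonymity as a supported privacy model. As a rough illustration of what that property means, the sketch below checks whether a small table is k-anonymous: every combination of quasi-identifier values must be shared by at least k rows. The function name `is_k_anonymous`, the column names, and the sample rows are assumptions made for illustration and are not taken from any of the listed repositories.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k rows.

    rows: list of dicts, one per record (hypothetical layout).
    quasi_identifiers: column names that could be combined to re-identify someone.
    """
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(
        tuple(row[col] for col in quasi_identifiers) for row in rows
    )
    # The table is k-anonymous only if no group is smaller than k.
    return all(count >= k for count in groups.values())


# Illustrative, already-generalized records (made-up data).
rows = [
    {"age_range": "30-39", "zip_prefix": "021", "diagnosis": "flu"},
    {"age_range": "30-39", "zip_prefix": "021", "diagnosis": "asthma"},
    {"age_range": "40-49", "zip_prefix": "100", "diagnosis": "flu"},
    {"age_range": "40-49", "zip_prefix": "100", "diagnosis": "diabetes"},
]

two_anonymous = is_k_anonymous(rows, ["age_range", "zip_prefix"], k=2)
three_anonymous = is_k_anonymous(rows, ["age_range", "zip_prefix"], k=3)
```

A check like this only verifies the property. Tools that enforce k-anonymity typically also generalize or suppress values until the check passes, which this sketch does not attempt.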