There are 1 repository under text-anonymization topic.
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Examples scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Deidentify people's names and gender specific pronouns
Clean your Text for Statistical ML and Language Model
This repository contains the code and data for the text re-identification attack presented in B. Manzanares-Salor, D. Sánchez, P. Lison, Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack, Data Mining and Knowledge Discovery, 2024.
Simple project on html anonymization
This repository contains the code and data for the text anonymization enhancement method presented in B. Manzanares-Salor, D. Sánchez, Enhancing text anonymization via re-identification risk-based explainability, Submitted, 2024.