easonlai / pii-data-scrubber

This is a demo repo that shows how to use Azure Text Analytics to scrub personally identifiable information (PII) from data with Python (Jupyter Notebook). PII scrubbing is an important part of data wrangling and data cleaning.


PII Data Scrubber

  • Version 1.2 - Added scrubbing of price information ($ amounts, including those embedded in sentences).
  • Version 1.1 - Added scrubbing of Hong Kong Identity Card (HKID) numbers (including those embedded in sentences).
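
As a rough illustration of the HKID and price scrubbing listed above, the sketch below masks both with regular expressions. This is an assumption made for illustration and not necessarily how the notebook does it: the patterns, the function name `scrub_hkid_and_price`, and the mask strings are all hypothetical. The HKID pattern only checks the general shape (one or two capital letters, six digits, and a check character that may be in parentheses) and does not validate the check digit.

```python
import re

# Hypothetical pattern: 1-2 capital letters, 6 digits, then a check
# character (0-9 or A), optionally wrapped in parentheses.
HKID_PATTERN = re.compile(r"\b[A-Z]{1,2}\d{6}\(?[0-9A]\)?")

# Hypothetical pattern: a dollar sign, digits with optional thousands
# separators, and an optional decimal part.
PRICE_PATTERN = re.compile(r"\$\s?\d+(?:,\d{3})*(?:\.\d+)?")


def scrub_hkid_and_price(text, hkid_mask="<HKID>", price_mask="<PRICE>"):
    """Replace HKID-like numbers and $ amounts in `text` with mask strings."""
    text = HKID_PATTERN.sub(hkid_mask, text)
    return PRICE_PATTERN.sub(price_mask, text)


if __name__ == "__main__":
    print(scrub_hkid_and_price("My HKID is A123456(7) and I paid $1,200.50."))
```

Because both patterns are applied with `re.sub`, they also catch values that appear in the middle of a sentence, not only values that fill a whole cell.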


The sample PII data (data/pii-sample-data.csv) contains dummy variations of Visa card numbers, Mastercard numbers, American Express card numbers, phone numbers, names, addresses, email addresses, Hong Kong Identity Card (HKID) numbers (including those embedded in sentences), and price information ($ amounts, including those embedded in sentences).
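
A minimal sketch of how text values like these might be sent to Azure Text Analytics for redaction is shown below. It assumes the `azure-ai-textanalytics` package and its `TextAnalyticsClient.recognize_pii_entities` call. The helper names `make_client` and `scrub_texts`, the default batch size of 5, and the choice to keep the original text when the service reports an error are assumptions made for illustration, not taken from the notebook. Check the current service limits before relying on the batch size.

```python
def make_client(endpoint, key):
    """Build a Text Analytics client (needs `pip install azure-ai-textanalytics`)."""
    # Imported lazily so that scrub_texts can be used without the SDK installed.
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    return TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))


def scrub_texts(client, texts, batch_size=5, language="en"):
    """Return a list with the same length as `texts`, with PII redacted."""
    scrubbed = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # One request per batch; results come back in the same order as the input.
        results = client.recognize_pii_entities(batch, language=language)
        for original, result in zip(batch, results):
            # Assumption: fall back to the original text if the service
            # could not process this document (for example, an empty string).
            scrubbed.append(original if result.is_error else result.redacted_text)
    return scrubbed
```

With a real endpoint and key, a call would look like `scrub_texts(make_client(endpoint, key), list_of_strings)`, where `list_of_strings` holds the values of one text column from the sample CSV.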



