samiriff / anonymizer

Python package that generates fake data. It internally makes use of the Faker package, and keeps track of the mapping between original and fake data.

Home Page: https://pypi.org/project/data-anonymizer-mapper/

Anonymizer

Anonymizer is a Python package that generates fake data for you. It uses the Faker package internally and keeps track of the mapping between your original and fake data, so an anonymized value can always be traced back to its original. This is especially useful when anonymizing data in pandas DataFrames.

   _____                                           .__
  /  _  \    ____    ____    ____  ___.__.  _____  |__|________  ____ _______
 /  /_\  \  /    \  /  _ \  /    \<   |  | /     \ |  |\___   /_/ __ \\_  __ \
/    |    \|   |  \(  <_> )|   |  \\___  ||  Y Y  \|  | /    / \  ___/ |  | \/
\____|__  /|___|  / \____/ |___|  // ____||__|_|  /|__|/_____ \ \___  >|__|
        \/      \/              \/ \/           \/           \/     \/
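
The mapping idea described above can be sketched with two dictionaries, one per direction. This is a minimal illustration and not the package's actual implementation: the class name `MappingSketch`, its methods, and the small word lists that stand in for Faker are all hypothetical.

```python
import random

# Hypothetical stand-in for a Faker provider: names are built from small word lists.
FIRST = ["Alex", "Blake", "Casey", "Devon", "Emery"]
LAST = ["Stone", "Rivers", "Field", "Woods", "Hart"]


class MappingSketch:
    """Illustrative two-way mapping between original and fake values."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._to_fake = {}      # original -> fake
        self._to_original = {}  # fake -> original

    def _new_fake(self):
        # The demo pool is finite, so fail loudly once every name is taken.
        if len(self._to_original) >= len(FIRST) * len(LAST):
            raise RuntimeError("demo name pool exhausted")
        # Retry until the fake value is unused, so the reverse lookup stays unambiguous.
        while True:
            candidate = f"{self._rng.choice(FIRST)} {self._rng.choice(LAST)}"
            if candidate not in self._to_original:
                return candidate

    def anonymize(self, original):
        # Reuse the stored fake value so repeated calls are consistent.
        if original not in self._to_fake:
            fake = self._new_fake()
            self._to_fake[original] = fake
            self._to_original[fake] = original
        return self._to_fake[original]

    def deanonymize(self, fake):
        return self._to_original[fake]


sketch = MappingSketch()
fake = sketch.anonymize("Kevin Bell")
print(sketch.anonymize("Kevin Bell") == fake)  # True
print(sketch.deanonymize(fake))                # Kevin Bell
```

Storing both directions costs extra memory but makes the lookup from a fake value back to its original a single dictionary access.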

Basic Usage

Initialization

names = ['Kevin Bell', 'Ricky Sheppard', 'James Hill MD']
anonymizer = Anonymizer()

Get Anonymized Name

anonymizer.get_anonymized_name('Ghajinikanth Zuckerberg')
# 'Catherine Parker'

Get Original Name

anonymizer.get_original_name('Catherine Parker')
# 'Ghajinikanth Zuckerberg'

Get Anonymized Name for Same Name

anonymizer.get_anonymized_name('Ghajinikanth Zuckerberg') # First Call
# 'Catherine Parker'

anonymizer.get_anonymized_name('Ghajinikanth Zuckerberg') # Second Call
# 'Catherine Parker'

Fetch list of Anonymized Names

anonymizer.get_anonymized_names(names)
# ['Leslie Adams', 'Michelle Burke', 'Annette Maxwell']

Fetch list of Original Names

anonymized_names = anonymizer.get_anonymized_names(names)
anonymizer.get_original_names(anonymized_names)
# ['Kevin Bell', 'Ricky Sheppard', 'James Hill MD']

Get Anonymized Data for a different Faker Type

address_anonymizer = Anonymizer(faker_type=FakerType.ADDRESS)
address_anonymizer.get_anonymized_name('74437 Alexandra Well\nSouth Jade, CT 40282')
# 'USNS Hernandez\nFPO AA 32353'

Anonymize Names in a DataFrame column

df['Column']
# 0 None
# 1 None
# 2 Marcus Smith
# 3 Sherry Parsons
# 4 Marcus Smith
# Name: Column, dtype: object

anonymizer = Anonymizer(faker_type=FakerType.NAME)
df['Column'].apply(lambda s: anonymizer.get_anonymized_name(s) if s is not None else None)
# 0 None
# 1 None
# 2 Kelly Walker
# 3 Yolanda Hawkins
# 4 Kelly Walker
# Name: Column, dtype: object
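
The `is not None` check above only covers `None`. Columns loaded from files often hold missing values as `NaN` instead, so a small wrapper that skips both may be handy. The sketch below uses only the standard library; `skip_missing` is a hypothetical helper, not part of the package, and `str.upper` stands in for `anonymizer.get_anonymized_name` so the block runs on its own.

```python
import math


def skip_missing(func):
    """Wrap func so that None and float NaN pass through unchanged."""

    def wrapper(value):
        # Treat both None and NaN as missing values and return them as-is.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return value
        return func(value)

    return wrapper


# Stand-in for anonymizer.get_anonymized_name, used only for this demonstration.
demo = skip_missing(str.upper)
print(demo("Marcus Smith"))  # MARCUS SMITH
print(demo(None))            # None
```

Under the same assumptions, the wrapper could be passed to the column directly, for example `df['Column'].apply(skip_missing(anonymizer.get_anonymized_name))`.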

Acknowledgements

About

License: MIT License
