capeprivacy / cape-python

Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark

NumericPerturbation - Masking different values for same source data

prabhus163 opened this issue

Describe the bug
Is it possible to get the same masked value for every occurrence of an integer?
In the sample below, I define an age column in which every value is 14, but after masking the values are no longer identical.

To Reproduce
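import pandas as pd
from cape_privacy.pandas import dtypes
from cape_privacy.pandas.transformations import NumericPerturbation

# Every row holds the same source value.
df = pd.DataFrame({"age": [14, 14, 14, 14]})
# Perturb each age by noise in [-10, 10], with a fixed seed.
perturb_age = NumericPerturbation(dtype=dtypes.Integer, min=-10, max=10, seed=111)
df["age"] = perturb_age(df["age"])
print(df)  # as reported: the masked ages are not all equal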

Expected behavior
I am trying to mask an integer so that every occurrence of it maps to the same masked value.
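
For illustration, here is a minimal sketch of one way to get the expected behavior outside of cape_privacy. It is not part of the library's API: the helper name consistent_perturb and the key value are hypothetical. The idea is to derive the offset from a keyed hash of the source value instead of drawing fresh noise per row, so equal inputs always receive equal offsets.

```python
import hashlib
import hmac


def consistent_perturb(value: int, low: int, high: int, key: bytes) -> int:
    """Return value plus an offset in [low, high] that depends only on (value, key).

    Hypothetical helper, not a cape_privacy function.
    """
    # Keyed hash of the source value: same value and key give the same digest.
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).digest()
    # Map the first 8 bytes of the digest onto the inclusive range [low, high].
    span = high - low + 1
    offset = low + int.from_bytes(digest[:8], "big") % span
    return value + offset


ages = [14, 14, 14, 14, 30]
masked = [consistent_perturb(a, -10, 10, key=b"hypothetical-secret") for a in ages]
print(masked)  # the first four entries are identical
```

With pandas, the same function could presumably be applied to a column through Series.map. Note the trade-off: because the mapping is deterministic, anyone who learns the masked value of one known record can recognize every other record with the same source value.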