capeprivacy / cape-dataframes

Privacy transformations on Spark and Pandas dataframes backed by a simple policy language.

Home Page: https://docs.capeprivacy.com

Calling DatePerturbation alters the original pd.Series

shukkkur opened this issue

Describe the bug
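Calling an instance of the DatePerturbation class on a pd.Series modifies the original Series in place, even when the return value is not assigned.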

To Reproduce

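>>> import pandas as pd
>>> # Import path assumed; older releases used the package name `cape_privacy`
>>> from cape_dataframes.pandas.transformations import DatePerturbation

>>> def load_dataset(sess=None):
    dataset = pd.DataFrame({
        "name": ["alice", "bob"],
        "age": [34, 55],
        "birthdate": [pd.Timestamp(1985, 2, 23), pd.Timestamp(1963, 5, 10)],
        "salary": [59234.32, 49324.53],
        "ssn": ["343554334", "656564664"],
    })
    return dataset

>>> df = load_dataset()
>>> perturb_date = DatePerturbation(frequency=("YEAR", "MONTH", "DAY"), min=(-10, -5, -5), max=(10, 5, 5))
>>> perturb_date(df["birthdate"])  # return value is not assigned
>>> df["birthdate"]  # the original column now holds the perturbed dates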

Expected behavior
Calling perturb_date without assigning the result should leave the original pd.Series unchanged. Instead, the call changes the original pd.Series.
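
A minimal sketch of how the side effect could be checked, assuming only pandas. The helper `mutates_input` and the two stand-in transformations `shift_in_place` and `shift_copy` are hypothetical names made up for illustration; they are not part of cape-dataframes and do not reproduce how DatePerturbation works internally.

```python
import pandas as pd


def mutates_input(transform, series: pd.Series) -> bool:
    """Return True if calling transform(series) changes series itself."""
    snapshot = series.copy(deep=True)  # independent copy taken before the call
    transform(series)                  # return value deliberately discarded
    return not series.equals(snapshot)


def shift_in_place(series: pd.Series) -> pd.Series:
    # Hypothetical stand-in (not cape-dataframes code): writes into its argument.
    for i in range(len(series)):
        series.iloc[i] = series.iloc[i] + pd.Timedelta(days=3)
    return series


def shift_copy(series: pd.Series) -> pd.Series:
    # Hypothetical stand-in: builds a new Series and leaves the argument alone.
    return series + pd.Timedelta(days=3)


birthdates = pd.Series([pd.Timestamp(1985, 2, 23), pd.Timestamp(1963, 5, 10)])

# Each check runs on its own copy so the two results do not affect each other.
in_place_mutates = mutates_input(shift_in_place, birthdates.copy())
copy_mutates = mutates_input(shift_copy, birthdates.copy())
print(in_place_mutates, copy_mutates)
```

The same helper could be pointed at the real transformation, for example `mutates_input(perturb_date, df["birthdate"])`. Until the behavior is fixed, passing `df["birthdate"].copy()` to the transformation should keep the original column intact.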
